辅导讲解extra程序、辅导讲解StarterC++、辅导java
- 首页 >> Java编程Assignment 3B: Grammar Solver
Assignment by Marty Stepp and Victoria Kirst. Based on a problem by Stuart Reges of U of
This problem focuses on recursion.
This is a pair assignment. You are allowed to work individually or work with a single partner. If you
work as a pair, comment both members' names on top of every submitted code file. Only one of
you should submit the assignment; do not turn in two copies.
Links:
Starter Code
We provide a ZIP archive with a starter project that you should download and open with Qt Creator.
You will edit and turn in only the following file. The ZIP contains other files/libraries; do not modify
these. Your code must work with the other files unmodified. If you want to declare function
prototypes, declare them at the top of your .cpp file, not by modifying our provided .h file.
grammarsolver.cpp, the C++ code for your solution
mygrammar.txt, your custom grammar input file. If you work in a pair, you should submit this
file individually, labelling the two files mygrammar_SUNET1.txt and
mygrammar_SUNET2.txt.
demo JAR
If you want to further verify the expected behavior of your program, you can download the following
provided sample solution demo JAR and run it. If the behavior of our demo in any way conflicts with
the information in this spec, you should favor the spec over the demo.
"How do I run the assignment solution demos?"
If you want to further verify the expected behavior of your program, you can download the provided
sample solution demo JAR and run it. If the behavior of our demo in any way conflicts with the
information in this spec, you should favor the spec over the demo.
Our assignments ofer a solution 'demo' that you can run to see the program's expected behavior.
On many machines, all you have to do is download the .jar file, then double-click it to run it. But on
some Macs, it won't work; your operating system will say that it doesn't know how to launch the file.
If you have that issue, download the file, go to the Downloads folder in your Finder, right-click on the
file, and click Open, and then press Confirm.
Some Mac users see a security error such as, "cs106b-hw2-wordladder-demo.jar can't be opened
because it is from an unidentified developer." To fix this, go to System Preferences → Security &
Privacy. You will see a section about downloaded apps. You should see a prompt asking if you would
like to allow access to our solution JAR. Follow the steps and then you should be able to open the
demo.
If all else fails, you could run the demo JAR from a terminal. Every operating system allows you to
open a "terminal" or "console" window for typing raw system commands. Open your operating
system's terminal or console window (Google if you need to learn how to do this), and then type:
cd DIRECTORY_WHERE_DEMO_JAR_IS_STORED
java -jar JAR_FILE_NAME
For example, on a Mac machine, if your user name is jsmith12 and you have saved a demo JAR
named hw2.jar in your Documents/106b directory, you would type:
cd /users/jsmith12/Documents/106b
java -jar hw2.jar
Or on a Windows machine, if your user name is jsmith12 and you have saved a demo JAR named
hw2.jar in your Documents/106b directory, you would type:
cd C:\users\jsmith12\Documents\106b
java -jar hw2.jar
Problem Description:
For this part you will write a function for generating random sentences from a grammar.
Vector<string> grammarGenerate(istream& input, string symbol, int times)
Your function accepts a reference to an input file representing a language grammar, along with a
symbol to randomly generate and the number of times to generate it. Your function should generate
the given number of random expansions of the given symbol and return them as a Vector of strings.
A formal language is a set of words and/or symbols along with a set of rules, collectively called the
syntax of the language, defining how those symbols may be used together. A grammar is a way of
describing the syntax and symbols of a formal language. Even programming languages have
grammars; here is a link to a formal grammar for the Java language.
Many language grammars are described in a format called Backus-Naur Form (BNF), which is what
we'll use on this assignment. In BNF, some symbols in a grammar are called terminals because they
represent fundamental words of the language. A terminal in English might be "boy" or "run" or
"Mariana". Other symbols of the grammar are called non-terminals and represent high-level parts of
the language syntax, such as a noun phrase or a sentence. Every non-terminal consists of one or
more terminals; for example, the verb phrase "throw a ball" consists of three terminal words.
The BNF description of a language consists of a set of derivation rules, where each rule names a
symbol and the legal transformations that can be performed between that symbol and other
constructs in the language. For example, a BNF grammar for the English language might state that a
sentence consists of a noun phrase and a verb phrase, and that a noun phrase can consist of an
adjective followed by a noun or just a noun. Rules can be described recursively (in terms of
themselves). For example, a noun phrase might consist of an adjective followed by another noun
phrase, such as the phrase "big green tree" which consists of the adjective "big" followed by the
noun phrase "green tree".
A BNF grammar is specified as an input file containing one or more rules, each on its own line, of the
form:
non-terminal ::= rule|rule|rule|...|rule
A separator of ::= (colon colon equals) divides the non-terminal from its expansion rules. There will
be exactly one such ::= separator per line. A | (pipe) separates each rule; if there is only one rule for
a given non-terminal, there will be no pipe characters. Each rule will consist of one or more
whitespace-separated tokens.
The following is a valid example BNF input file describing a small subset of the English language.
Non-terminal names such as <s>, <np> and <tv> are short for linguistic elements such as sentences,
noun phrases, and transitive verbs.
<s>::=<np> <vp>
<np>::=<dp> <adjp> <n>|<pn>
<dp>::=the|a
<adjp>::=<adj>|<adj> <adjp>
<adj>::=big|fat|green|wonderful|faulty|subliminal|pretentious
<n>::=dog|cat|man|university|father|mother|child|television
<pn>::=John|Jane|Sally|Spot|Fred|Elmo
<vp>::=<tv> <np>|<iv>
<tv>::=hit|honored|kissed|helped
<iv>::=died|collapsed|laughed|wept
Sample input file sentence.txt
This grammar's language can represent sentences such as "The fat university laughed" and "Elmo
kissed a green pretentious television". This grammar cannot describe the sentence "Stuart kissed
the teacher" because the words "Stuart" and "teacher" are not part of the grammar. It also cannot
describe "fat John collapsed Spot" because there are no rules that permit an adjective before the
proper noun "John", nor an object afer intransitive verb "collapsed".
Though the non-terminals in the previous example language are surrounded by < >, this is not
required. By definition any token that ever appears on the lef side of the ::= of any line is
considered a non-terminal, and any token that appears only on the right-hand side of ::= in any
line(s) is considered a terminal. Do not assume that non-terminals will be surrounded by < > in your
code. Each line's non-terminal will be a non-empty string that does not contain any whitespace. You
may assume that individual tokens in a rule are separated by a single space, and that there will be no
outer whitespace surrounding a given rule or token.
Your grammarGenerate function will perform two major tasks:
1. read an input file with a grammar in Backus-Naur Form and turns its contents into a data
structure; and
2. randomly generate elements of the grammar (recursively)
You may want to separate these steps into one or more helper function(s), each of which performs
one step. It is important to segregate the recursive part of the algorithm away from the non-recursive
part.
You are given a client program that does the user interaction. The main function supplies you with
an input file stream to read the BNF file. Your code must read in the file's contents and break each
line into its symbols and rules so that it can generate random elements of the grammar as output.
When you generate random elements, you store them into a Vector which is returned. The provided
main program loops over the vector and prints the elements stored inside it.
Example Logs of Execution:
Your program should exactly reproduce the format and general behavior demonstrated in this log,
although you may not exactly recreate this scenario because of the randomness that your code
performs. (Don't forget to use the course File → Compare Output... feature in the console, or the
course web site's Output Comparison Tool, to check output for various test cases.)
Grammar file name? sentence.txt
Symbol to generate (Enter to quit)? <dp>
How many to generate? 3
1: the
2: the
3: a
Symbol to generate (Enter to quit)? <np>
How many to generate? 5
1: a wonderful father
2: the faulty man
3: Spot
4: the subliminal university
5: Sally
Symbol to generate (Enter to quit)? <s>
How many to generate? 7
1: a green green big dog honored Fred
2: the big child collapsed
3: a subliminal dog kissed the subliminal television
4: Fred died
5: the pretentious fat subliminal mother wept
6: Elmo honored a faulty television
7: Elmo honored Elmo
Expected output: Here are some additional expected output files to compare. It's hard to match the
expected output exactly because it is random. But your function should return valid random results
as per the grammar that was given to it. Your program's graphical Console window has a File →
Compare Output feature for checking your output.
test #1 (sentence.txt)
test #2 (sentence2.txt)
test #3 (expression.txt)
We have also written a Grammar Verifier web tool where you can paste in the randomly generated
sentences and phrases from your program, and our page will do its best to validate that they are
legal phrases that could have come from the original grammar file. This isn't a perfect test, but it is
useful for finding some kinds of mistakes and bugs.
Now that you've seen an example of the program's behavior, let's dive into the implementation
details of your algorithm.
Part 1: Reading the Input File
For this program you must store the contents of the grammar into a Map. As you know, maps keep
track of key/value pairs, where each key is associated with a particular value. In our case, we want to
store information about each non-terminal symbol. So the non-terminal symbols become keys and
their rules become values. Other than the Map requirement, you are allowed to use whatever
constructs you need from the Stanford C++ libraries. You don't need to use recursion on this part of
the algorithm; just loop over the file as needed to process its contents.
One problem you will have to deal with early in this program is breaking strings into various parts. To
make it easier for you, the Stanford library's "strlib.h" library provides a stringSplit function
that you can use on this assignment:
Vector<string> stringSplit(string s, string delimiter)
The string split function breaks a large string into a Vector of smaller string tokens; it accepts a
delimiter string parameter and looks for that delimiter as the divider between tokens. Here is an
example call to this function:
string s = "example;one;two;;three"
Vector<string> v = stringSplit(s, ";"); // {"example", "one", "two", "", "three"
}
The parts of a rule will be separated by whitespace, but once you've split the rule by spaces, the
spaces will be gone. If you want spaces between words when generating strings to return, you must
concatenate those yourself. If you find that a string has unwanted spaces around its edges, you can
remove them by calling the trim function, also from "strlib.h":
string s2 = " hello there sir ";
s2 = trim(s2); // "hello there sir"
Part 2: Generating Random Expansions from the Grammar
As mentioned previously, producing random grammar expansions is a two-step process. The first
step involves reading the input grammar file and turning it into an appropriate data structure (nonrecursively).
The second step involves recursively walking that data structure to generate elements
by successively expanding them.
Generate elements of a grammar with a recursive algorithm. To generate a random occurrence of a
symbol S:
If S is a terminal symbol, there is nothing to do; the result is the symbol itself.
If S is a non-terminal symbol, choose a random expansion rule R for S, and for each of the
symbols in that rule R, generate a random occurrence of that symbol.
For example, the grammar on the previous page could be used to randomly generate a <s> nonterminal
for the sentence, "Fred honored the green wonderful child", as shown in the
following diagram.
Random expansion from sentence.txt grammar for symbol <s>
If the string passed to your function is a non-terminal in your grammar, use the grammar's rules to
recursively expand that symbol fully into a sequence of terminals. For example, using the grammar
on the previous pages, a call of grammarGenerate("<np>") might potentially return the string,
"the green wonderful child".
Generating a non-terminal involves picking one of its rules at random and then generating each part
of that rule, which might involve more non-terminals to recursively generate. For each of these you
pick rules at random and generate each part, etc.
When you encounter a terminal, simply include it in your string. This becomes a base case. If the
string passed to your function is not a non-terminal in your grammar, you should assume that it is a
terminal symbol and simply return it. For example, a call of grammarGenerate("green") should
just return "green". (Without any spaces around it.)
Special cases to handle: You should throw a string exception if the grammar contains more than one
line for the same non-terminal. For example, if two lines both specified rules for symbol "<s>", this
would be illegal and should result in the exception being thrown. You should throw a string
exception if the symbol parameter passed to your function is empty, "".
Testing: The provided input files and main may not test all of the above cases; it is your job to come
up with tests for them.
Your function may assume that the input file exists, is non-empty, and is in a proper valid format. If
the number of times passed is 0 or less, return an empty vector.
Implementation Details:
The hardest part is the recursive generation, so make sure you have read the input file and built your
data structure properly before tackling the recursive part. The directory crawler program from
lecture is a good guide.
Loops are not forbidden in this part of the assignment. In fact, you should definitely use loops for
some parts of the code where they are appropriate. The directory crawler example from class uses a
for-each loop. This is perfectly acceptable; if you find that part of this problem is easily solved with a
loop, please use one. In the directory crawler, the hard part was traversing all of the diferent subdirectories,
and that's where we used recursion. For this program the hard part is following the
grammar rules to generate all the parts of the grammar, so that is the place to use recursion. If your
recursive function has a bug, try putting a cout statement at the start of your recursive function that
prints its parameter values, to see the calls being made. Look up the randomInteger function from
"random.h" to help you make random choices between rules.
Style Details:
(These are the same as in the Fractals problem.)
As in other assignments, you should follow our Style Guide for information about expected coding
style. You are also expected to follow all of the general style constraints emphasized in the
Homework 1 and 2 specs, such as the ones about good problem decomposition, parameters, using
proper C++ idioms, and commenting. The following are additional points of emphasis and style
contraints specific to this problem:
Recursion: Part of your grade will come from appropriately utilizing recursion to implement your
algorithm as described previously. We will also grade on the elegance of your recursive algorithm;
don't create special cases in your recursive code if they are not necessary. Avoid "arm's length"
recursion, which is where the true base case is not found and unnecessary code/logic is stuck into
the recursive case. Redundancy in recursive code is another major grading focus; avoid repeated
logic as much as possible. As mentioned previously, it is fine (sometimes necessary) to use "helper"
functions to assist you in implementing the recursive algorithms for any part of the assignment.
Variables: While not new to this assignment, we want to stress that you should not make any global
or static variables (unless they are constants declared with the const keyword). Do not use globals
as a way of getting around proper recursion and parameter-passing on this assignment.
Creative Aspect, mygrammar.txt:
Along with your code, submit a file mygrammar.txt that contains a BNF grammar that can be used
as input. For full credit, the file should be in valid BNF format, contain at least 5 non-terminals, and
should be your own work (do more than just changing the terminal words in sentence.txt, for
example). This is worth a small part of your grade. Pairs must submit two files, one from each
partner.
Frequently Asked Questions (FAQ):
For each assignment problem, we receive various frequent student questions. The answers to some
of those questions can be found by clicking the link below.
Q: I'd like to write a helper function and declare a prototype for it, but am I allowed to modify
the .h file to do so?
A: Function prototypes don't have to go in a .h file. Declare them near the top of your .cpp file and it
will work fine.
Q: What does it mean if my program "unexpectedly finishes"?
A: It probably means that you have infinite recursion, which usually comes when you have not set a
proper base case, or when your recursive calls don't pass the right parameters between each other.
Try running your program in Debug Mode (F5) and viewing the call stack trace when it crashes to see
what line it is on when it crashed.
Q: The spec says that I need to throw an exception in certain cases. But it looks like the
provided main program never goes into those cases; it checks for that case before ever calling
my function. Doesn't that mean the exception is useless? Do I actually have to do the exception
part?
A: The exception is not useless, and yes you do need to do it. Understand that your code might be
used by other client code other than that which was provided. You have to ensure that your function
is robust no matter what parameter values are passed to it. Please follow the spec and throw the
exceptions in the cases specified.
Q: I am confused about grammars; what this assignment is about?
A: In terms of this assignment, a grammar is a set of rules that corresponds to a sybmol. For instance,
the adj grammar relates to many diferent adjectives like purple, loud, sticky, and fake. To put it
simply, one grammar has many rules. This relationship is also known as non-terminal to terminal.
For this assignment, we want you to navigate through these grammars and create them. Notice that
some grammars have grammars in their rule sets as well. Upon encountering this situation,
remember that it is a grammar that needs to be generated as well.
Take a look at the decision tree in the assignment specifications, this should help give you a
visualization of the problem at hand.
Q: What type of data should my Map store?
A: Unforutantely, this aspect of the assignment is up to you to figure out. Remember the functions
you're going to have to call, along with the idea of a grammar and how it relates to its rules.
Q: I'm having trouble breaking apart the strings.
A: Use the provided stringSplit function. Print lots of debug messages to make sure that your split
calls are doing what you think they are.
Q: How do I spot nonterminal symbols if they don't start/end with "<" and ">"?
A: The "<" and ">" symbols are not special cases you should be concerned with. Treat them as you
would any other symbol, word, or character.
Q: Spaces are in the wrong places in my output. Why?
A: Try using other characters like "_", "~", or "!" instead of spaces to find out where there's extra
whitespaces. Also remember that you can use the trim function to remove surrounding whitespace
from strings.
Q: How do I debug my recursive function?
A: Try using print statements to find out which grammar symbols are being generated at certain
points in your program. Try before and afer your recursive step as well as in your base case. You can
also use the debugger.
Q: In my recursive case, why doesn't it join together the various pieces correctly?
A: Remember that setting a string = to another variable within a recursive step will not translate in all
instances of that function. Be sure to actively add onto your string instead of reassigning it.
Q: My recursive function is really long and icky. How can I make it cleaner?
A: Remind yourself of the strategy in approaching recursive solutions. When are you stopping each
recursive call? Make sure you're not making any needless or redundant checks inside of your code.
Q: When I run expression.txt, I get really long math expressions? Is that okay?
A: Yes, that's expected.
Q: What does it mean if my program "unexpectedly finishes"?
A: It might mean that you have infinite recursion, which usually comes when you have not set a
proper base case, or when your recursive calls don't pass the right parameters between each other.
Try running your program in Debug Mode (F5) and viewing the call stack trace when it crashes to see
what line it is on when it crashed. You could also try printing a message at the start of every call to
make sure that you don't have infinite recursion.
If you are certain that you do not have infinite recursion, it is possible that there are just too many
calls on the call stack because you are filling a very large area of the screen. If you tried to fill the
whole screen with the same color, it might crash with a "stack overflow" even if your algorithm is
correct. This would not be your fault.
Q: Can I use one of the STL containers from the C++ standard library?
A: No.
Q: I already know a lot of C/C++ from my previous programming experience. Can I use advanced
features, such as pointers, on this assignment?
A: No; you should limit yourself to using the material that was taught in class so far.
Q: Can I add any other files to the program? Can I add some classes to the program?
A: No; you should limit yourself to the files and functions in the spec.
Possible Extra Features:
Here are some ideas for extra features that you could add to your program:
Robust grammar solver: Make your grammar solver able to handle files with excessive
whitespace placed between tokens, such as:
" <adjp> ::= <adj> | <adj> <adjp> "
Inverse grammar solver: Write code that can verify whether a given expansion could possibly
have come from a given grammar. For example, "The fat university laughed" could come from
sentence.txt, but "Marty taught the curious students" could not. To answer such a question,
you may want to generate all possible expansions of a given length that could come from a
grammar, and see whether the given expansion is one of them. This is a good use of recursive
backtracking and exhaustive searching. (This is tricky.)
Other: If you have your own creative idea for an extra feature, ask your SL and/or the
instructor about it.
Indicating that you have done extra features: If you complete any extra features, then in the
comment heading on the top of your program, please list all extra features that you worked on and
where in the code they can be found (what functions, lines, etc. so that the grader can look at their
code easily).
Submitting a program with extra features: Since we use automated testing for part of our grading
process, it is important that you submit a program that conforms to the preceding spec, even if you
want to do extra features. If your feature(s) cause your program to change the output that it
produces in such a way that it no longer matches the expected sample output test cases provided,
you should submit two versions of your program file: a first one with the standard file name without
any extra features added (or with all necessary features disabled or commented out), and a second
one whose file name has the sufix -extra.cpp with the extra features enabled. Please distinguish
them in by explaining which is which in the comment header. Our turnin system saves every
submission you make, so if you make multiple submissions we will be able to view all of them; your
previously submitted files will not be lost or overwritten.
© Stanford 2018 | Website created by Chris Piech and Nick Troccoli.
CS 106B has been developed over time by many talented teachers.