辅导讲解extra程序、辅导讲解StarterC++、辅导java

- 首页 >> Java编程


Assignment 3B: Grammar Solver

Assignment by Marty Stepp and Victoria Kirst. Based on a problem by Stuart Reges of U of

This problem focuses on recursion.

This is a pair assignment. You are allowed to work individually or work with a single partner. If you

work as a pair, comment both members' names on top of every submitted code file. Only one of

you should submit the assignment; do not turn in two copies.

Links:

Starter Code

We provide a ZIP archive with a starter project that you should download and open with Qt Creator.

You will edit and turn in only the following file. The ZIP contains other files/libraries; do not modify

these. Your code must work with the other files unmodified. If you want to declare function

prototypes, declare them at the top of your .cpp file, not by modifying our provided .h file.

grammarsolver.cpp, the C++ code for your solution

mygrammar.txt, your custom grammar input file. If you work in a pair, you should submit this

file individually, labelling the two files mygrammar_SUNET1.txt and

mygrammar_SUNET2.txt.

demo JAR

If you want to further verify the expected behavior of your program, you can download the following

provided sample solution demo JAR and run it. If the behavior of our demo in any way conflicts with

the information in this spec, you should favor the spec over the demo.

"How do I run the assignment solution demos?"

If you want to further verify the expected behavior of your program, you can download the provided

sample solution demo JAR and run it. If the behavior of our demo in any way conflicts with the

information in this spec, you should favor the spec over the demo.

Our assignments ofer a solution 'demo' that you can run to see the program's expected behavior.

On many machines, all you have to do is download the .jar file, then double-click it to run it. But on

some Macs, it won't work; your operating system will say that it doesn't know how to launch the file.

If you have that issue, download the file, go to the Downloads folder in your Finder, right-click on the

file, and click Open, and then press Confirm.

Some Mac users see a security error such as, "cs106b-hw2-wordladder-demo.jar can't be opened

because it is from an unidentified developer." To fix this, go to System Preferences → Security &

Privacy. You will see a section about downloaded apps. You should see a prompt asking if you would

like to allow access to our solution JAR. Follow the steps and then you should be able to open the

demo.

If all else fails, you could run the demo JAR from a terminal. Every operating system allows you to

open a "terminal" or "console" window for typing raw system commands. Open your operating

system's terminal or console window (Google if you need to learn how to do this), and then type:

cd DIRECTORY_WHERE_DEMO_JAR_IS_STORED

java -jar JAR_FILE_NAME

For example, on a Mac machine, if your user name is jsmith12 and you have saved a demo JAR

named hw2.jar in your Documents/106b directory, you would type:

cd /users/jsmith12/Documents/106b

java -jar hw2.jar

Or on a Windows machine, if your user name is jsmith12 and you have saved a demo JAR named

hw2.jar in your Documents/106b directory, you would type:

cd C:\users\jsmith12\Documents\106b

java -jar hw2.jar

Problem Description:

For this part you will write a function for generating random sentences from a grammar.

Vector<string> grammarGenerate(istream& input, string symbol, int times)

Your function accepts a reference to an input file representing a language grammar, along with a

symbol to randomly generate and the number of times to generate it. Your function should generate

the given number of random expansions of the given symbol and return them as a Vector of strings.

A formal language is a set of words and/or symbols along with a set of rules, collectively called the

syntax of the language, defining how those symbols may be used together. A grammar is a way of

describing the syntax and symbols of a formal language. Even programming languages have

grammars; here is a link to a formal grammar for the Java language.

Many language grammars are described in a format called Backus-Naur Form (BNF), which is what

we'll use on this assignment. In BNF, some symbols in a grammar are called terminals because they

represent fundamental words of the language. A terminal in English might be "boy" or "run" or

"Mariana". Other symbols of the grammar are called non-terminals and represent high-level parts of

the language syntax, such as a noun phrase or a sentence. Every non-terminal consists of one or

more terminals; for example, the verb phrase "throw a ball" consists of three terminal words.

The BNF description of a language consists of a set of derivation rules, where each rule names a

symbol and the legal transformations that can be performed between that symbol and other

constructs in the language. For example, a BNF grammar for the English language might state that a

sentence consists of a noun phrase and a verb phrase, and that a noun phrase can consist of an

adjective followed by a noun or just a noun. Rules can be described recursively (in terms of

themselves). For example, a noun phrase might consist of an adjective followed by another noun

phrase, such as the phrase "big green tree" which consists of the adjective "big" followed by the

noun phrase "green tree".

A BNF grammar is specified as an input file containing one or more rules, each on its own line, of the

form:

non-terminal ::= rule|rule|rule|...|rule

A separator of ::= (colon colon equals) divides the non-terminal from its expansion rules. There will

be exactly one such ::= separator per line. A | (pipe) separates each rule; if there is only one rule for

a given non-terminal, there will be no pipe characters. Each rule will consist of one or more

whitespace-separated tokens.

The following is a valid example BNF input file describing a small subset of the English language.

Non-terminal names such as <s>, <np> and <tv> are short for linguistic elements such as sentences,

noun phrases, and transitive verbs.

<s>::=<np> <vp>

<np>::=<dp> <adjp> <n>|<pn>

<dp>::=the|a

<adjp>::=<adj>|<adj> <adjp>

<adj>::=big|fat|green|wonderful|faulty|subliminal|pretentious

<n>::=dog|cat|man|university|father|mother|child|television

<pn>::=John|Jane|Sally|Spot|Fred|Elmo

<vp>::=<tv> <np>|<iv>

<tv>::=hit|honored|kissed|helped

<iv>::=died|collapsed|laughed|wept

Sample input file sentence.txt

This grammar's language can represent sentences such as "The fat university laughed" and "Elmo

kissed a green pretentious television". This grammar cannot describe the sentence "Stuart kissed

the teacher" because the words "Stuart" and "teacher" are not part of the grammar. It also cannot

describe "fat John collapsed Spot" because there are no rules that permit an adjective before the

proper noun "John", nor an object afer intransitive verb "collapsed".

Though the non-terminals in the previous example language are surrounded by < >, this is not

required. By definition any token that ever appears on the lef side of the ::= of any line is

considered a non-terminal, and any token that appears only on the right-hand side of ::= in any

line(s) is considered a terminal. Do not assume that non-terminals will be surrounded by < > in your

code. Each line's non-terminal will be a non-empty string that does not contain any whitespace. You

may assume that individual tokens in a rule are separated by a single space, and that there will be no

outer whitespace surrounding a given rule or token.

Your grammarGenerate function will perform two major tasks:

1. read an input file with a grammar in Backus-Naur Form and turns its contents into a data

structure; and

2. randomly generate elements of the grammar (recursively)

You may want to separate these steps into one or more helper function(s), each of which performs

one step. It is important to segregate the recursive part of the algorithm away from the non-recursive

part.

You are given a client program that does the user interaction. The main function supplies you with

an input file stream to read the BNF file. Your code must read in the file's contents and break each

line into its symbols and rules so that it can generate random elements of the grammar as output.

When you generate random elements, you store them into a Vector which is returned. The provided

main program loops over the vector and prints the elements stored inside it.

Example Logs of Execution:

Your program should exactly reproduce the format and general behavior demonstrated in this log,

although you may not exactly recreate this scenario because of the randomness that your code

performs. (Don't forget to use the course File → Compare Output... feature in the console, or the

course web site's Output Comparison Tool, to check output for various test cases.)

Grammar file name? sentence.txt

Symbol to generate (Enter to quit)? <dp>

How many to generate? 3

1: the

2: the

3: a

Symbol to generate (Enter to quit)? <np>

How many to generate? 5

1: a wonderful father

2: the faulty man

3: Spot

4: the subliminal university

5: Sally

Symbol to generate (Enter to quit)? <s>

How many to generate? 7

1: a green green big dog honored Fred

2: the big child collapsed

3: a subliminal dog kissed the subliminal television

4: Fred died

5: the pretentious fat subliminal mother wept

6: Elmo honored a faulty television

7: Elmo honored Elmo

Expected output: Here are some additional expected output files to compare. It's hard to match the

expected output exactly because it is random. But your function should return valid random results

as per the grammar that was given to it. Your program's graphical Console window has a File →

Compare Output feature for checking your output.

test #1 (sentence.txt)

test #2 (sentence2.txt)

test #3 (expression.txt)

We have also written a Grammar Verifier web tool where you can paste in the randomly generated

sentences and phrases from your program, and our page will do its best to validate that they are

legal phrases that could have come from the original grammar file. This isn't a perfect test, but it is

useful for finding some kinds of mistakes and bugs.

Now that you've seen an example of the program's behavior, let's dive into the implementation

details of your algorithm.

Part 1: Reading the Input File

For this program you must store the contents of the grammar into a Map. As you know, maps keep

track of key/value pairs, where each key is associated with a particular value. In our case, we want to

store information about each non-terminal symbol. So the non-terminal symbols become keys and

their rules become values. Other than the Map requirement, you are allowed to use whatever

constructs you need from the Stanford C++ libraries. You don't need to use recursion on this part of

the algorithm; just loop over the file as needed to process its contents.

One problem you will have to deal with early in this program is breaking strings into various parts. To

make it easier for you, the Stanford library's "strlib.h" library provides a stringSplit function

that you can use on this assignment:

Vector<string> stringSplit(string s, string delimiter)

The string split function breaks a large string into a Vector of smaller string tokens; it accepts a

delimiter string parameter and looks for that delimiter as the divider between tokens. Here is an

example call to this function:

string s = "example;one;two;;three"

Vector<string> v = stringSplit(s, ";"); // {"example", "one", "two", "", "three"

}

The parts of a rule will be separated by whitespace, but once you've split the rule by spaces, the

spaces will be gone. If you want spaces between words when generating strings to return, you must

concatenate those yourself. If you find that a string has unwanted spaces around its edges, you can

remove them by calling the trim function, also from "strlib.h":

string s2 = " hello there sir ";

s2 = trim(s2); // "hello there sir"

Part 2: Generating Random Expansions from the Grammar

As mentioned previously, producing random grammar expansions is a two-step process. The first

step involves reading the input grammar file and turning it into an appropriate data structure (nonrecursively).

The second step involves recursively walking that data structure to generate elements

by successively expanding them.

Generate elements of a grammar with a recursive algorithm. To generate a random occurrence of a

symbol S:

If S is a terminal symbol, there is nothing to do; the result is the symbol itself.

If S is a non-terminal symbol, choose a random expansion rule R for S, and for each of the

symbols in that rule R, generate a random occurrence of that symbol.

For example, the grammar on the previous page could be used to randomly generate a <s> nonterminal

for the sentence, "Fred honored the green wonderful child", as shown in the

following diagram.

Random expansion from sentence.txt grammar for symbol <s>

If the string passed to your function is a non-terminal in your grammar, use the grammar's rules to

recursively expand that symbol fully into a sequence of terminals. For example, using the grammar

on the previous pages, a call of grammarGenerate("<np>") might potentially return the string,

"the green wonderful child".

Generating a non-terminal involves picking one of its rules at random and then generating each part

of that rule, which might involve more non-terminals to recursively generate. For each of these you

pick rules at random and generate each part, etc.

When you encounter a terminal, simply include it in your string. This becomes a base case. If the

string passed to your function is not a non-terminal in your grammar, you should assume that it is a

terminal symbol and simply return it. For example, a call of grammarGenerate("green") should

just return "green". (Without any spaces around it.)

Special cases to handle: You should throw a string exception if the grammar contains more than one

line for the same non-terminal. For example, if two lines both specified rules for symbol "<s>", this

would be illegal and should result in the exception being thrown. You should throw a string

exception if the symbol parameter passed to your function is empty, "".

Testing: The provided input files and main may not test all of the above cases; it is your job to come

up with tests for them.

Your function may assume that the input file exists, is non-empty, and is in a proper valid format. If

the number of times passed is 0 or less, return an empty vector.

Implementation Details:

The hardest part is the recursive generation, so make sure you have read the input file and built your

data structure properly before tackling the recursive part. The directory crawler program from

lecture is a good guide.

Loops are not forbidden in this part of the assignment. In fact, you should definitely use loops for

some parts of the code where they are appropriate. The directory crawler example from class uses a

for-each loop. This is perfectly acceptable; if you find that part of this problem is easily solved with a

loop, please use one. In the directory crawler, the hard part was traversing all of the diferent subdirectories,

and that's where we used recursion. For this program the hard part is following the

grammar rules to generate all the parts of the grammar, so that is the place to use recursion. If your

recursive function has a bug, try putting a cout statement at the start of your recursive function that

prints its parameter values, to see the calls being made. Look up the randomInteger function from

"random.h" to help you make random choices between rules.

Style Details:

(These are the same as in the Fractals problem.)

As in other assignments, you should follow our Style Guide for information about expected coding

style. You are also expected to follow all of the general style constraints emphasized in the

Homework 1 and 2 specs, such as the ones about good problem decomposition, parameters, using

proper C++ idioms, and commenting. The following are additional points of emphasis and style

contraints specific to this problem:

Recursion: Part of your grade will come from appropriately utilizing recursion to implement your

algorithm as described previously. We will also grade on the elegance of your recursive algorithm;

don't create special cases in your recursive code if they are not necessary. Avoid "arm's length"

recursion, which is where the true base case is not found and unnecessary code/logic is stuck into

the recursive case. Redundancy in recursive code is another major grading focus; avoid repeated

logic as much as possible. As mentioned previously, it is fine (sometimes necessary) to use "helper"

functions to assist you in implementing the recursive algorithms for any part of the assignment.

Variables: While not new to this assignment, we want to stress that you should not make any global

or static variables (unless they are constants declared with the const keyword). Do not use globals

as a way of getting around proper recursion and parameter-passing on this assignment.

Creative Aspect, mygrammar.txt:

Along with your code, submit a file mygrammar.txt that contains a BNF grammar that can be used

as input. For full credit, the file should be in valid BNF format, contain at least 5 non-terminals, and

should be your own work (do more than just changing the terminal words in sentence.txt, for

example). This is worth a small part of your grade. Pairs must submit two files, one from each

partner.

Frequently Asked Questions (FAQ):

For each assignment problem, we receive various frequent student questions. The answers to some

of those questions can be found by clicking the link below.

Q: I'd like to write a helper function and declare a prototype for it, but am I allowed to modify

the .h file to do so?

A: Function prototypes don't have to go in a .h file. Declare them near the top of your .cpp file and it

will work fine.

Q: What does it mean if my program "unexpectedly finishes"?

A: It probably means that you have infinite recursion, which usually comes when you have not set a

proper base case, or when your recursive calls don't pass the right parameters between each other.

Try running your program in Debug Mode (F5) and viewing the call stack trace when it crashes to see

what line it is on when it crashed.

Q: The spec says that I need to throw an exception in certain cases. But it looks like the

provided main program never goes into those cases; it checks for that case before ever calling

my function. Doesn't that mean the exception is useless? Do I actually have to do the exception

part?

A: The exception is not useless, and yes you do need to do it. Understand that your code might be

used by other client code other than that which was provided. You have to ensure that your function

is robust no matter what parameter values are passed to it. Please follow the spec and throw the

exceptions in the cases specified.

Q: I am confused about grammars; what this assignment is about?

A: In terms of this assignment, a grammar is a set of rules that corresponds to a sybmol. For instance,

the adj grammar relates to many diferent adjectives like purple, loud, sticky, and fake. To put it

simply, one grammar has many rules. This relationship is also known as non-terminal to terminal.

For this assignment, we want you to navigate through these grammars and create them. Notice that

some grammars have grammars in their rule sets as well. Upon encountering this situation,

remember that it is a grammar that needs to be generated as well.

Take a look at the decision tree in the assignment specifications, this should help give you a

visualization of the problem at hand.

Q: What type of data should my Map store?

A: Unforutantely, this aspect of the assignment is up to you to figure out. Remember the functions

you're going to have to call, along with the idea of a grammar and how it relates to its rules.

Q: I'm having trouble breaking apart the strings.

A: Use the provided stringSplit function. Print lots of debug messages to make sure that your split

calls are doing what you think they are.

Q: How do I spot nonterminal symbols if they don't start/end with "<" and ">"?

A: The "<" and ">" symbols are not special cases you should be concerned with. Treat them as you

would any other symbol, word, or character.

Q: Spaces are in the wrong places in my output. Why?

A: Try using other characters like "_", "~", or "!" instead of spaces to find out where there's extra

whitespaces. Also remember that you can use the trim function to remove surrounding whitespace

from strings.

Q: How do I debug my recursive function?

A: Try using print statements to find out which grammar symbols are being generated at certain

points in your program. Try before and afer your recursive step as well as in your base case. You can

also use the debugger.

Q: In my recursive case, why doesn't it join together the various pieces correctly?

A: Remember that setting a string = to another variable within a recursive step will not translate in all

instances of that function. Be sure to actively add onto your string instead of reassigning it.

Q: My recursive function is really long and icky. How can I make it cleaner?

A: Remind yourself of the strategy in approaching recursive solutions. When are you stopping each

recursive call? Make sure you're not making any needless or redundant checks inside of your code.

Q: When I run expression.txt, I get really long math expressions? Is that okay?

A: Yes, that's expected.

Q: What does it mean if my program "unexpectedly finishes"?

A: It might mean that you have infinite recursion, which usually comes when you have not set a

proper base case, or when your recursive calls don't pass the right parameters between each other.

Try running your program in Debug Mode (F5) and viewing the call stack trace when it crashes to see

what line it is on when it crashed. You could also try printing a message at the start of every call to

make sure that you don't have infinite recursion.

If you are certain that you do not have infinite recursion, it is possible that there are just too many

calls on the call stack because you are filling a very large area of the screen. If you tried to fill the

whole screen with the same color, it might crash with a "stack overflow" even if your algorithm is

correct. This would not be your fault.

Q: Can I use one of the STL containers from the C++ standard library?

A: No.

Q: I already know a lot of C/C++ from my previous programming experience. Can I use advanced

features, such as pointers, on this assignment?

A: No; you should limit yourself to using the material that was taught in class so far.

Q: Can I add any other files to the program? Can I add some classes to the program?

A: No; you should limit yourself to the files and functions in the spec.

Possible Extra Features:

Here are some ideas for extra features that you could add to your program:

Robust grammar solver: Make your grammar solver able to handle files with excessive

whitespace placed between tokens, such as:

" <adjp> ::= <adj> | <adj> <adjp> "

Inverse grammar solver: Write code that can verify whether a given expansion could possibly

have come from a given grammar. For example, "The fat university laughed" could come from

sentence.txt, but "Marty taught the curious students" could not. To answer such a question,

you may want to generate all possible expansions of a given length that could come from a

grammar, and see whether the given expansion is one of them. This is a good use of recursive

backtracking and exhaustive searching. (This is tricky.)

Other: If you have your own creative idea for an extra feature, ask your SL and/or the

instructor about it.

Indicating that you have done extra features: If you complete any extra features, then in the

comment heading on the top of your program, please list all extra features that you worked on and

where in the code they can be found (what functions, lines, etc. so that the grader can look at their

code easily).

Submitting a program with extra features: Since we use automated testing for part of our grading

process, it is important that you submit a program that conforms to the preceding spec, even if you

want to do extra features. If your feature(s) cause your program to change the output that it

produces in such a way that it no longer matches the expected sample output test cases provided,

you should submit two versions of your program file: a first one with the standard file name without

any extra features added (or with all necessary features disabled or commented out), and a second

one whose file name has the sufix -extra.cpp with the extra features enabled. Please distinguish

them in by explaining which is which in the comment header. Our turnin system saves every

submission you make, so if you make multiple submissions we will be able to view all of them; your

previously submitted files will not be lost or overwritten.

© Stanford 2018 | Website created by Chris Piech and Nick Troccoli.

CS 106B has been developed over time by many talented teachers.


站长地图