2018.09.02

Project Description

In this assignment you will implement two versions of a tokeniser that,

with minor changes, could be used to complete variations of projects 6,

10 and 11 in the nand2tetris course. A detailed description of the

requirements is shown below. The executable program, tokeniser, will

read text from standard input and produce a list of all tokens in the text

on standard output.

SVN Repository

You must create a directory in your svn repository named:

<year>/<semester>/cs/assignment1. This directory must only contain the

following files and directories - the web submission system will check

this:

• Makefile - this file is used by make to compile

your tokeniser program - do not modify this file.

• tokeniser.cpp - C++ source file containing

the next_token() function.

• my*.cpp - C++ source files with names that start with my

• my*.h - C++ include files with names that start with my

• lib - this directory contains precompiled programs and components

- do not modify this directory.

• includes - this directory contains .h files for precompiled classes -

do not modify this directory.

• tests - this directory contains sample test data; it can also be used to

store any extra files you need for testing

Note: if the lib/tokens.o file does not get added to your svn repository

you will need to add it explicitly using:

% svn add lib/tokens.o

Submission and Marking Scheme

This assignment has two assignments in the web submission

system named: Assignment 1 - Milestone

Submissions and Assignment 1 - Final Submissions. The assessment

is based on "Assessment of Programming Assignments".

Assignment 1 - Milestone Submissions: due 11:55pm Tuesday of week 7

The marks awarded by the web submission system for the milestone

submission contribute up to 20% of your marks for assignment

1. Your milestone submission mark, after the application of late penalties,

will be posted to the myuni gradebook when the assignment marking is

complete.

Your programs must be written in C++ and will be tested using the set

of test files that are attached below. Although a wide range of tests may

be run, including a number of secret tests, marks will only be recorded

for those tests that test the milestone token definitions shown below. Your

programs will be compiled using the Makefile included in the zip file

attached below. Any .h or .cpp files that you create, in addition to the

skeletons provided, must have names that start with my.

The Makefile will use all of the my*.cpp files in your svn directory as part

of the tokeniser program that it compiles.

Assignment 1 - Final Submissions: due 11:55pm Friday of week 7

The marks awarded for the final submission contribute up to 80% of your

marks for assignment 1.

Your final submission mark will be the geometric mean of the marks

awarded by the web submission system, a mark for your logbook and a

mark for your code. It will be limited to 20% more than the marks

awarded by the web submission system. See "Assessment - Mark

Calculations" for examples of how the marks are combined. Your final

submission mark, after the application of late penalties, will be posted to

the myuni gradebook when the assignment marking is complete.
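
As a rough illustration of the calculation described above, the following sketch assumes that all three marks are on a 0-100 scale and reads "limited to 20% more" as a cap of 1.2 times the web submission mark. Both are assumptions, and final_mark is a hypothetical name; "Assessment - Mark Calculations" remains the authoritative source.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Combine the three marks as a geometric mean, then apply the cap.
// Assumption: the cap is 1.2 * web (20% more than the web submission mark).
double final_mark(double web, double logbook, double code) {
    assert(web >= 0 && logbook >= 0 && code >= 0);
    double geometric_mean = std::cbrt(web * logbook * code);
    return std::min(geometric_mean, 1.2 * web);
}
```

Under these assumptions, marks of 50 (web), 100 (logbook) and 100 (code) give a geometric mean of about 79.4, which the cap reduces to 60.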

Automatic Marking

The automatic marking will compile and test your tokeniser program in

exactly the same way as for the milestone submission. The difference is

that marks will be recorded for all of the tests including

the secret tests. Note: if your program fails any of these secret tests

you will not receive any feedback about these secret tests, even if you

ask!

Logbook Marking

Important: the logbook must have entries for all work in this assignment,

including your milestone submissions. See "Assessment - Logbook

Review" for details of how your logbook will be assessed.

Code Review Marking

For each of your programming assignments you are expected to submit

well written code. See "Assessment - Code Review" for details of how

your code will be assessed.

Tokenisers

Background

The primary task of any language translator is to work out the

structure and meaning of an input in a given language so that an

appropriate translation can be output in another language. Think of

this in terms of a natural language such as English: when you attempt to

read a sentence you do not spend your time worrying about what

characters there are, how much space is between the letters or where

lines are broken. What you do is consider the words and attempt to

derive structure and meaning from their order and arrangement into

English language sentences, paragraphs, sections, chapters etc. In the

same way, when we attempt to write translators from assembly language,

virtual machine language or a programming language into another form,

we attempt to focus on things like keywords, identifiers, operators and

logical structures rather than individual characters.

The role of a tokeniser is to take the input text and break it up into tokens

(words in natural language) so that the assembler or compiler using it only

needs to concern itself with higher level structure and meaning. This

division of labor is reflected in most programming language definitions in

that they usually have a separate syntax definition for tokens and another

for structures formed from the tokens.

The focus of this assignment is writing a tokeniser to recognise tokens

that conform to a specific set of rules. The set of tokens may or may not

correspond to a particular language because a tokeniser is a fairly generic

tool. After completing this assignment we will assume that you know how

to write a tokeniser and we will provide you with a working tokeniser to use in

each of the remaining programming assignments. This will permit you to

take the later assignments much further than would be otherwise possible

in the limited time available.

Writing Your Program

You are required to complete the implementation of the C++

file tokeniser.cpp which is used to compile the program tokeniser. You

will implement a function, next_token(), that will read text character by

character using the static function nextch(), and return the next

recognised token in the input. The tokens that must be recognised in the

milestone and final submissions are specified in separate tables below.
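
The character-by-character approach described above is commonly built around a single lookahead character. The sketch below shows that pattern only: Scanner, Token, Kind and the WORD token are hypothetical placeholders, and this nextch() reads from a string rather than standard input, so it is not the nextch() supplied with the assignment. The real token rules are the ones given in the token tables.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <utility>

// Placeholder token kinds for this sketch only.
enum class Kind { WORD, EOI, BAD };

struct Token {
    Kind kind;
    std::string spelling;
};

constexpr int kEnd = -1;   // lookahead value used for end of input

// Hypothetical stand-in for the skeleton: scans a string, not stdin.
class Scanner {
public:
    explicit Scanner(std::string text) : text_(std::move(text)) { nextch(); }

    // Return the next token, leaving ch_ on the first unconsumed character.
    Token next_token() {
        // Skip whitespace before the start of the next token.
        while (ch_ == ' ' || ch_ == '\t' || ch_ == '\r' || ch_ == '\n') nextch();

        if (ch_ == kEnd) return {Kind::EOI, ""};

        std::string spelling;
        if (std::isalpha(ch_)) {
            // Keep consuming while the lookahead can extend the token.
            while (ch_ != kEnd && std::isalpha(ch_)) {
                spelling += static_cast<char>(ch_);
                nextch();
            }
            return {Kind::WORD, spelling};
        }

        // Any other character cannot start a token in this sketch.
        spelling += static_cast<char>(ch_);
        nextch();
        return {Kind::BAD, spelling};
    }

private:
    // Read one character into the lookahead, or kEnd at end of input.
    void nextch() {
        ch_ = pos_ < text_.size() ? static_cast<unsigned char>(text_[pos_++]) : kEnd;
    }

    std::string text_;
    std::size_t pos_ = 0;
    int ch_ = kEnd;
};

// Usage: the first token of "hello world" is the word "hello".
void example() {
    Scanner scanner("hello world");
    Token first = scanner.next_token();
    assert(first.kind == Kind::WORD && first.spelling == "hello");
}
```

Because the lookahead always holds the first character that has not yet been consumed, each call to next_token() can decide how to proceed by inspecting one character, without needing to push characters back onto the input.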

If you wish to write any of your code in separate .cpp or .h files, the

names of the additional files must all start with my. All files

matching my*.cpp will be automatically included when compiling

the tokeniser program. Your programs will be compiled using

the Makefile in the zip file attached below using the command:

% make

Note: Do not modify the Makefile or the subdirectories

includes and lib. They will be replaced during testing by the

web submission system.

Testing Your Program

For each file in the tests directory, the output of the tokeniser program

must match the corresponding .tokens output file. You must not produce

any output of your own. You can test your program against all of the

supplied tests using the command:

% make test

The testing will not show you any program output, just whether a

test passed or failed. If you want to see the actual output, the

commands used to run the tests are shown in double quotes ("). Simply

copy the commands and paste them into your shell.

The web submission system will test your program in exactly this way.

The key difference between your testing and the web submission testing

is that the web submission system has some secret tests that it will use.

If you want to try additional tests, just create some new files in

the tests sub-directory and generate the correct outputs using the

command:

% make test-new

This will increase the number of tests that will be run in the future.

Milestone Tokens

Your milestone submission will be marked using tests that require the

correct recognition of the following tokens:

Notes:

• all input must be read using the function nextch()

• if the end of input is reached, return the token EOI.

• if a character is found that cannot be part of a token and is not a

space " ", tab "\t", carriage return "\r" or newline "\n", return the

token BAD.

• letter, digit19 and digit are never returned as token classes

• all tokens must be contiguous characters in the input

• when searching for the start of the next token all spaces

and newlines are ignored

• in a definition the or operator | separates alternative components

Token            Definition                                                Example Token
IDENTIFIER       ::= letter ( letter | digit )*                            _he82mUch
INT              ::= '0' | ( digit19 digit* )                              17
SYMBOL           ::= ';' | ':' | '!' | ',' | '.' | '=' | '{' | '}' |       ;
                     '(' | ')' | '[' | ']' | '@'

Additional Rules Definition                                                Example Text
letter           ::= 'a'-'z' | 'A'-'Z' | '_'                               "C"
digit19          ::= '1'-'9'                                               "1"
digit            ::= '0'-'9'                                               "0"

• in a definition the round brackets ( ) which are not inside single

quotes are for grouping components of a token

• in a definition the square brackets [ ] indicate that the

enclosed components may appear 0 or 1 times

• in a definition the star character * indicates that the preceding

component of a token may appear 0 or more times
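
To show how the notation maps onto code, the sketch below checks whether a complete string matches one of the milestone definitions: alternatives separated by | become either/or tests, and a starred group becomes a loop. The function names are hypothetical, and a real tokeniser would apply the same tests while reading characters with nextch() rather than to whole strings.

```cpp
#include <cassert>
#include <string>

// Additional rules: these classify single characters and are never tokens.
bool is_letter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}
bool is_digit(char c) { return c >= '0' && c <= '9'; }
bool is_digit19(char c) { return c >= '1' && c <= '9'; }

// IDENTIFIER ::= letter ( letter | digit )*
bool is_identifier(const std::string& s) {
    if (s.empty() || !is_letter(s[0])) return false;   // leading letter
    for (std::size_t i = 1; i < s.size(); ++i) {       // ( ... )* is a loop
        if (!is_letter(s[i]) && !is_digit(s[i])) return false;
    }
    return true;
}

// INT ::= '0' | ( digit19 digit* )
bool is_int(const std::string& s) {
    if (s == "0") return true;                         // first alternative
    if (s.empty() || !is_digit19(s[0])) return false;  // second alternative
    for (std::size_t i = 1; i < s.size(); ++i) {       // digit*
        if (!is_digit(s[i])) return false;
    }
    return true;
}

// SYMBOL ::= any one of the listed single characters
bool is_symbol(char c) {
    return std::string(";:!,.={}()[]@").find(c) != std::string::npos;
}

// The example tokens from the milestone table match their definitions.
void check_examples() {
    assert(is_identifier("_he82mUch"));
    assert(is_int("17"));
    assert(is_symbol(';'));
}
```

Note that a string such as 007 does not match INT as a whole, because the only INT that may start with '0' is "0" itself.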

Final Submission Tokens

Your final submission will be marked using tests that require the correct

recognition of the following tokens in addition to the milestone tokens

listed above.

Additional Notes:

• this tokeniser does not ignore comments; it returns them as tokens

• single-line comments start with "//" and finish at the next newline

character "\n"

• multi-line comments start with "/*" and continue until the first "*/",

the shortest multi-line comment is "/**/"

• keyword tokens are only to be recognised by interpreting an

identifier token

• use the string_to_token() function to check if an identifier is

actually a keyword
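
The comment and keyword notes above can be sketched as follows. Everything named here is hypothetical: classify_word() stands in for string_to_token(), whose real signature comes with the assignment files, the keyword set is illustrative only, and the enum names are placeholders. Treating a one-line comment with no closing newline as BAD is also an assumption that should be checked against the supplied .tokens files.

```cpp
#include <cassert>
#include <set>
#include <string>

enum class Kind { IDENTIFIER, KEYWORD, ONELINECOMMENT, MULTILINECOMMENT, BAD };

// Hypothetical stand-in for string_to_token(): scan an identifier first,
// then check whether its spelling is a keyword.
Kind classify_word(const std::string& spelling) {
    static const std::set<std::string> keywords = {"if", "string"};
    return keywords.count(spelling) ? Kind::KEYWORD : Kind::IDENTIFIER;
}

struct Comment {
    Kind kind;
    std::string spelling;   // characters consumed, including delimiters
};

// Scan a comment that starts at s[pos]; s[pos] must be a '/' character.
Comment scan_comment(const std::string& s, std::size_t pos) {
    assert(pos < s.size() && s[pos] == '/');
    if (s.compare(pos, 2, "//") == 0) {
        std::size_t end = s.find('\n', pos);
        // Assumption: a missing newline makes the comment malformed.
        if (end == std::string::npos) return {Kind::BAD, s.substr(pos)};
        return {Kind::ONELINECOMMENT, s.substr(pos, end - pos + 1)};
    }
    if (s.compare(pos, 2, "/*") == 0) {
        // Search after the opening "/*" so that "/*/" is not accepted.
        std::size_t end = s.find("*/", pos + 2);
        if (end == std::string::npos) return {Kind::BAD, s.substr(pos)};
        return {Kind::MULTILINECOMMENT, s.substr(pos, end - pos + 2)};
    }
    return {Kind::BAD, s.substr(pos, 1)};   // a lone '/' is not a token
}
```

Starting the search for "*/" two characters after the opening delimiter is what makes "/**/" the shortest accepted multi-line comment.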

Token              Definition                                               Example Token
DOUBLE             ::= ( '0' | ( digit19 digit* ) ) [ '.' digit* ]          17.05
KEYWORD            ::= 'string'                                             if
ONELINECOMMENT     ::= '//' ( any character except newline ) newline        // oneliner
MULTI-LINE COMMENT ::= '/*' ( any characters up to the first '*/' ) '*/'    /* hello */

Tests

In addition to the test files in the zip file attached below, we will use a

number of secret tests that may contain illegal characters or character

combinations that may defeat your tokenisers. Note: these tests

are secret; if your programs fail any of these secret tests you will

not receive any feedback about these secret tests, even if you ask!
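
The DOUBLE definition has a few boundary cases that are worth checking by hand before relying on the tests. The sketch below scans the longest number starting at a given position. It assumes that a number without a '.' is reported as INT and one with a '.' as DOUBLE, and that the longest match wins (so "3." is one DOUBLE rather than an INT followed by the symbol '.'). Neither assumption is confirmed by the text above, and scan_number and Number are hypothetical names.

```cpp
#include <cassert>
#include <string>

enum class Kind { INT, DOUBLE };

struct Number {
    Kind kind;
    std::string spelling;
};

bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Scan the longest INT or DOUBLE starting at s[pos]; s[pos] must be a digit.
Number scan_number(const std::string& s, std::size_t pos) {
    assert(pos < s.size() && is_digit(s[pos]));
    std::size_t i = pos + 1;                          // '0' or digit19 consumed
    if (s[pos] != '0') {
        while (i < s.size() && is_digit(s[i])) ++i;   // digit*
    }
    Kind kind = Kind::INT;
    if (i < s.size() && s[i] == '.') {                // [ '.' digit* ]
        kind = Kind::DOUBLE;
        ++i;
        while (i < s.size() && is_digit(s[i])) ++i;
    }
    return {kind, s.substr(pos, i - pos)};
}
```

With these assumptions, scanning "007" from the start stops after the first "0", since a leading '0' cannot be followed by further digits within the same number.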