MUST FIT留学生辅导、辅导C/C++、讲解C/C++编程、辅导arithmetic
- 首页 >> C/C++编程Homework 1 – A Compiler for Simple Expressions
Zhiyao Liang
MUST FIT
zyliang@must.edu.mo
Wednesday 31 October 2018
Abstract
The task of this assignment is to implement a compiler that can
evaluate simple arithmetic expressions. [2] [1].
1 Preparation
You need to have some programming environment to edit, compile and run
C programs.
2 The ExprS language
We define a language called Expr–Simple (ExpS for short) in this section.
ExpS can be described in English as follows:
The picture above is found from the web, which is an example of simple math expression.
It is powerful, isn’t it?
1
An ID (variable) is a sequence of English letters.
A number is sequence of digits, possibly followed by a negative sign
. Only integers are allowed.
An expression is any composition of number, ID, and the 4 arithmetic
operators: + - * /, and parenthesis: (). The meaning of an expression
is the same as the arithmatic expression that we learned in middle
school. I.e., considering precedence, * and / are below (), and above
+ and -. The associativity of the operators is from left to right. The
value of an expression is an integer. Especially a division returns an
integer value, just like in C. For example, 7/3 is 2.
A statement is either an assignment statement or expression statement.
– An assignment as the form x = expr, followed by a carriage return (Return)
or end of file (EOF). That is, a variable name and an expression
appearing on the LHS and RHS of an assignment operator respectively.
This assignment declares a new variable x if x has not been introduced
before. Otherwise, if x is introduced earlier, it updates the value of x.
– An expression statement is an expression followed by Return or EOF.
Excution output. For an expression statement, the value of the expression is
printed out. For an assignment statement the output is optional: nothing,
or the value of the RHS expression is printed out.
Error report. When some lexical or grammar error is found, or a divisor is
found to be 0, an error message is printed out accordingly.
Runtime Environment. It covers two parts:
1. During the runtime, a datastructure is maintained that can support the
following operations efficiently:
– Record the names and values of all variables that are declared
(appeared on LHS of some assignment).
– Check if a name is declared or not.
– Obtain the value of a name.
– Update the value of a name.
2. Some method to evaluate any expression (obtain the value). It is based
on this datastructure (just mentioned), and the parse tree provided by
a parser) of the expression to be evaluated.
2
2.1 Lexical rules of ExpS
We use regular expression to describe the tokens. Here is some explanation
of notations:
Alternatives (logical or) is represented by a vertical bar |, instead of
the + sign.
Concatenation is represented by a center dot ·, instead of a space.
Three dots . . . means a range; E.g., a . . . z means all lowercase letters.
The tokens of ExpS are defined as follows by regular–expressions. A superscript
star
represents the kleen star.
Assign : =
Plus : +
M inus :
T imes :
Divid : /
LP aren : (
RP aren : )
Digit : 0 . . . 9
N atural : Digit · Digit?
Number : N atural | M inus · N atural
Letter : a . . . z | A . . . Z
ID : Letter · Letter?
Table 1: the lexical rules of tokens
2.2 Tokens of ExprS
The token types of Exps could be all of the names on the LHS of the rules
in Table 2.1, except Digit and Letter. Some additional token types could
be added, for example:
Return: It represents the carriage return appearing after each statement.
EOF: It means end of file. An extension of the compiler is to read a
script file of ExpS, which is a file containing a sequence of ExpS statements.
When the end of file is reached, an EOF token is generated.
QUIT: When this token appears, the program finishes.
3
3 Grammar of ExprS
The grammar of ExprS can be described by the following simple grammar. A
name starting with a lowercase letter is a variable, otherwise it is a constant
(token).
program → statementList
statementList → statement | statement · statementList
statement → assignment · end | exprssion · end
end → Return | EOF
assignment → ID · assign · expression
expression → Number | ID |
LP aren · expression · RP aren |
expression · operator · expression
operator → Plus | M inus | T imes | Divid
Note that the grammar rules in Table 3 is not ideal for writing a recursive–
descendent parser, since it does not represent the precedence and associativity
of operators.
4 A sample run of the calculator program
The following is what appears on screen in a sample run. A line starts with
> is a response from the caculator program, otherwise, it is a line typed by
a user.
> welcome to the caculator
x =3
x+3
> 6
y = 8
x+y/2
> 7
((3+5)/6)*(x+y/x+(3+4))
> 12
3+
> syntax error ...
6/0
> runtime error, divide by 0
3+!
4
> scanner error, ! is a problem
x+z
> z is not found
quit
> bye
5 Tasks to do
1. (5%) Adjust the grammar rules of ExprS so that it can describe the
priority and associativity of operators. Such a grammar can support
recursive–descendant parsing easily. Describe the grammar rules in a
separate text file or as comments in your source code.
2. Design and implement a compiler that covers the following parts:
(30%) A scanner of ExprS: Given a statement (just a statement)
of ExprS, the scanner generates a linked list of tokens.
(45%) A parser of ExprS: Given a linked list of tokens, the parser
generates a parse tree.
(15%) The runtime environment of ExprS, which is described in
Section 2.
(10%) Some code to test the compiler and interact with a user,
as shown in Section 4.
Note that there is no need to generate some target code.
3. Write a program that tests and uses this compiler. The program works
as an interpretor, like calculator that is described in Section 4.
5.1 Bonus
The following tasks will be considered as extensions and will receive bonus.
(5%) Using JFlap to draw a DFA that can describe the computation
of a scanner of ExprS. Use the notations described in class to mark
the edges of the DFA. Attached the JFlap file to the email.
5
(5%) Allowing floating point numbers (numbers with decimal points).
The value of variables and expressions can be floating points. Especially,
you can allow division of floating point numbers, just like C.
For example, the value 6/4 is 1, while 6.0/4 is 1.5.
(5%) Allowing = to be an operator, just like the C programming language.
For example: The value of x=(y=z=6)*5 is 30. Only variable names can
appear on the LHS of =. The precedence of = is below all other operators.
You need to adjust the grammar accordingly.
(5%) Allowing a program of ExprS to be saved in some script file. A statement
of
load filename
will load and run the file with the filename; all statements in the file will be
executed in batch. When running a script file, the execution may stop when
an error is found.
6 Submitting your homework
Due time: 10 December 2018 Monday, 7:00pm.
At most 3 students can form a group to do this homework together.
At the top of the every text file that you changed or created (maybe
just the .c and .h files), record your name (if mutiple students forma
group, provide all the names) and date as comments.
Please attach all of your source code files to the email, possibly zip
them into a single file. Any information about how to run and understand
your code (a readme file) will be helpful.
Attention: Do not attach any binary files (.o, .obj, .exe, .out), since
google will consider them virus. Just send the source code and textual
documents.
Send your email to:
zyliangstu@gmail.com
The title (subject) of your email should be like:
[name][hmkNum info]
6
Name can be in Chinese or English (Pingying). For example, if a
student wants to submit homework 5, whose name Shan Li, then the
title of the email should be:
[Li, Shan][hmk1 expression]
If it is a group work, include all members’ name in the title.
References
[1] Andrew W. Appel. Modern Compiler Implementation in C: Basic Techniques.
Cambridge University Press, New York, NY, USA, 1997.
[2] Kenneth C. Louden. Compiler Construction Principles and Practice.
PWS Publishing Company, 1997. ISBN 0-534-93972-4
http://www.cs.sjsu.edu/~louden/cmptext/.