CS 415: Compilers
Project 2: A Pascal Subset Compiler Front-End
Due date: Wednesday, April 17 -- changed to April 20.
Announcement
- The user manuals for flex, yacc and bison can be found here.
- Project 2 Available (April/1/2016).
Pascal Subset
Using flex and bison, you are to write a parser and code generator for a subset of Pascal.
Programs consist of single blocks. There are no procedure
declarations. Base types are limited to integer and boolean.
Composite types are limited to single dimensional arrays of base types,
indexed by integers.
Only the following statements are included:
while-do, for (single assignment in body), if-then, if-then-else, assignment, write, and compound.
Operators are restricted
to arithmetic, logical, and relational.
The grammar that we are using
for this language is as follows:
| start |
::= |
program ID ; block . |
| block |
::= |
variables cmpdstmt |
| variables
| ::=
| var vardcls | empty string
|
| vardcls
| ::=
| vardcls vardcl ; | vardcl ;
|
| vardcl
| ::=
| IDlist : type
|
| IDlist
| ::=
| IDlist , ID | ID
|
| type
| ::=
| array [ integer_constant .. integer_constant ] of stype
|
| | | | stype
|
| stype
| ::=
| integer | boolean
|
| stmtlist
| ::=
| stmtlist ; stmt | stmt
|
| stmt
| ::=
| ifstmt | astmt | wstmt | fstmt |
cmpdstmt | writestmt
|
| wstmt
| ::=
| while condexp do stmt
|
| fstmt
| ::=
| for ID := constant ,
constant do stmt
|
| ifstmt
| ::=
| ifhead then stmt else stmt | ifhead then stmt
|
| ifhead
| ::=
| if condexp
|
| cmpdstmt
| ::=
| begin stmtlist end
|
| writestmt
| ::=
| writeln ( exp )
|
| astmt
| ::=
| lvalue := exp
|
| exp
| ::=
| rvalue | exp + exp | exp - exp | exp * exp
|
|
| |
| exp and exp | exp or exp | exp
exor exp | not exp
|
|
| |
| ( exp ) | constant
|
| condexp
| ::=
| exp != exp | exp == exp | exp <
exp | exp <= exp | ID |
boolean_constant
|
| lvalue
| ::=
| ID | ID [ exp ]
|
| rvalue
| ::=
| ID | ID [ exp ]
|
| constant
| ::=
| integer_constant | boolean_constant
|
| integer_constant
| ::=
| ICONST
|
| boolean_constant
| ::=
| true | false
|
Front-End: Project Description
As part of this project, you will
- Augment the scanner to recognize the key words (tokens) while
, do , for , and boolean ,
- Write the parser and detect and report at least the following
syntax errors by inserting appropriate error productions :
- "\n***Error: illegal variable declaration\n"
- "\n***Error: illegal statement\n"
- "\n***Error: illegal expression\n"
- "\n***Error: illegal conditional expression\n"
Note: Reporting the same error occurrence
multiple times is okay.
The grammar given above is ambiguous. However, we used features of bison
to deal with operator precedence and associativity to
resolve the ambiguity (read the bison manual here).
- Augment your parser to perform semantic analysis, in particular type checking.
This will include the implementation a single-level symbol table.
Your type checker has to detect and report at least the
following semantic errors:
- Declarations
"\n***Error: duplicate declaration of %s\n"
- Array declaration
\n***Error: lower bound exceeds upper bound\n"
- If stmt
"\n***Error: exp in if stmt must be boolean\n"
- While stmt:
"\n***Error: exp in while stmt must be boolean\n"
- Writeln stmt (Note: accepts only integer values)
"\n***Error: illegal type for write\n"
- Assignment stmt
"\n***Error: assignment types do not match\n"
"\n***Error: assignment to whole array\n"
- Array subscript operation id[exp] in exp
"\n***Error: subscript exp not type integer\n"
"\n***Error: id %s is not an array\n"
- for statement
"\n***Error: lower bound not integer constant\n"
"\n***Error: upper bound not integer constant\n"
"\n***Error: lower bound exceeds upper bound\n"
"\n***Error: induction variable not scalar integer variable\n"
- Identifier
"\n***Error: undeclared identifier %s\n"
- Arithmetic, logical, and relational operators in exp
"\n***Error: types of operands for operation %s do not match\n";
Specific Details
Names are case sensitive .
Efficiency is not the main concern of this project. You may use a
simple unordered list to implement your symbol table. If you design
your symbol table interface well, it should be simple to replace the
underlying table implementation with a hashing scheme later.
Logical operators (and, or, exor, not) require boolean arguments. The
arithmetic and relational operators (e.g., +) require integer arguments.
Integer and boolean arguments cannot be mixed.
The equality/inequality operators (==, !=) apply to any type, as long as
the two argument types match.
Getting Started
The following code is provided as a starting point for your
project. The files listed below live on the ilab cluster in
subdirectory
~zz124/cs415_2016/projects/proj2 . Please copy the
files
into your subdirectory of choice using the command
cp -r ~zz124/cs415_2016/projects/proj2 myProjectSubdirectory .
You may modify any of these files for your project.
- declaration of types and attribute operations:
attr.h and attr.c
There is nothing useful yet in these files.
- declaration of symbol table and symbol table operations:
symtab.h and symtab.c
There is nothing useful yet in these files.
- Scanner: scan.l (flex)
- Parser: parse.y (bison)
- Makefile
- demo-simple, a program
that can be successfully parsed with the partial parser that can
be generated from the provided code.
- ReadMe files contains a few reminders
In order to test your compiler, we provide ten test cases in the "testcases" subfolder of proj2. We
will use more hidden test cases for grading purpose.
You can generate an executable called parser by
typing make . The parser expects the input on
stdin, i.e., you can call the parser on an input program as
follows: ./parser < demo-simple .
Due Date
The project is due at 11:59 pm on April 17,
2016--that is, one minute before midnight.
If you have specific, overriding,
personal reasons why these dates are unreasonable, you should discuss
them directly with the instructor before the deadline.
Submission Procedure
Please submit a single tar-file with all your source files, including scan.l, attr.h, attr.c, symtab.h,
symtab.c, parse.y, ReadMe, to Sakai. The submission window for project II is open. Your ReadMe
file may contain comments that you want the grader to know about. Do a "make clean" before tar-ing
your files. Do not submit your parser as an executable.
We will use the following late policy : You will receive a penalty of 20% of
your overall grade for each day late. A maximum of three days is allowed.
It is your responsibility to verify
that your files have been submitted correctly .
Grading Criteria
The project will be mainly graded on functionality. You will receive
no credit for the entire project if your
code does not compile and run on the ilab
cluster .
We may use automatic testing tools, so make sure that your error messages are
exactly as described above.