CS 515: Programming Languages and Compilers I
Fall 2011, Project 1
An Optimizing Compiler for a Small Imperative Language
DUE DATE: Tuesday, October 18, at 11:59pm EST
Modifications and Clarifications
- The sampleCodegen compiler does not handle repeat-until statements.
Project Description: PART I
THIS IS NOT A GROUP PROJECT!
Every student is expected to work on his/her own project. You may
discuss overall design issues with your fellow students. Detailed
discussions and code sharing are not allowed. The
general rules for academic integrity apply.
Write an SDT (syntax directed translation) scheme to generate code for
the simple imperative language shown below. The language
does not contain any procedures, but only a single main program.
Base types are limited to integer only. Arrays are one-dimensional or
two-dimensional, with integer as both their component type and their index type.
The following statements are included:
for-do, repeat-until, if-then, if-then-else, assignment, write, and compound statement.
Operators are restricted to arithmetic and relational.
The grammar that we are using for this language is as follows:
start      ::=  program ID ; block .
block      ::=  variables cmpdstmt
variables  ::=  var vardcls | empty string
vardcls    ::=  vardcls vardcl ; | vardcl ;
vardcl     ::=  IDlist : type
type       ::=  integer
             |  array[ ICONST ] of integer
             |  array[ ICONST, ICONST ] of integer
IDlist     ::=  IDlist , ID | ID
stmtlist   ::=  stmtlist ; stmt | stmt
stmt       ::=  ifstmt | fstmt | rstmt | astmt | writestmt | cmpdstmt
cmpdstmt   ::=  begin stmtlist end
writestmt  ::=  writeln ( exp )
ifstmt     ::=  ifhead then stmt else stmt | ifhead then stmt
ifhead     ::=  if condexp
fstmt      ::=  for ctrlexp do stmt
rstmt      ::=  repeat stmt until condexp
ctrlexp    ::=  ID := ICONST, ICONST
astmt      ::=  lhs := exp
lhs        ::=  ID
             |  ID [ exp ]
             |  ID [ exp, exp ]
exp        ::=  exp + exp
             |  exp - exp
             |  exp * exp
             |  ID
             |  ID [ exp ]
             |  ID [ exp, exp ]
             |  ICONST
condexp    ::=  exp != exp | exp == exp | exp < exp | exp <= exp
You may assume that the program is correct in terms of static
semantics, i.e., no semantic analysis (type checking) is required.
You will write a syntax directed translation scheme that will generate
ILOC code for the above language. You may test the correctness of your
generated ILOC code by running it on the ILOC simulator sim provided in
directory ~uli/cs515/projects/proj1/ILOCsimulator/src on the
ilab machines. This directory
also contains the source code of the ILOC simulator.
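As a rough illustration of the synthesized-attribute idea behind such a scheme, the sketch below walks a tiny expression tree bottom-up and returns, for each subexpression, the virtual register that holds its value. The Exp type and the helpers fresh_reg and gen_exp are hypothetical stand-ins and are not part of the provided files; in the project itself this work would happen inside bison actions using emit and NextRegister. The sketch also assumes that a scalar variable is reached as a byte offset from r0, and the exact textual form of the instructions should be checked against what sim accepts.

```c
#include <stdio.h>

/* Hypothetical expression node: a constant, a scalar variable, or a binary op. */
typedef enum { EXP_CONST, EXP_VAR, EXP_ADD, EXP_SUB, EXP_MULT } ExpKind;

typedef struct Exp {
    ExpKind kind;
    int value;               /* constant value, or byte offset of a variable from r0 */
    const struct Exp *left;  /* operands of a binary op, NULL otherwise */
    const struct Exp *right;
} Exp;

static int reg_counter = 0;  /* r0 is reserved for the static-area base address */

/* Stand-in for NextRegister: hands out a fresh virtual register number. */
static int fresh_reg(void) { return ++reg_counter; }

/* Prints ILOC for e to out and returns the register holding its value.
   Every value gets its own register (register-register model). */
static int gen_exp(const Exp *e, FILE *out) {
    int target;
    if (e->kind == EXP_CONST) {
        target = fresh_reg();
        fprintf(out, "\tloadI %d => r%d\n", e->value, target);
        return target;
    }
    if (e->kind == EXP_VAR) {
        target = fresh_reg();
        fprintf(out, "\tloadAI r0, %d => r%d\n", e->value, target);
        return target;
    }
    /* Binary operator: generate both operands first, then combine them. */
    int l = gen_exp(e->left, out);
    int r = gen_exp(e->right, out);
    const char *op = e->kind == EXP_ADD ? "add"
                   : e->kind == EXP_SUB ? "sub" : "mult";
    target = fresh_reg();
    fprintf(out, "\t%s r%d, r%d => r%d\n", op, l, r, target);
    return target;
}
```

For the expression a + 2, with a assumed to live at offset 0, the sketch prints a loadAI into r1, a loadI into r2, and an add of r1 and r2 into r3.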
Code Shape Requirements
- Your code should use the register-register model that exposes
the maximal opportunities for register allocation. In other words,
each new value should reside in a separate virtual register. The
function NextRegister will return a new (fresh) register number.
- The first element of an array a is a[0] (one-dimensional)
or a[0,0] (two-dimensional).
All addresses are byte addresses and an integer value is stored
in a 4 byte word. The data layout for two-dimensional arrays should
be column-major order. The overall available memory is set to 20,000
bytes. For instance, if you specify array x[100,100] of integer,
which needs 100 * 100 * 4 = 40,000 bytes, the simulator will complain!
- All variables are statically allocated, i.e., there is no need for
activation records. The static area starts at
memory location 1024. Addresses above are reserved for register
spilling. The register r0
should contain the starting address (namely 1024) of the static area
during program execution.
- You may only use the following ILOC instructions. All these
instructions are implemented in sim, our ILOC simulator.
A separate table lists the ILOC
instructions and their semantics.
- no operation: nop .
- arithmetic: addI, add, subI, sub, mult .
- memory: load, loadI, loadAO, loadAI, store, storeAO, storeAI .
- control flow: br, cbr, cmp_LT, cmp_LE, cmp_EQ, cmp_NE,
cmp_GT, cmp_GE .
- I/O: output .
Please see files instrutil.h and instrutil.c for
the definitions of procedures/functions emit, emitComment,
NextRegister, and NextLabel.
- You may want to generate nop instructions as targets of
branches and conditional branches, e.g., L1: nop.
- The evaluation of an exp will always result in an integer
value, while the evaluation of a condexp will always
result in a boolean (0 or 1) value. An ILOC cmp_ instruction
writes a boolean value into its target register.
- The function NextLabel will generate a new (fresh) label each time
it is called.
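The addressing rules above (byte addresses, 4 byte words, first element at index 0, column-major layout) can be sketched as plain offset arithmetic. The function names below are hypothetical, and the sketch assumes that the first ICONST in array[ ICONST, ICONST ] is the number of rows, i.e., the range of the first index. The compiler would emit ILOC that performs the same arithmetic at run time and adds the array's own offset within the static area.

```c
enum { WORD_SIZE = 4 };  /* an integer occupies a 4 byte word */

/* Byte offset of a[i] from the start of a one-dimensional array a. */
static int offset_1d(int i) { return WORD_SIZE * i; }

/* Byte offset of a[i,j] from the start of a two-dimensional array declared
   as array[rows, cols] (assumed meaning of the two constants).
   Column-major order stores each column contiguously, so moving to the
   next column skips `rows` elements. */
static int offset_2d(int i, int j, int rows) {
    return WORD_SIZE * (j * rows + i);
}

/* Number of bytes occupied by an array[rows, cols] of integer. */
static int array_bytes(int rows, int cols) {
    return WORD_SIZE * rows * cols;
}
```

With these definitions, array_bytes(100, 100) is 40,000, which is more than the 20,000 bytes of memory the simulator provides.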
Project Description: PART II
Implement an optimization pass that performs local common subexpression
elimination (CSE) at the ILOC instruction level, not the
source level. Local CSE works on basic blocks.
You may generate a separate data structure to store the generated
instructions for a basic block, or you can perform CSE "on
the fly", i.e., generate optimized code using syntax-directed
translation.
HINT: If you use new virtual registers for every value that is
computed in ILOC, you may use these virtual registers as value
numbers, i.e., unique identifiers of computed values.
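One possible reading of this hint is sketched below: because every register is written exactly once, the triple (opcode, source register, source register) identifies a computed value, and a small table of such triples is enough to detect repeats within a basic block. All names here (AvailExpr, cse_start_block, cse_lookup_or_add, next_reg) are hypothetical and not part of the provided files. The sketch only covers pure arithmetic instructions; loads need extra care because an intervening store may change memory.

```c
#include <string.h>

#define MAX_AVAIL 256

/* One available expression: "op src1, src2 => result". */
typedef struct {
    char op[8];
    int src1, src2;
    int result;
} AvailExpr;

static AvailExpr avail[MAX_AVAIL];
static int avail_count = 0;
static int next_reg = 0;   /* stand-in for the counter behind NextRegister */

/* Call at every basic-block boundary (label or branch): local CSE must not
   reuse values across blocks. */
static void cse_start_block(void) { avail_count = 0; }

/* Returns the register holding "op src1, src2". Sets *reused to 1 when an
   earlier instruction in the same block already computed it, so no new
   instruction is needed. Otherwise allocates a fresh register, records the
   expression, and sets *reused to 0 (the caller would then emit the
   instruction). Operand order matters: no commutativity is used. */
static int cse_lookup_or_add(const char *op, int src1, int src2, int *reused) {
    for (int i = 0; i < avail_count; i++) {
        if (strcmp(avail[i].op, op) == 0 &&
            avail[i].src1 == src1 && avail[i].src2 == src2) {
            *reused = 1;
            return avail[i].result;
        }
    }
    int result = ++next_reg;
    if (avail_count < MAX_AVAIL) {   /* when full, simply stop recording */
        AvailExpr *e = &avail[avail_count++];
        strncpy(e->op, op, sizeof e->op - 1);
        e->op[sizeof e->op - 1] = '\0';
        e->src1 = src1;
        e->src2 = src2;
        e->result = result;
    }
    *reused = 0;
    return result;
}
```

A second request for add r1, r2 inside the same block returns the register of the first one with *reused set to 1, while the same request after cse_start_block allocates a new register.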
The CSE optimizer should be invoked using the -O option, i.e.,
executing ./codegen -O < testcases/demo1 will run your optimizing compiler
on the demo1 input file. The sample solution has been extended to
include a CSE optimizer. The solution does not consider
any arithmetic properties during CSE detection, such as the
commutativity of the operators "+" and "*".
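To make the last point concrete: without commutativity, add r1, r2 and add r2, r1 count as two different expressions. The hypothetical helpers below show one common way an optimizer could treat them as equal, by putting the operands of commutative opcodes into a fixed order before comparing. The sample solution does not do this, and nothing in the project description requires it.

```c
#include <string.h>

/* Hypothetical helper: 1 if the opcode's result does not depend on operand order. */
static int is_commutative(const char *op) {
    return strcmp(op, "add") == 0 || strcmp(op, "mult") == 0;
}

/* Puts the operands of a commutative opcode into ascending register order,
   so that "add r2, r1" and "add r1, r2" yield the same lookup key.
   Non-commutative opcodes such as sub are left untouched. */
static void canonicalize(const char *op, int *src1, int *src2) {
    if (is_commutative(op) && *src1 > *src2) {
        int tmp = *src1;
        *src1 = *src2;
        *src2 = tmp;
    }
}
```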
How To Get Started
The following code is provided as a starting point for your
project. Please copy
the files from the directory ~uli/cs515/projects/proj1 on the
ilab machines. You can also click on the links below and
copy the files one at a time.
- Scanner: scan.l (flex). You will need to
add tokens for the repeat-until construct.
- Parser/Optimizer/Code Generator: parse.y
(bison). Here is where most of your code will go. It contains an
example of how to use procedure emit to generate code. You will
need to remove this in your final version, i.e., it has only been
inserted as an illustration.
- attr.h and attr.c.
You will need to define new attribute(s).
- symtab.h and symtab.c.
These need to be modified.
- instrutil.h and instrutil.c.
- Makefile
For the CSE optimizer, you may want to add additional files. You need
to make appropriate changes in the Makefile.
In order to get started on testing your compiler, you can use the
following test cases.
This is just a tentative list of source codes and their
generated sample ILOC code using our
code generator sample solution
(sampleCodegen). We will use many more test cases to
grade your project. There are many ways of generating
correct code, so our sampleCodegen compiler gives you only
an overall idea of what needs to be done. The sample code generator
does not perform any CSE optimizations.
The generator
can be found in directory ~uli/cs515/projects/proj1 on the
ilab machines.
- Basic straight line code:
- Basic code with control flow:
- demo2 ( demo2.out )
- demo3 ( demo3.out )
- Basic code with control flow and array references:
- demo4 ( demo4.out )
- demo5 ( demo5.out )
- demo6 ( demo6.out )
- Optimized code ( codegen -O ) with control flow and array references:
- demo7 ( demo7.out )
- demo8 ( demo8.out )
You can generate an executable called codegen by
typing make. The parser/code generator expects the input on
stdin, i.e., you can call the parser on an input program as
follows: codegen < demo1. The parser/code generator writes
the resulting ILOC code into file iloc.out.
Due Date
See the top of the page.
If you have specific, overriding,
personal reasons why these dates are unreasonable, you should discuss
them directly with the instructor before the deadline.
Submission Procedure
Please submit a single tar-file with all your source files, including
the ReadMe file. Your ReadMe file may contain
comments that you want the grader to know about. Do a "make clean"
before tar-ing your files.
Do not submit your compiler as an executable. Do not
submit the simulator or the provided sample solution.
We have to be
able to recreate your compiler on the ilab machines by saying "make".
Your submitted Makefile has to reflect the appropriate changes.
We will use the following late policy: You will receive a penalty of
20% of your overall grade for each day late. A day is a working day
(Monday through Friday).
Grading Criteria
The project will be mainly graded on functionality. You will receive
no credit for the entire project if we cannot
recreate (make) your compiler or your compiler does not run on any
of our test codes.