CS 515: Programming Languages and Compilers I
Fall 2011, Project 3
A Vectorizing Compiler
Due date: Saturday, December 17, at 11:59pm EST


Modifications and Clarifications

Project Description

In this project, you will write a vectorizing compiler in C++. The input language is C. An input program consists of a single, singly-nested for loop. The loop body contains a sequence of assignments, i.e., the loop body is a single basic block. Left-hand sides of these assignments are always 1-dimensional array references, and right-hand sides may contain 1-dimensional array references or integer constants. The only integer variable in the program is the loop induction variable. Arrays are implicitly declared, and their size is assumed to match their use within the for loop. In other words, there is no need to worry about out-of-bounds accesses. You can assume that in the for loop header, the loop bounds are constant, and the lower bound assigned to the induction variable is never greater than the upper bound. The step is always 1, i.e., the induction variable is always incremented by 1 (e.g.: i++). Finally, every array index expression is either i, i+c, i-c, or c, where i is the loop induction variable and c is an integer constant. All these restrictions are there to make the basic compilation process easier so that we can concentrate on the automatic vectorization algorithm.

The basic vectorization algorithm has the following steps

Examples

  1. Single statement SCC; no dependences
         for (i=0; i<100; i++) {
    S1:    a[i] = a[i] + 1;     is transformed to      S1:   a[0:99] = a[0:99] + 1;
         }
    
    Since our system cannot generate vector statements, your compiler should generate the following code, where vector statements are represented as single-statement loops with the statement flagged as vectorizable. Sequential loops remain unchanged, i.e., are not flagged.
         for (i=0; i<100; i++) {
           /* vector statement */ a[i] = a[i] + 1;  
         }
    
  2. Multiple statement SCC; loop carried dependences; code generation in topological order
         for (i=1; i<99; i++) {                                  S2:  b[1:98] = (c[1:98] + b[2:99]) / 2; 
    S1:    a[i] = b[i-1] + c[i-1] + 5;                                for (i=1; i<99; i++) {
    S2:    b[i] = (c[i] + b[i+1]) / 2;   is transformed to       S1:    a[i] = b[i-1] + c[i-1] + 5; 
    S3:    c[i] = a[i] + 1;                                      S3:    c[i] = a[i] + 1;
         }                                                            }
    
    UPDATED OUTPUT CODE - NO LOOP DISTRIBUTION
    The code that your compiler should generate is:
         for (i=1; i<99; i++) {
           /* vector statement */ b[i] = (c[i] + b[i+1]) / 2;
           a[i] = b[i-1] + c[i-1] + 5;
           c[i] = a[i] + 1;
         }
    
  3. Dependence testing that considers loop bounds
         for (i=1; i<99; i++) {
    S1:    a[i] = a[i] + a[0] + a[99];    is transformed into      S1: a[1:98] = a[1:98] + a[0] + a[99];
         }
    
    The code that your compiler should generate is:
         for (i=1; i<99; i++) {
           /* vector statement */ a[i] = a[i] + a[0] + a[99];
         }
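
    The bounds-aware test in example 3 can be illustrated with a minimal sketch. It is not the provided "dependent" function; the struct "Subscript" and the function "mayOverlap" are hypothetical names introduced here, and the sketch assumes subscripts of the form coef*i + offset with coef equal to 0 or 1, which covers i, i+c, i-c, and c.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for a subscript of the form coef * i + offset,
// where coef is 1 (for i, i+c, i-c) or 0 (for a constant index c).
struct Subscript {
    int coef;
    int offset;
};

// Returns true if some pair of iterations i1, i2 in [lo, hi] (inclusive)
// makes the two subscripts refer to the same array element.
// A result of false proves that the two references are independent.
bool mayOverlap(const Subscript& s1, const Subscript& s2, int lo, int hi) {
    assert(lo <= hi);
    assert((s1.coef == 0 || s1.coef == 1) && (s2.coef == 0 || s2.coef == 1));
    if (s1.coef == 0 && s2.coef == 0)   // c1 versus c2
        return s1.offset == s2.offset;
    if (s1.coef == 1 && s2.coef == 1)   // i1 + c1 == i2 + c2
        return std::abs(s1.offset - s2.offset) <= hi - lo;
    // Exactly one subscript depends on i: solve i + c == k for i
    // and check that the solution lies within the loop bounds.
    const Subscript& var = (s1.coef == 1) ? s1 : s2;
    const Subscript& con = (s1.coef == 1) ? s2 : s1;
    const int i = con.offset - var.offset;
    return lo <= i && i <= hi;
}
```

    For the loop in example 3 (lo = 1, hi = 98), comparing a[i] with a[0] or a[99] requires i = 0 or i = 99, which lie outside the bounds, so neither pair can overlap.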
    

Strongly Connected Components

You will need to find the strongly connected components (SCCs) of the dependence graph and compute a topological ordering of the SCCs. For example, the topological ordering of the SCCs of the following graph is:

1. [ 1 ]
2. [ 2 3 4 ]
3. [ 5 6 ]
               __________
              /          v
1 --> 2 ---> 3 --> 5 --> 6
      ^     /      ^____/
      |    /
      4 <-'
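
One possible way to compute this ordering is Tarjan's algorithm, sketched below. The handout does not prescribe a particular algorithm, so this is only one option; the names "Graph", "sccTopoOrder", and "exampleGraph" are hypothetical, and nodes are assumed to be numbered from 0 (node k of the figure is stored at index k-1).

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Adjacency list: adj[u] lists the successors of node u (0-based).
using Graph = std::vector<std::vector<int>>;

// Returns the SCCs of g in a topological order of the condensed graph.
// Tarjan's algorithm completes components in reverse topological order,
// so the list is reversed before it is returned.
std::vector<std::vector<int>> sccTopoOrder(const Graph& g) {
    const int n = static_cast<int>(g.size());
    std::vector<int> index(n, -1), low(n, 0), stack;
    std::vector<bool> onStack(n, false);
    std::vector<std::vector<int>> sccs;
    int next = 0;

    std::function<void(int)> visit = [&](int u) {
        index[u] = low[u] = next++;
        stack.push_back(u);
        onStack[u] = true;
        for (int v : g[u]) {
            assert(v >= 0 && v < n);
            if (index[v] < 0) {            // tree edge: recurse first
                visit(v);
                low[u] = std::min(low[u], low[v]);
            } else if (onStack[v]) {       // edge back into the current stack
                low[u] = std::min(low[u], index[v]);
            }
        }
        if (low[u] == index[u]) {          // u is the root of a component
            std::vector<int> comp;
            int v;
            do {
                v = stack.back();
                stack.pop_back();
                onStack[v] = false;
                comp.push_back(v);
            } while (v != u);
            std::sort(comp.begin(), comp.end());
            sccs.push_back(comp);
        }
    };

    for (int u = 0; u < n; ++u)
        if (index[u] < 0) visit(u);
    std::reverse(sccs.begin(), sccs.end());
    return sccs;
}

// The graph from the figure above; node k is stored at index k-1.
Graph exampleGraph() {
    return {
        {1},        // 1 -> 2
        {2},        // 2 -> 3
        {3, 4, 5},  // 3 -> 4, 3 -> 5, 3 -> 6
        {1},        // 4 -> 2
        {5},        // 5 -> 6
        {4},        // 6 -> 5
    };
}
```

Calling sccTopoOrder(exampleGraph()) yields the components [0], [1 2 3], [4 5] in that order, which corresponds to [ 1 ], [ 2 3 4 ], [ 5 6 ] in the numbering of the figure.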

Provided infrastructure

As with the other projects, we use the machines in the ilab cluster. The ilab cluster page contains the listing of valid hostnames available for this project. Be aware that ilab.rutgers.edu, cereal.rutgers.edu, pasta.rutgers.edu and soup.rutgers.edu are NOT to be used. Remember to use a type of cereal (trix.rutgers.edu), a type of pasta (macaroni.rutgers.edu), etc. You have the same home directory across all machines of the ilab cluster.

For this project, you are given C++ code that uses the LLVM libraries to traverse a C source file and extract information about a FOR loop. You will use this information to do your dependency analysis to determine if and how the assignment statements in the loop can be vectorized.

BUILD INSTRUCTIONS

You can copy all provided code files into your project subdirectory, for example "myProject", by using the following UNIX/Linux command on any ilab machine:

  cp -r  ~uli/cs515/projects/proj3 myProject 

To build the project, you need to add the path to the LLVM executables to your iLab PATH environment variable. In bash, this is done as follows:

  export PATH=$PATH:/ilab/users/jasperry/llvm-2.9/bin

If you are using tcsh (which may still be the default on the iLab machines), then it looks like this:

  setenv PATH ${PATH}:/ilab/users/jasperry/llvm-2.9/bin

These executables are used to automatically generate all the compiler flags, as you can see from the Makefile. So after setting the path, all you need to do is type "make" in the project directory to build the executable "analyze". This program will read the source file "in.c", print some informational output to the terminal, and rewrite the source code into "out.c".

IMPORTANT: Make sure that your input source file contains no #includes, only one main() function, and a SINGLE for loop in the body. You can follow the pattern of the provided "in.c".

How to get started

The main function is found in "analyze.cpp". It creates an object of type LoopAnalyzer, which is derived from LLVM's ASTConsumer class. LoopAnalyzer has a method "processForLoop" that does most of the work. You don't need to know anything about the internals of LoopAnalyzer; you just call "parse" as in the sample code. After your analysis is finished, you need to call the method "writeSource" to write out the final result. This is also provided in the main function of "analyze.cpp".

You need to know primarily about two other classes: SubscriptExpr and LoopInfo. A SubscriptExpr represents one subscript expression for a single-dimensional array. A SubscriptExpr object, defined in SubscriptExpr.h, contains the name of the array, the number of the statement to which the subscript expression belongs, and a list of coefficients. You will not need to use the coefficients directly yourself; they are used by the dependency checking functions you are given. What you WILL need to know is which statement the reference is in, and the name of the array it refers to. You can get these from the public members "stmtIndex" and "arrayName".

To help your debugging, you can print a SubscriptExpr to the standard output using

    llvm::outs() << expr;

Don't forget the parentheses in llvm::outs(). You can't use the standard C++ I/O streams for LLVM objects, so it's best to stick to llvm::outs() for printing things out.

Integers in the object language are represented by the datatype llvm::APSInt. You can operate on these as normal integers and print them out using llvm::outs(), but their type is not compatible with C++ (metalanguage) ints.

After the LoopAnalyzer traverses the source, its member LoopInfo object will contain all the information you need to do the remainder of the project. The LoopInfo class stores information about a FOR loop structure, including its index variable and bounds, plus two vectors of SubscriptExpr's: "reads" and "writes". You should examine the header file "LoopInfo.h" to understand what data and methods are available to you.

The LoopInfo object also contains the methods "insertStatementComment" for inserting comments before each of the statements in the FOR loop, and "replaceStmt" which can be used to reorder the statements inside the loop. These are the methods you will use to generate the main output of the project. Because of the way the code rewriting architecture works, you MUST reorder the statements BEFORE inserting comments.
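
Because reordering has to happen before comments are inserted, it can help to compute the complete output plan first and only then call the rewriting methods. The sketch below shows one way to do that. It does not call "replaceStmt" or "insertStatementComment", and the names "EmitStep" and "planEmission" are hypothetical. It assumes that the SCCs are already in topological order, that dependence edges are available as (source, sink) statement pairs, and that statements inside a cyclic SCC keep their original textual order.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// One entry of the output plan (hypothetical type).
struct EmitStep {
    int stmt;        // original statement number
    bool vectorize;  // true if the statement gets the vector comment
};

// sccs:  components of the dependence graph in topological order
// edges: dependence edges (source stmt, sink stmt), used to detect self-loops
std::vector<EmitStep> planEmission(
        const std::vector<std::vector<int>>& sccs,
        const std::set<std::pair<int, int>>& edges) {
    std::vector<EmitStep> plan;
    for (const std::vector<int>& comp : sccs) {
        assert(!comp.empty());
        // A statement is vectorizable only if it forms an SCC by itself
        // and does not depend on itself.
        const bool single = comp.size() == 1;
        const bool selfLoop =
            single && edges.count(std::make_pair(comp[0], comp[0])) > 0;
        // Assumption: inside a cyclic SCC, keep the original textual order.
        std::vector<int> ordered(comp);
        std::sort(ordered.begin(), ordered.end());
        for (int s : ordered)
            plan.push_back({s, single && !selfLoop});
    }
    return plan;
}
```

For example 2, with SCCs [ 2 ] and [ 1 3 ] and no self-loop on statement 2, the plan is statement 2 (vector), then statements 1 and 3 (sequential). The reordering calls would be issued for the whole plan first, and the comments inserted afterwards.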

You will need to implement your graph data structures (classes) in "DepGraph.h", and algorithms in "depSolve.cpp". In "depSolve.cpp", you are given a function "dependent" which takes two subscript expressions, a lower bound, and an upper bound, and returns "false" if no dependence is found and "true" if independence cannot be proven. You will make use of this function in your code to construct the dependency graph. A skeleton for the function "generateDependencyGraph" is also provided, showing how to loop through the array reads and writes.
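
To make the meaning of a dependence edge concrete, here is a brute-force reference sketch that derives edges, including their direction, by enumerating all pairs of iterations. It is not the provided skeleton and does not use the provided "dependent" function; "Access", "EdgeSet", "before", and "buildEdges" are hypothetical names. It assumes subscripts of the form coef*i + offset with coef equal to 0 or 1, and it assumes that an anti-dependence of a statement on itself can be ignored, which is consistent with example 2, where S2 reads b[i+1] and is still vectorized. Enumeration is only practical because the loop bounds are small constants; it may be useful for cross-checking your own graph.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one array access; NOT the provided SubscriptExpr.
struct Access {
    int stmt;           // statement number within the loop body
    std::string array;  // array name
    int coef;           // 1 for i, i+c, i-c; 0 for a constant index
    int offset;         // the constant c (negative for i-c)
    bool isWrite;       // true for a left-hand side
};

using EdgeSet = std::set<std::pair<int, int>>;  // (source stmt, sink stmt)

// True if access x in iteration ix executes before access y in iteration iy.
// Within one statement instance, the right-hand side is read before the write.
static bool before(const Access& x, int ix, const Access& y, int iy) {
    if (ix != iy) return ix < iy;
    if (x.stmt != y.stmt) return x.stmt < y.stmt;
    return !x.isWrite && y.isWrite;
}

// Brute-force edge construction over iterations lo..hi (inclusive).
EdgeSet buildEdges(const std::vector<Access>& accs, int lo, int hi) {
    assert(lo <= hi);
    EdgeSet edges;
    for (const Access& x : accs) {
        for (const Access& y : accs) {
            if (x.array != y.array) continue;
            if (!x.isWrite && !y.isWrite) continue;  // read-read: no dependence
            // Assumption: a statement's anti-dependence on itself is ignored,
            // since a vector statement reads all operands before storing.
            if (x.stmt == y.stmt && !x.isWrite && y.isWrite) continue;
            for (int ix = lo; ix <= hi; ++ix)
                for (int iy = lo; iy <= hi; ++iy)
                    if (x.coef * ix + x.offset == y.coef * iy + y.offset &&
                        before(x, ix, y, iy))
                        edges.insert({x.stmt, y.stmt});
        }
    }
    return edges;
}
```

Applied to example 2 with bounds 1 and 98, this produces the edges S1 -> S3, S2 -> S1, S2 -> S3, and S3 -> S1, so S2 forms an SCC by itself while S1 and S3 form a cycle.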

THIS IS NOT A GROUP PROJECT! Every student is expected to work on his/her own project. You may discuss overall design issues with your fellow students. Detailed discussions and/or code sharing is not allowed. The general rules for academic integrity apply.


Last updated by Ulrich Kremer at 4:15pm on December 3, 2011