We move on to the final part of the project, namely the semantic analyzer for C-. As usual, the directions below are meant to help you.
traceScan
and traceParse set to FALSE, but
echoSource and traceAnalyze set to
TRUE.
Happy programming!
There are three new fields necessary in the syntax tree:
int nestLevel;
TreeNode * defptr;
CMType datatype;
We describe the use of each of these three fields.
The nestLevel field is needed to separate nested
scopes. In the semantic analyzer, this will be used to check whether
a redeclaration of an identifier occurs within the same scope, since
this is a semantic error:
void f(int x)
{ int y;
int y[10]; /* error! */
}
Redeclarations across scopes do not represent errors:
void f(int x)
{ int y;
{ int y[10]; /* no error here */
}
}
You can check this by recording the nestLevel of each
declaration. In the first example above f is at
nestLevel 0, x is at nestLevel
1 and y is at nestLevel 2. In the
second example, the second y is at nestLevel 3. This
double increment of nestLevel is due to a quirk of C
that wants parameters to be viewed as being at a different
nestLevel than the function name and the local
variables. For example, the following code is not an error in
C:
void f(int x)
{ int x; /* no error here */
}
Thus, you should increment nestLevel twice when
entering a function: once before processing the parameters, and once
before processing the body block. Then you can use
nestLevel alone to determine a redeclaration error.The
best way to record the nestLevel is to use a static
external variable within the analyze.c file, initialized
to 0, and incremented/decremented accordingly.
This field is used for any tree node that stores an identifier name. It points to the definition/declaration of the function/parameter/variable that it refers to. This allows type checking to occur after the symbol table has been disposed of.
This field records the type of the node during type checking. In C-Minus there are essentially five types to be checked, and we can define these in an enum declaration:
Typedef enum {IntType, ArrayType, IntFuncType, VoidFuncType,
VoidType } CMType;
All DefNodes and ExpNodes in the syntax
tree will have types. No StmtNode will have type. Type
errors and call nodes to functions of type VoidFuncType
will both have VoidType.
I recommend that you use the same basic structure as with the TINY
semantic analyzer, namely, that you use the traverse
function for your basic tree traversal, and that you build the symbol
table in the first pass, and perform type checking in the second
pass. Thus, the nestLevel and defptr fields
will get filled in during the symbol table pass, while the
datatype fields will get filled in during the type
checking pass.
The preorder and postorder functions will be somewhat more
complicated in the C-Minus case, and you may want to write separate
procedures for defnodes, stmtnodes, and expnodes, which can be called
by these functions. Though the symbol table pass is essentially a
preorder traversal, there are a couple of operations that need
to be performed in postorder: calling exitScope() and
decrementing nestLevel when exiting a block or function
definition. Thus, postProc, though simple, is not the
nullProc.
I propose keeping the same basic structure as in TINY, except of
course that provision for nested scopes must be added. I suggest you
add pointers to separate tables to the internal structure of the
table, as shown in Figure 6.17 on page 305 of the text. You create a
new table for a new scope using a new symbol table primitive
enterScope(), and you delete the new table using a new
primitive exitScope(). There should be no other new
function declarations necessary in symtab.h. Also, the
printTable() primitive should be removed (see the
comments on tracing information below). With this design, the symbol
table can remain hidden from the rest of the compiler, which is a
good feature.
One essential question: what should the symbol table store? The
most appropriate and versatile way is to store the TreeNode *
value of the nodes containing the names in the table. These
node will contain all the important information, like the
nestLevel, the lineno, and even the name
string itself.
There is one additional problem, namely the existence of the
predefined procedures input() and output()
in C-Minus. The way to handle these is to build syntax tree nodes for
them by hand, and insert them into the symbol table as part of the
symbol table initialization step.
IntType, even in
the presence of errors. This will limit the number of error
messages printed on a single type error.
IntType or ArrayType, and
ArrayType can only be a single identifier matching an
array parameter. The result type of a function call is either
IntType or VoidType, depending on
whether the function is IntFuncType or
VoidFuncType.
IntFuncType. But how can we
tell when we reach a Return node what function we are in? The best
way to solve this problem is, like nestLevel, to have
a static external TreeNode * CurrentFunc pointer that
can be consulted at every return node. This pointer should of
course point at the definition node of the current function.
There is one place where postorder constructions of types does not
quite work: in recursive functions. Before entering a function body,
the CMType of the function must be filled in, in case it
is called within its body. This can be accomplished by using a very
simple preorder function (but not nullProc) as a
parameter to the traverse function.
There is one other preorder operation that needs to be performed
during the type checking pass: assigning the currentFunc
pointer.
This is a topic we did not discuss in class. What should our
semantic analyzer print if traceAnalyze is set to
TRUE? The TINY compiler just calls
printTable(), which we have said should not exist for
C-Minus. Here is my recommendation: