I am writing a toy compiler, which compile a c/c like language to c . I am using bison, but in this structure is hard to handle when variable became out of scope.
In the source lanugage in the whole main function there can be only one variable with the same name, it is good, but there is a problem what I cannot solve.
I cannot make c code like this, because c compiler throw semantical error: 'var' was not declared in this scope.
int main()
{
if (true)
{
int var = 4;
}
if (true)
{
var = 5;
}
}
The source language has while, if and if/else statements. I have to throw semantical error if a declared variable is assinged in out of scope. For example:
This should be semantical error, so I cannot generetad this code:
int main()
{
while(0)
{
int var = 1;
}
if (1)
{
var = 2;
}
}
This also have to be semantical error:
int main()
{
if (0)
{
int var = 1;
}
else
{
if (1)
{
var = 5;
}
}
}
And this is allowed, I can generate this code:
int main()
{
if (0)
{
}
else
{
int var = 1;
if (1)
{
while (0)
{
while (0)
{
var = 2;
}
}
}
}
}
I tried lot of things, but I cannot solve when there is nested if, if/else or while.
I read tons of tutorials about symbol table, but none of them can explain properly how to manage a variable if it is become out of scope.
If you familiar with this topic and with bison, please do not just give me hints, like "use stack, and mark a variable if it become out of scope". I found lot of article about it. Instead of please give me pseudocode or concrate implementation sketch.
I think it cannot be so much difficult, because in the whole main function there can be one variable with the same name as I wrote.
Symbol table:
struct SymbolType
{
int lineNumber;
std::string identifier;
int identifierValue;
Type type;
int functionArgumentNumber;
Type functionReturnType;
Type markType;
bool outOfScope;
};
class Symbol
{
public:
void AddVariable(int _lineNumber, std::string _identifier, int _identifierValue, Type _type, int _functionArgumentNumber, Type _functionReturnType, Type _markType, bool _outOfScope);
void AddMarker(int _lineNumber, std::string _scopeName, Type _markType);
bool FindVariable(std::string _identifier);
int FindVariableValue(std::string _identifier);
void Remove();
void Print();
std::vector<SymbolType> symbolTable;
private:
int lineNumber;
std::string identifier;
int identifierValue;
Type type;
int functionArgumentNumber;
Type functionReturnType;
Type markType;
bool outOfScope;
};
CodePudding user response:
Now let's assume the following: While you are in a nested scope you cannot add a variable to a parent scope. So we can work e.g. with a stack like structure (push/pop at the end only suffices, but with read access to all entries – the latter requirement disqualifying std::stack
, so we'd operate e.g. on std::vector
instead).
- Encountering the declaration of a new variable:
Run up the entire stack to see if that variable exists already. If so, issue an error ('duplicate declaration/definition'). - Encountering accessing a variable: Run up the entire stack to see if that variable exists; if not, issue an error ('not declared/defined' – I wouldn't differentiate between the variable not having been defined ever or having left the scope).
- On leaving a scope, run up the stack and remove any variable that resides in that scope.
To be able to do 3. you have (at least) two options:
- With every stack entry provide an identifier for the respective scope, could be simple counter. Then delete all those variables that have the same counter value. If you fear the counter might overflow, then reduce it by 1 as well (then it would always represent current scope depths).
- Have a sentinel type – that one would be pushed to the stack on opening a new scope, compare unequal to any variable name, and on leaving a scope you'd delete all variables until you encounter the sentinel – and the sentinel itself.
This processing makes your outOfScope
member obsolete.
Your addVariable
function takes too many parameters, by the way – why should a variable need a return type, for instance???
I recommend adding multiple functions for every specific semantic type your language might provide (addVariable
, addFunction
, ...). Each function accepts what actually is necessary to be configurable and sets the rest to appropriate defaults (e.g. Type
to Function
within addFunction
).
Edit – processing the example from your comment:
while(condition)
{ // opens a new scope:
// either place a sentinel or increment the counter (scope depth)
int var = 1; // push an entry onto the stack
// (with current counter value, if not using sentinels)
if (true)
{ // again a new scope, see above
var = 2; // finds 'var' on the stack, so fine
} // closes a scope:
// either you find immediately the sentinel, pop it from the stack
// and are done
//
// or you discover 'var' having a counter value less than current
// counter so you are done as well
} // closing next scope:
// either you find 'var', pop it from the stack, then the sentinel,
// pop it as well and are done
//
// or you discover 'var' having same counter value as current one,
// so pop it, then next variable has lower counter value again or the
// stack is empty, thus you decrement the counter and are done again