Building a basic symbol table in C

Time:02-23

I am building a symbol table program in C. It needs to stay as simple as possible while still providing the required functionality, because I am expected to produce a working compiler by the end of the semester. I have a working implementation that creates symbol table entries from user input, but it is not yet where it needs to be, and I need some guidance based on the feedback my professor gave me. I am new to C, and I am also trying to learn Python and R at the same time, so I'm a little overwhelmed. From the feedback, I know that I need separate initialize and print functions, that there should be no input or output in the create function, and that every entry should have a scope of 0. Where I'm stuck is writing the initialize and print functions without losing the functionality I already have. Any help is appreciated. Here is my current implementation:

#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct ADT {
    char name[18]; // lexeme name
    char usage; 
    char type; // I is integer, S is type string, I for identifier
    int scope; // scope of where it was declared, inserted for later use 
    int reference;   
};
typedef struct ADT new_type;
new_type table[200];

int i = 0;

int read(char *name, char usage, char type, char scope) {  //Read function to read input and check for duplicates
    for (int j = sizeof(table) / sizeof(table[0]); j >= 0; --j) {
        if (strcmp(table[j].name, name) == 0 &&
            table[j].usage == usage &&
            table[j].type == type &&
            table[j].scope == scope)
            return 1; // found
    }
    return -1; // not found! that's good
}

int create( char *name, char usage, char type, char scope) {  //Create function to insert new input into symbol table

    strcpy(table[i].name, name);
    table[i].usage = usage;
    table[i].type = type;
    table[i].scope = scope;
    if (table[i].usage == 'I' && table[i].type == 'L')
        table[i].reference = atoi(name);
    else
        table[i].reference = -1;

    return i++;
}

int initialize(char *name, char usage, char type, char scope) { // Function to initialize the symbol table and clear it. also creates the fred lexeme
    

create("Fred", 'I', 'I', '0');


}

int print(char *name, char usage, char type, char scope) { // Print function to print the symbol table 

printf("Nate's Symbol Table\n");
    printf("#\t\tName\tScope\tType\tUsage\tReference\n");

    for (int j = 0; j < i; j++) {
        if (table[j].name == NULL)
            break;

        printf("%*d\t\t%*s\t%*d\t%*c\t%*c\t%*d\n", j, table[j].name, table[j].scope, table[j].type, table[j].usage, table[j].reference);
    }


}

int main() { // Main function to take input and produce the symbol table lexemes 
    printf("Course: CSCI 490 Name: Nathaniel Bennett NN: 02 Assignment: A03\n");
    printf("\n");
    create("Fred", 'I', 'I', 0);

     for (int j = 0; j < i; j++) {
        if (table[j].name == NULL)
            break;
         printf("#\t\tName\tScope\tType\tUsage\tReference\n");
         printf("%*d\t\t%*s\t%*d\t%*c\t%*c\t%*d\n", j, table[j].name, table[j].scope, table[j].type, table[j].usage, table[j].reference);
          }

    // keep asking for a lexeme until we type STOP or stop
    while (1) {
        char lexeme[256];
        char nUsage;
        char nType;
        char nScope;

        printf("Enter a lexeme: \n"); //enter lexeme name
        scanf("%s", lexeme);

        if (strcmp(lexeme, "stop") == 0) break;
        
        printf("Enter its usage: \n");
        scanf(" %c", &nUsage);

        printf("Enter its type: \n");
        scanf(" %c", &nType);

        printf("Enter its scope: \n");
        scanf(" %c", &nScope);

        printf("%s, %c, %c, %c\n", lexeme, nUsage, nType, nScope);
        create(lexeme, nUsage, nType, nScope);

          for (int j = 0; j < i; j++) {
        if (table[j].name == NULL)
            break;
         
         printf("%*d\t\t%*s\t%*d\t%*c\t%*c\t%*d\n", j, table[j].name, table[j].scope, table[j].type, table[j].usage, table[j].reference);
          }
    }

    printf("Nate's Symbol Table\n");
    printf("#\t\tName\tScope\tType\tUsage\tReference\n");

    for (int j = 0; j < i; j++) {
        if (table[j].name == NULL)
            break;

        printf("%*d\t\t%*s\t%*d\t%*c\t%*c\t%*d\n", j, table[j].name, table[j].scope, table[j].type, table[j].usage, table[j].reference);
    }


    return 0;
}

CodePudding user response:

...I think we're normally reluctant to get up in people's course assignments, but you seem like you have thought about this for a while Nate.

I can't quite make out what your instructor is suggesting. I do not see I/O in your code for the create() function. Unless the call to strcpy() is considered I/O in their view.

I do see some room for improvement in your print() function though. Your function relies upon a global entity (table) and then it ties your loop both to an imaginary value (what is "i" in your loop initialization?) AND to a condition where your logic asks effectively, "did I run out of table?"

Choose one condition or the other. There is a semantic elegance in simply printing everything you find in the table. You can make the function better if you pass a reference to the table rather than code to the existence of a static global value. So instead of passing all those values to your print() function, how about just one argument? Pass a reference to table, and your function could then be used for other similar dump operations. It becomes more generalized, and that's a good thing.
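As a rough sketch of what that could look like: the function below receives the table and its entry count as arguments instead of reading globals. The names `struct entry` and `table_print` are made up for illustration, and the `FILE *out` parameter is an extra assumption of mine so the same routine can write to a file or to stdout.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical entry layout mirroring the question's struct ADT. */
struct entry {
    char name[18];
    char usage;
    char type;
    int scope;
    int reference;
};

/* Print `count` entries from `tab` to `out` and return the number of rows
   printed. No globals are touched, so any table can be dumped with it. */
int table_print(FILE *out, const struct entry *tab, int count)
{
    assert(out != NULL);
    fprintf(out, "#\tName\tScope\tType\tUsage\tReference\n");
    for (int j = 0; j < count; j++) {
        fprintf(out, "%d\t%s\t%d\t%c\t%c\t%d\n",
                j, tab[j].name, tab[j].scope, tab[j].type,
                tab[j].usage, tab[j].reference);
    }
    return count;
}
```

With the question's globals, the call would be something like table_print(stdout, table, i), and the loop needs only the single `j < count` condition.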

I would also say this. I prefer using sprintf() to stage my output in a string and then when everything is ready, I output it all at one time. This is easier to inspect and debug.
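Something along these lines, as a sketch only. I have swapped in snprintf() for sprintf() here because it takes the buffer size and cannot overrun it; `format_row` is a hypothetical helper name, not something from your code.

```c
#include <assert.h>
#include <stdio.h>

/* Stage one table row in `buf` (at most `size` bytes, NUL-terminated when
   size > 0). Returns what snprintf returns: the number of characters the
   full row needs, which may exceed size - 1 if the row was truncated. */
int format_row(char *buf, size_t size, int index, const char *name,
               int scope, char type, char usage, int reference)
{
    assert(buf != NULL && name != NULL);
    return snprintf(buf, size, "%d\t%s\t%d\t%c\t%c\t%d",
                    index, name, scope, type, usage, reference);
}
```

The staged string can be inspected in a debugger or compared in a test before it is ever written out, for example with puts(buf).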

Also, not related to your assignment I imagine, but be extra-vigilant every time you use scanf() -- it was often my number one suspect whenever I had a bad pointer.

Definitely try to isolate or eliminate calls to chaotic functions like that one.
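One common way to isolate it, sketched under my own assumptions: read a whole line with fgets() and parse it with sscanf() using a bounded `%17s`. The helper names and the single-line "name usage type scope" input format are hypothetical; your program prompts for each field separately.

```c
#include <assert.h>
#include <stdio.h>

#define SYM_NAME_LEN 17   /* assumption: matches char name[18] in the question */

/* Parse "name usage type scope" from one line of text.
   Returns 1 when all four fields were read, 0 otherwise. */
int parse_entry(const char *line, char name[SYM_NAME_LEN + 1],
                char *usage, char *type, int *scope)
{
    assert(line && name && usage && type && scope);
    /* %17s stores at most 17 characters plus the terminating NUL. */
    return sscanf(line, "%17s %c %c %d", name, usage, type, scope) == 4;
}

/* Read one line from `in` and parse it.
   Returns 1 on success, 0 on malformed input, -1 on end of input. */
int read_entry(FILE *in, char name[SYM_NAME_LEN + 1],
               char *usage, char *type, int *scope)
{
    char line[256];
    if (fgets(line, sizeof line, in) == NULL)
        return -1;
    return parse_entry(line, name, usage, type, scope);
}
```

Reading the scope with `%d` into an int also avoids storing the character '0' (value 48) where the number 0 was meant.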

Keep thinking about how to make your function stronger, keep refactoring. You'll do great!

CodePudding user response:

There are a number of issues. This won't even compile:

  1. read conflicts with the syscall (i.e. rename it)
  2. read has UB (undefined behavior) because it starts the for loop at one beyond the end of the table array
  3. The symbol printing code is replicated everywhere. Better to define a table printing function (e.g. tblprint) and a symbol printing function (e.g. symprint).
  4. The format used to print a symbol uses variable-width format specifiers incorrectly: e.g. %*s expects two arguments (int width, char *str). With -Wall as a compile option, these statements are flagged.
  5. AFAICT, ordinary format specifiers work fine.
  6. The if (sym->name == NULL) test can never be true because name is a fixed-length array inside the struct. We need to use a char * for that test to make sense.
  7. Using i as a global for the count of the array is misleading. Try something more descriptive (e.g.) tabcount
  8. Using table[i].whatever everywhere is cumbersome. Try using a pointer (e.g. sym->whatever)
  9. initialize [and some others] need a return with a value.
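To illustrate point 4, here is a small sketch of how the `*` in a format specifier is meant to be used. `format_padded` is a hypothetical helper, not part of the program above.

```c
#include <assert.h>
#include <stdio.h>

/* Format `value` right-aligned in a field of `width` columns, in brackets.
   The '*' consumes one int argument (the width) before the value itself,
   so "%*d" needs two arguments where "%d" needs one. */
int format_padded(char *buf, size_t size, int width, int value)
{
    assert(buf != NULL);
    return snprintf(buf, size, "[%*d]", width, value);
}
```

Calling it with a width of 5 and a value of 42 pads the number with three leading spaces. In the original code only one argument was supplied per `%*` specifier, so every later argument was read in the wrong position.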

I've used cpp conditionals to denote old code vs new code:

#if 0
// old code
#else
// new code
#endif

Here is the refactored code. It is annotated. It compiles cleanly and passes a rudimentary test:

#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct ADT {
// NOTE/BUG: the if (sym->name == NULL) will fail
#if 0
    char name[18];                      // lexeme name
#else
    const char *name;                   // lexeme name
#endif
    char usage;

    // I is integer, S is type string, I for identifier
    char type;

    // scope of where it was declared, inserted for later use
    int scope;
    int reference;
};

#if 0
typedef struct ADT new_type;
new_type table[200];
#else
typedef struct ADT ADT;
ADT table[200];
#endif

int tabcount = 0;

// NOTE/BUG: "read" conflicts with a syscall name
#if 0
//Read function to read input and check for duplicates
int
read(char *name, char usage, char type, char scope)
#else
// find_entry -- find a matching entry (if it exists)
int
find_entry(char *name, char usage, char type, char scope)
#endif
{

// NOTE/BUG: this is UB (undefined behavior) because you're starting at one
// past the end of the array
#if 0
    for (int j = sizeof(table) / sizeof(table[0]); j >= 0; --j) {
#else
    for (int j = tabcount - 1; j >= 0; --j) {
#endif
        ADT *sym = &table[j];
        if (strcmp(sym->name, name) == 0 &&
            sym->usage == usage &&
            sym->type == type &&
            sym->scope == scope)
            return 1;
    }

    // not found! that's good
    return -1;
}

//Create function to insert new input into symbol table
int
create(char *name, char usage, char type, char scope)
{
    ADT *sym = &table[tabcount];

// NOTE/BUG: this needs to be a pointer to a string to allow long strings and
// for "if (sym->name == NULL)" to be valid
#if 0
    strcpy(sym->name, name);
#else
    sym->name = strdup(name);
#endif
    sym->usage = usage;
    sym->type = type;
    sym->scope = scope;
    if (sym->usage == 'I' && sym->type == 'L')
        sym->reference = atoi(name);
    else
        sym->reference = -1;

    return tabcount++;
}

// Function to initialize the symbol table and clear it. also creates the fred
// lexeme
int
initialize(char *name, char usage, char type, char scope)
{

    create("Fred", 'I', 'I', '0');

    return 0;
}

void
symprint(ADT *sym)
{
    int j = sym - table;

// NOTE/BUG: with (e.g) %*d this is a variable width field -- it requires
// _two_ arguments: <int wid>,<int val>
#if 0
    printf("%*d\t\t%*s\t%*d\t%*c\t%*c\t%*d\n",
        j, sym->name, sym->scope, sym->type,
        sym->usage, sym->reference);
#else
    printf("%d\t\t%s\t%d\t%c\t%c\t%d\n",
        j, sym->name, sym->scope, sym->type,
        sym->usage, sym->reference);
#endif
}

void
tblprint(int title)
{

    if (title)
        printf("#\t\tName\tScope\tType\tUsage\tReference\n");

    for (int j = 0; j < tabcount; j++) {
        ADT *sym = &table[j];
        if (sym->name == NULL)
            break;
        symprint(sym);
    }
}

// Print function to print the symbol table
int
print(char *name, char usage, char type, char scope)
{

    printf("Nate's Symbol Table\n");
    tblprint(1);

    return 0;
}

// Main function to take input and produce the symbol table lexemes
int
main()
{
    printf("Course: CSCI 490 Name: Nathaniel Bennett NN: 02 Assignment: A03\n");
    printf("\n");
    create("Fred", 'I', 'I', 0);

    tblprint(1);

    // keep asking for a lexeme until we type STOP or stop

    while (1) {
        char lexeme[256];
        char nUsage;
        char nType;
        char nScope;

        // enter lexeme name
        printf("Enter a lexeme: \n");
        scanf("%s", lexeme);

        if (strcmp(lexeme, "stop") == 0)
            break;

        printf("Enter its usage: \n");
        scanf(" %c", &nUsage);

        printf("Enter its type: \n");
        scanf(" %c", &nType);

        printf("Enter its scope: \n");
        scanf(" %c", &nScope);

        printf("%s, %c, %c, %c\n", lexeme, nUsage, nType, nScope);
        create(lexeme, nUsage, nType, nScope);

        tblprint(0);
    }

    printf("Nate's Symbol Table\n");
    tblprint(1);

    return 0;
}
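
For the three points in the professor's feedback (separate initialize and print functions, no I/O in create, every entry at scope 0), one possible shape is sketched below. This is my reading of the feedback, not a confirmed solution: the `symtab_*` names and the `struct symtab` wrapper are invented for the example, and it goes back to a fixed-size name array with a length check instead of strdup().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SYM_MAX 200
#define SYM_NAME_LEN 17

struct symbol {
    char name[SYM_NAME_LEN + 1];
    char usage;
    char type;
    int scope;
    int reference;
};

struct symtab {
    struct symbol entries[SYM_MAX];
    int count;                    /* number of entries in use */
};

/* Insert a symbol without doing any I/O. Returns the new index,
   or -1 if the table is full or the name is too long. */
int symtab_create(struct symtab *t, const char *name, char usage, char type)
{
    assert(t != NULL && name != NULL);
    if (t->count >= SYM_MAX || strlen(name) > SYM_NAME_LEN)
        return -1;
    struct symbol *sym = &t->entries[t->count];
    strcpy(sym->name, name);      /* safe: length checked above */
    sym->usage = usage;
    sym->type = type;
    sym->scope = 0;               /* assumption: every entry has scope 0 */
    sym->reference = (usage == 'I' && type == 'L') ? atoi(name) : -1;
    return t->count++;
}

/* Clear the table, then seed it with the "Fred" entry. */
void symtab_initialize(struct symtab *t)
{
    assert(t != NULL);
    memset(t, 0, sizeof *t);
    symtab_create(t, "Fred", 'I', 'I');
}

/* Print every used entry; this is the only function that does output. */
void symtab_print(const struct symtab *t, FILE *out)
{
    assert(t != NULL && out != NULL);
    fprintf(out, "#\tName\tScope\tType\tUsage\tReference\n");
    for (int j = 0; j < t->count; j++) {
        const struct symbol *sym = &t->entries[j];
        fprintf(out, "%d\t%s\t%d\t%c\t%c\t%d\n", j, sym->name, sym->scope,
                sym->type, sym->usage, sym->reference);
    }
}
```

In main, the input loop would stay as it is, with symtab_initialize(&tab) called once before the loop, symtab_create(&tab, lexeme, nUsage, nType) after each set of prompts, and symtab_print(&tab, stdout) wherever the table should be shown.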