How to implement a nested if statement using branches and labels?

I am implementing if statements in a basic compiler. My initial approach was to use a label for each branch of the condition. For instance:

if(x==1){
1. do this
}
else{
2. do this
}

would convert to

jump to 2 if x is not equal to 1
1. do this
jump to end
2. do this
end

However, this approach does not work for nested or multiple if statements because the labels are reused, and assembly language has no notion of nesting or indentation that would scope them. How would nested or multiple ifs be handled in assembly language? Should I just generate new labels every time?

CodePudding user response：

If, in the same compilation, you compile two independent functions, each with its own if-statement, without renaming the labels, the result won't even assemble in some assemblers. The issue is broader than nesting: the mere presence of another if-statement later in the same function, or even in another function, can cause label conflicts.

The labels are an integral part of the translation of structured statements to the if-goto-label form, and each label needed by the pattern must be conjured uniquely, every time the translated pattern is applied to a structured statement, or else the patterns just don't work. Each if-goto-label pattern has some number of labels and each particular label has uses and definitions that must bind to each other (and to nothing else) to make the patterns work.

It is a bit like free variables vs. bound variables in logic statements. When combining two otherwise separate formulas or statements, variable names sometimes conflict, so one of them has to be renamed before the two are combined.

You may find a scheme in which you can rely on a local labels feature in some assemblers, but the simplest solution is to generate new label names (usually using numbering) for each structured statement translation in the compilation unit. The numbering can start (e.g. at 1) at the beginning of the translation unit, and it is usually not reset between functions.
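
As a rough illustration of the numbering idea, here is a minimal Python sketch. The `LabelGenerator` class, its `new_label` method, and the `L<n>_<hint>` naming format are all hypothetical choices made for this example; any scheme that never hands out the same name twice in one compilation unit would do.

```python
import itertools


class LabelGenerator:
    """Hands out label names that are unique within one compilation unit."""

    def __init__(self, prefix="L"):
        self._prefix = prefix
        # Numbering starts at 1 and is never reset, not even between functions.
        self._counter = itertools.count(1)

    def new_label(self, hint=""):
        # The optional hint only makes the generated assembly easier to read.
        n = next(self._counter)
        return f"{self._prefix}{n}_{hint}" if hint else f"{self._prefix}{n}"


labels = LabelGenerator()

# Two separate if statements each ask for their own pair of labels.
first_if = (labels.new_label("else"), labels.new_label("end"))
second_if = (labels.new_label("else"), labels.new_label("end"))
```

Because the counter keeps running, the second if statement cannot collide with the first, no matter where in the unit it appears.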

In pseudo code, here's what I do. There's an interface on a code gen context object, which allows conjuring and placing labels. With this, we know the binding of labels to their usages, and without changes here, various labeling schemes can be supported, whether fully unique, unique to a procedure, or perhaps local labels.

void codeGenIfStatement () {
    var else_label = context.GetNewLabel ();

    // generates code to branch to else when the condition is false
    generateBranch ( this.condition, else_label, false );

    //   and falls through to then part otherwise (condition true)
    generateCode ( this.then_part );

    var end_label = context.GetNewLabel ();
    generateBranch ( end_label ); // unconditional branch

    context.PlaceLabelHere ( else_label );
    generateCode ( this.else_part );

    context.PlaceLabelHere ( end_label );
}
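
The pseudo code above can be turned into a small runnable sketch to see how nesting falls out of the recursion. This is only an illustration under assumptions: the `Stmt` and `If` node classes, the `Context` class, and the textual "branch to ..." output format are invented for the example and do not come from any particular compiler.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Stmt:
    text: str  # an opaque statement, emitted verbatim


@dataclass
class If:
    condition: str  # kept as plain text in this sketch, e.g. "x == 1"
    then_part: List["Node"]
    else_part: List["Node"] = field(default_factory=list)


Node = Union[Stmt, If]


class Context:
    """Collects output lines and conjures fresh label names."""

    def __init__(self):
        self.lines = []
        self._next = 1

    def get_new_label(self):
        name = f"L{self._next}"
        self._next += 1
        return name

    def place_label_here(self, label):
        self.lines.append(f"{label}:")

    def emit(self, text):
        self.lines.append(f"    {text}")


def generate_code(nodes, ctx):
    for node in nodes:
        if isinstance(node, If):
            gen_if(node, ctx)
        else:
            ctx.emit(node.text)


def gen_if(node, ctx):
    else_label = ctx.get_new_label()
    # Branch to the else part when the condition is false.
    ctx.emit(f"branch to {else_label} if not ({node.condition})")
    generate_code(node.then_part, ctx)  # may recurse into nested ifs
    end_label = ctx.get_new_label()
    ctx.emit(f"branch to {end_label}")  # unconditional branch
    ctx.place_label_here(else_label)
    generate_code(node.else_part, ctx)
    ctx.place_label_here(end_label)


# if (x == 1) { if (y == 2) { do A } else { do B } } else { do C }
program = [If("x == 1",
              [If("y == 2", [Stmt("do A")], [Stmt("do B")])],
              [Stmt("do C")])]
ctx = Context()
generate_code(program, ctx)
print("\n".join(ctx.lines))
```

In this run the outer if ends up with labels L1 and L4 and the inner if with L2 and L3, so each branch binds only to the label of its own statement even though the two translations are interleaved.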

CodePudding user response：

Using NASM, you can use macro-local or context-local labels to allow nesting control structures. You can either write multi-line macros that create the labels, or just manually push a context and create context-local labels, as shown in my example. Using this method you can re-use the same label names in different functions in the same assembly unit, or even within nested conditional constructs, as long as they are nested in a way you can nest contexts in NASM. (That is, using %push, %repl, and %pop; see the NASM manual's section on the context stack.)

Example:

%push
        cmp ax, 4
        jne %$if_ax_not_4
        ; IF 1
%$if_ax_4:
 %push
        test bx, bx
        js %$if_bx_negative
        ; IF 2
 %$if_bx_not_negative:
        ; ...
        mov cx, 1
        jmp %$endif     ; IF 2 endif

        ; ELSE 2
 %$if_bx_negative:      ; (could be just %$else)
        ; ...
        mov cx, 2
 %$endif:
 %pop
        jmp %$endif     ; IF 1 endif

        ; ELSE 1
%$if_ax_not_4:          ; (could be just %$else)
        ; ...
        mov cx, 3

%$endif:
%pop
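
For a compiler that emits NASM source, the same pattern could be generated mechanically by wrapping every if/else in its own %push/%pop pair. The Python sketch below is only an assumption about how that might look: the helper `nasm_if` and its parameters are hypothetical, and the output has not been run through NASM here; it simply mirrors the structure of the hand-written example above, using the fixed names %$else and %$endif in every context.

```python
def nasm_if(test, jump_if_false, then_body, else_body):
    """Return NASM source lines for one if/else wrapped in its own context.

    then_body and else_body are lists of lines. They may contain the output
    of nested nasm_if calls, since each call is a balanced %push ... %pop.
    """
    return (
        ["%push",
         f"        {test}",
         f"        {jump_if_false} %$else"]
        + then_body
        + ["        jmp %$endif",
           "%$else:"]
        + else_body
        + ["%$endif:",
           "%pop"]
    )


# Same shape as the example above: an if on bx nested in an if on ax.
inner = nasm_if("test bx, bx", "js",
                ["        mov cx, 1"],
                ["        mov cx, 2"])
outer = nasm_if("cmp ax, 4", "jne",
                inner,
                ["        mov cx, 3"])
print("\n".join(outer))
```

Since the inner block has already been popped by the time the outer `jmp %$endif` is emitted, that jump refers to the outer context's label, which is the same reasoning the hand-written example relies on.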