Why does Java Compiler flip relational comparisons between primitive types? `i < j` => `i >

Time:11-29

I'm implementing a deferred refinement operation for my numerical static analysis framework, specifically for ints and longs. For reference, int (and smaller) operands are compared directly by the branch instruction, with no separate cmp operation. However, I have noticed that the bytecode generated for binary comparisons (at least by Java 11 targeting 1.8) flips the comparison operator. For example, the source code may have if (i < j)..., which is compiled to the bytecode comparison if_icmpge. For long types the story is similar, except that the comparison is punted to lcmp and the result is then tested with ifge.
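To make the long case concrete, here is a minimal sketch that models the two-step sequence in plain Java. The class and method names (LongCompareSketch, lcmp, ifgeTaken, jumpTaken) are made up for illustration; the only assumption about the JVM is the documented behaviour that lcmp pushes -1, 0, or 1 and that ifge jumps when the int on the stack is >= 0.

```java
public class LongCompareSketch {
    // Models `lcmp`: yields -1, 0, or 1 depending on how a compares to b.
    static int lcmp(long a, long b) {
        return a < b ? -1 : (a == b ? 0 : 1);
    }

    // Models `ifge`: the jump is taken when the int value is >= 0.
    static boolean ifgeTaken(int value) {
        return value >= 0;
    }

    // Models the pair `lcmp; ifge` as emitted for a source-level `if (i < j)`.
    static boolean jumpTaken(long i, long j) {
        return ifgeTaken(lcmp(i, j));
    }

    public static void main(String[] args) {
        long[] samples = {Long.MIN_VALUE, -1L, 0L, 1L, Long.MAX_VALUE};
        boolean allMatch = true;
        for (long i : samples) {
            for (long j : samples) {
                // The jump should be taken exactly when i >= j, i.e. when i < j is false.
                allMatch &= (jumpTaken(i, j) == (i >= j));
            }
        }
        System.out.println("lcmp + ifge behaves like a single >= test: " + allMatch);
    }
}
```

Under that model, an analysis can presumably treat the lcmp/ifge pair as one >= comparison on longs, just as if_icmpge is for ints.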

I suspect this is a cheap optimization but I would like to confirm. For example, by negating the condition, the false branch is now under a conditional jump instead of the true branch.

But the main question is: can I rely on this behavior of the compiler? If I'm given a conditional jump instruction with >= as the operator, can I always refine my abstract domain under the premise that the true branch is <?

Short aside: I'll be happy if someone can answer the behavior question. In writing this, I'm not sure it matters for me.

Edit

I seem to do this every time I'm in this section of the code base I'm working on.

The refinement must attach to the actual condition and not flip. Since, for example, the if_icmpge from the accepted answer is associated with a "branch out" block, the refinement should associate with the "branch out" values (i.e., no flip). Conversely, the "fall through" refinement is computed by "rotating" the comparison back to its original source operator.
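As a rough illustration of that rule, the sketch below refines a simple interval on both edges of a conditional jump. Everything in it (BranchRefinement, Op, Interval, refine) is a hypothetical name invented for this example, not part of any real framework, and it only handles comparison against a constant while ignoring overflow at the extremes of the long range.

```java
public class BranchRefinement {
    // Comparison operators as they appear on the jump instruction.
    enum Op {
        LT, GE, GT, LE;

        // The condition that holds when the jump is NOT taken.
        Op negate() {
            switch (this) {
                case LT: return GE;
                case GE: return LT;
                case GT: return LE;
                default: return GT; // LE
            }
        }
    }

    // Closed integer interval [lo, hi]; empty when lo > hi.
    static final class Interval {
        final long lo;
        final long hi;

        Interval(long lo, long hi) {
            this.lo = lo;
            this.hi = hi;
        }

        boolean isEmpty() {
            return lo > hi;
        }

        @Override
        public String toString() {
            return isEmpty() ? "empty" : "[" + lo + ", " + hi + "]";
        }
    }

    // Refines x under the assumption "x op c" (overflow of c - 1 / c + 1 is ignored here).
    static Interval refine(Interval x, Op op, long c) {
        switch (op) {
            case LT: return new Interval(x.lo, Math.min(x.hi, c - 1));
            case LE: return new Interval(x.lo, Math.min(x.hi, c));
            case GT: return new Interval(Math.max(x.lo, c + 1), x.hi);
            default: return new Interval(Math.max(x.lo, c), x.hi); // GE
        }
    }

    public static void main(String[] args) {
        Interval length = new Interval(0, 10);
        Op jumpOp = Op.GE; // the operator on if_icmpge, compared against 3

        // Branch-out edge: use the jump's own operator, no flip.
        System.out.println("branch out:   " + refine(length, jumpOp, 3));          // [3, 10]
        // Fall-through edge: use the negation, which is the source-level operator.
        System.out.println("fall through: " + refine(length, jumpOp.negate(), 3)); // [0, 2]
    }
}
```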

Edit 2

I'm developing a numerical abstract interpretation framework, currently using intervals, zones, and a predicate-based abstract domain. As such, what happens at runtime does not particularly concern me.

(There's a lot to this and I'm undoubtedly leaving out details. I don't want the question to evolve into an introduction to data-flow static analysis.)

With respect to the proper interpretation of the flipped operators, my confusion was on the particular branch that was being flipped. Both the true and false branch of a condition are explored per this analysis. However, as noted and answered, the false branch is generated behind a conditional jump and the true branch simply falls through (the failure of the conditional jump). My question was predicated on missing the jump.

CodePudding user response:

Consider the following Java code:

class Test
{
    public static void main(String[] args)
    {
        System.out.println("a");
        if(args.length < 3)
        {
            System.out.println("b");
        }
        System.out.println("c");
    }
}

My Java compiler (javac 17.0.2) compiles this to:

public static void main(java.lang.String[]);
  Code:
     0: getstatic     #7                  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc           #13                 // String a
     5: invokevirtual #15                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
     8: aload_0
     9: arraylength
    10: iconst_3
    11: if_icmpge     22
    14: getstatic     #7                  // Field java/lang/System.out:Ljava/io/PrintStream;
    17: ldc           #21                 // String b
    19: invokevirtual #15                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    22: getstatic     #7                  // Field java/lang/System.out:Ljava/io/PrintStream;
    25: ldc           #23                 // String c
    27: invokevirtual #15                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    30: return

What you will notice here is that there is only a single branch, and it's the if_icmpge. That is, the control flow graph looks like this:

[Figure: control flow graph of the basic blocks]

If this were an if_icmplt, then there would either have to be an unconditional branch after it, or the block that it jumps to would require an unconditional branch to jump back. This way, the conditional block can just "fall through" back to the shared execution path.

Whether this is a matter of optimisation or simplicity or both can be debated, but doing it the other way round would make the bytecode larger and more complex while likely not offering any advantage.

As to whether you can consider if_icmpge to be truly a <: if it simply skips a block that itself falls back on the shared execution path, then yes. But in the general case it depends on the control flow graph. If you have both an if and an else, either way round would be valid. If you have a chained if/else if/else, then it'd be nicer to decode it that way, but it'd still be valid to decode it as if { if/else } / else.
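If you do decode branches this way, one piece you will likely need is a table pairing each integer conditional-branch mnemonic with its negation, so that the condition on the fall-through edge can be recovered from the jump instruction. A minimal sketch follows; the mnemonics are the standard JVM ones, but the class and method names (BranchNegation, negate) are hypothetical, and reference comparisons such as if_acmpeq or ifnull are left out.

```java
import java.util.HashMap;
import java.util.Map;

public class BranchNegation {
    // Maps each conditional-branch mnemonic to the mnemonic testing the opposite condition.
    private static final Map<String, String> NEGATED = new HashMap<>();

    private static void pair(String a, String b) {
        NEGATED.put(a, b);
        NEGATED.put(b, a);
    }

    static {
        // Two-operand int comparisons.
        pair("if_icmplt", "if_icmpge");
        pair("if_icmpgt", "if_icmple");
        pair("if_icmpeq", "if_icmpne");
        // Comparisons against zero (also used after lcmp for longs).
        pair("iflt", "ifge");
        pair("ifgt", "ifle");
        pair("ifeq", "ifne");
    }

    // Returns the mnemonic whose condition holds exactly when the given jump is not taken.
    static String negate(String mnemonic) {
        String negated = NEGATED.get(mnemonic);
        if (negated == null) {
            throw new IllegalArgumentException("Not a known conditional branch: " + mnemonic);
        }
        return negated;
    }

    public static void main(String[] args) {
        // For the listing above: the jump is if_icmpge, so the fall-through edge satisfies if_icmplt.
        System.out.println(negate("if_icmpge")); // if_icmplt
    }
}
```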
