Question regarding unreachable statement errors

Time:10-12

Given this snippet of code, containing a class TestClass with its method loopTest(), which is our main focus:

public class TestClass{
   public void loopTest(){
      int x = 0;
      loop: for (int i = 1; i < 5; i++){
         for (int j = 1; j < 5; j++){
            System.out.println(i);
            if (x == 0) {  continue loop;  }
            System.out.println(j);
         }
      }
   }
}

Running this code in IntelliJ IDEA produces no compilation errors and prints its output successfully. My question is: why isn't System.out.println(j); flagged as an unreachable statement?

Running the loop after removing the if (x == 0) check (but keeping the continue loop;), as seen below:

loop: for (int i = 1; i < 5; i++){
         for (int j = 1; j < 5; j++){
            System.out.println(i);
            continue loop;
            System.out.println(j);
         }
      }

(...) would return such an error, because the last sout would never be printed. Why doesn't this also happen in the first case? I'm pretty confident this isn't some runtime problem, because the compiler definitely sees that integer x is initialized at the beginning of the method, but the loop never reaches the second statement.

CodePudding user response:

The java language spec defines a concept called 'reachability'. This has nothing to do with whether a statement is actually reached or not reached, as these are all compile-time checks - the compiler decides, and given that the compiler decides, it does not just get to run the code and see.

These aren't 'best effort' nor are these reachability rules casually changed between releases. Essentially, certain definitions were laid out in java 1.0 and they have intentionally never changed.

Reachability's effects are as follows:

  1. Any uninitialized local variable cannot be used unless the compiler determines that the line that reads the variable cannot possibly be reached without necessarily having earlier reached a line that definitely sets that variable. This is called 'definite assignment'.
  2. A method need not return at all if reachability dictates it never ends. public int foo() { while (true) ; } compiles even though you have no return statement in it.
  3. Unreachable code is a compiler error.
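
As a rough illustration of these three effects, here is a minimal sketch. The class and method names (ReachabilityEffects, pick, firstMultipleOfSeven, answer) are made up for this example and do not come from the question; the variants that javac would reject are left as comments so the file still compiles.

```java
class ReachabilityEffects {

    // 1. Definite assignment: 'value' has no initializer, but every path
    //    that reaches the return statement assigns it first.
    static int pick(boolean flag) {
        int value;
        if (flag) {
            value = 1;
        } else {
            value = 2;
        }
        return value;
        // Without the else branch, javac would reject 'return value;'
        // because 'value' might not have been initialized.
    }

    // 2. No return is needed after the loop: while (true) without a break
    //    cannot complete normally, so the end of the method is unreachable.
    static int firstMultipleOfSeven(int start) {
        int n = start;
        while (true) {
            if (n % 7 == 0) {
                return n;
            }
            n++;
        }
    }

    // 3. Unreachable code is an error: uncommenting the println below
    //    makes javac report an unreachable statement.
    static int answer() {
        return 42;
        // System.out.println("never printed");
    }

    public static void main(String[] args) {
        System.out.println(pick(true));               // prints 1
        System.out.println(firstMultipleOfSeven(10)); // prints 14
        System.out.println(answer());                 // prints 42
    }
}
```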

But all of that applies only in the context of the specific rules for determining this stuff, as laid out by the java lang spec.

In particular:

  1. Only Compile-time-constant (CTC) expressions are resolved in the first place. CTCs are their own can of worms. Obviously, actual constants in a source file are CTC, but generally any reference to a final variable of a type that can be constant (Strings and primitives only), whose initializing expression is a CTC, is itself also a CTC. So, public static final long x = 1; means x is a CTC, but public static final long x = System.currentTimeMillis(); means x is not. Curiously, null is not CTC. Why? Cuz spec says so, really. Certain arithmetic is CTC if all its units are CTC. 5 + 2 is CTC, for example. So is x + 2, if x is CTC.
  2. If the expression in a while or for condition is a CTC, then the compiler takes that into account. For example:
class Test {
  public static final boolean TEST = true;

  public int  test() {
    while (TEST) ;
  }
}

compiles, even though you'd think it should not, due to test() missing a return statement. However, this doesn't:

class Test {
  public static final boolean TEST = Boolean.TRUE;

  public int  test() {
    while (TEST) ;
  }
}

Why? Because Boolean.TRUE is not CTC, and thus TEST is not CTC. The fact that Boolean.TRUE is true and the compiler can pretty easily figure that out does not matter. The spec says that this isn't a CTC expression, therefore, it isn't. How reasonable it is to expect the compiler to figure it out doesn't factor into it, at all. Compiler has to follow the spec, period. Reachability analysis isn't a 'nice to have linting thing'. It's crucial to the language itself.

  3. Specifically excluded is if (false) statement;. Even though statement; is obviously not reachable, this is nevertheless allowed, specifically to support an ersatz #ifdef-style compile-time exclusion. In fact, javac will entirely exclude the code, even from the class file, if you do this. This is no longer popular (IDEs let you select a block and hit one key to comment it all out), but it used to be, and java doesn't break backwards compatibility unless there's a good reason. In other words, the compiler is perfectly well aware that the stuff is unreachable (it has to be, because it leaves that code out of the class file!), but the spec says this is allowed.
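
The three rules above can be sketched in one small class. This is only an illustration: ConstantRules, X, Y, DEBUG and the method names are hypothetical and not part of the question or of any library.

```java
class ConstantRules {
    static final int X = 5;                   // CTC: initialized with a literal
    static final int Y = Integer.valueOf(5);  // not a CTC: initialized by a method call
    static final boolean DEBUG = false;       // CTC

    // X + 2 == 7 is built purely from CTCs, so the loop condition counts as
    // a constant 'true' and no return is required after the loop.
    static int withConstant() {
        int n = 0;
        while (X + 2 == 7) {
            if (++n == 3) {
                return n;
            }
        }
    }

    // Y + 2 == 7 is just as true at run time, but it is not a CTC, so the
    // compiler demands a return or throw after the loop.
    static int withoutConstant() {
        int n = 0;
        while (Y + 2 == 7) {
            if (++n == 3) {
                return n;
            }
        }
        throw new AssertionError("not reached in practice");
    }

    // 'if' is exempt: the body is accepted although DEBUG is a constant false.
    static String describe(int value) {
        StringBuilder sb = new StringBuilder();
        if (DEBUG) {
            sb.append("[debug] ");
        }
        return sb.append(value).toString();
    }

    public static void main(String[] args) {
        System.out.println(withConstant());    // prints 3
        System.out.println(withoutConstant()); // prints 3
        System.out.println(describe(7));       // prints 7
    }
}
```

Removing the throw from withoutConstant() should produce a missing return statement error, which is the same difference the Boolean.TRUE example shows.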

This explains why e.g. if (1 == 0) System.out.println(); compiles fine, but while (1 == 0) System.out.println(); is a compiler error. Not because the compiler is stupid, or because java is 'old' and somehow the OpenJDK team hasn't bothered to touch the reachability analyser in 25 years. But because the spec says this is how it's supposed to work, for good reasons (perhaps those 'good reasons' now boil down to: It has always worked that way and changing it is backwards incompatible, but that's still a good reason).
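
Applied to the code in the question: x is an ordinary local variable, so x == 0 is not a CTC to begin with, and the continue sits inside an if, which is exempt anyway. The sketch below goes one step further and declares x final, which does make x == 0 a CTC, and it still compiles. The class name QuestionRevisited and the StringBuilder used to collect the output are additions for illustration only.

```java
class QuestionRevisited {

    static String run() {
        StringBuilder out = new StringBuilder();
        final int x = 0; // now a constant variable, so 'x == 0' is a CTC
        loop:
        for (int i = 1; i < 5; i++) {
            for (int j = 1; j < 5; j++) {
                out.append(i);
                if (x == 0) {
                    continue loop;
                }
                // Never executed, yet still 'reachable' by the spec's rules,
                // because the condition of an if is not taken into account.
                out.append(j);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 1234
    }
}
```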

Here's an example of javac in action:

> cat Test.java
class Test {
  public void test() {
    if (false) System.out.println();
  }
}
> javac Test.java
> javap -c -v Test # Show the bytecode of the compiled Test class
 [ ... I removed a whole bunch of irrelevant parts of the decompile ... ]
  public void test();
    descriptor: ()V
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=0, locals=1, args_size=1
         0: return
      LineNumberTable:
        line 4: 0
}
SourceFile: "Test.java"

Note how that System.out.println() call is gone.

NB: In some near-future java release, the notion of CTCs may change and encompass more. The general aim of various skunkworks OpenJDK projects is making e.g. LocalDate.of(2022, 10, 1) be a CTC. Probably with slightly different definitions (to keep backwards compat), so I doubt this will affect reachability analysis rules. The current java release doesn't have any of that, so this answer is good up to at least JDK 19.

CodePudding user response:

See Answer by rzwitserloot - it's actually because Java deliberately excludes if statements from deeper analysis since it's a useful feature.

For reference, the answer which is based on incorrect assumptions:


This is because the compiler has limits on how deep the analysis goes.

if (false) System.out.println("never");

is fine for example because these conditions aren't checked. The compiler assumes that conditionals can be true even if that's impossible in practice.

Code after an unconditional continue, return, or break is known to be unreachable without further analysis.

Newer languages can do that kind of analysis and I think tools like SonarLint could find that as well. Even Java at runtime might optimize that code away but I guess the age / state of static code analysis at the time when Java was formalized is responsible for this behavior.

The if (false) trick is also a fairly common way to disable code for quick testing so adding this now could break a lot of code and I guess that makes it unlikely to get added as a language / compiler feature although it's perfectly doable nowadays.
