Mechanism of type erasure in Java

Time:03-28

I've read the Oracle docs on generics and some reference books, and I still cannot grasp some things about Java type erasure. First of all, why aren't we allowed to write:

public class Gen<T> {
    
    T obj = new T();

    public T getObj() {
        return obj;
    }

    public void setObj(T obj) {
        this.obj = obj;
    }
}

Why doesn't Java allow me to say new T()? I understand that memory for an object of type T is allocated at runtime and that type erasure happens at compile time, but once erasure is done, all of my T's will be replaced with Object, so why is this a big deal?

Also, how is this kind of manipulation with T[] possible: T[] arr = (T[]) new Object[size];

I just can't wrap my head around these things.

Thanks in advance.

I expected it to create Object obj = new Object() and still give me type safety throughout the code, for example when inserting an element or extracting it with a getter. I don't understand why this is not allowed even with type erasure.

CodePudding user response:

"All of my T's will be replaced with Objects, so why is this a big deal?"

Because T can be something other than Object.

class Gen<T> {
   public T obj;
   public Gen() { obj = new T(); /* illegal */ }
   public Gen(T t) { obj = t; /* legal */ }
   // getters and setters are unnecessary complications for this example
}

Gen<Integer> g = new Gen<Integer>();
Integer i = g.obj; // should be safe, but you would make it unsafe
i = i + 5; // uh oh

Gen<Integer> h = new Gen<Integer>(0);
Integer j = h.obj;
j = j + 5;

Type erasure is meant to remove generics while keeping the program's behavior the same: running the program without erasure should give the same results as running it with erasure. Interpreted without erasure, g.obj holds an Integer. Under your proposed erasure, it would hold a plain Object instead, so the assignment to i could not be safe. Further, new T() needs to know what T is in order to work, but erasure removes all runtime knowledge of T, so there is no way to compile new T() under erasure, and it is banned. In contrast, the non-erased and erased versions of the h and j sequence perform the same operations, so that code is allowed.
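To make "erasure removes all runtime knowledge of T" concrete, here is a minimal sketch. The class and method names (ErasureDemo, sameRuntimeClass) are made up for illustration; the point is only that two differently parameterized lists share a single runtime class.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Returns true when two differently parameterized lists share one runtime class.
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        // Both objects are plain ArrayList instances; the type argument is gone.
        return ints.getClass() == strings.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // prints "true"
    }
}
```

Since the runtime object carries no record of its type argument, there is nothing a hypothetical new T() could consult to find the right constructor.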

The thing with the array is a hack and doesn't actually create a T[].

<T> T[] hack(int n) { return (T[])new Object[n]; }
Integer[] is = hack(5); // ClassCastException at runtime

Unchecked casts like (T) or (T[]) are where Java compromises on the "same-behavior" property of erased programs. A non-erased program would fail in hack because the cast would fail. The erased program can't actually perform the cast, so hack succeeds, and the failure is in the variable assignment. As long as an incorrectly cast object is not passed anywhere where the actual type is known, nothing goes wrong. It becomes your responsibility to maintain type safety. The above function, for example, fails to do that. The following example class does it correctly.

class SmallLIFO<T> {
    private T[] buf = (T[])new Object[10]; // take responsibility for maintaining type safety
    private int used = 0; // the Object[]-pretending-to-be-a-T[] is never given to the user, who may know what a T is and expose the lie
    public boolean push(T t) { // this class's public interface only operates on objects that are the right type
        boolean ret = used < 10;
        if(ret) buf[used++] = t;
        return ret;
    }
    public T pop() {
        return used > 0 ? buf[--used] : null; // we'd either need a cast to (T[]) in buf or a cast to (T) here; no avoiding it
    }
}
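To see where the failure from hack actually surfaces, here is a small runnable sketch. The wrapper class UncheckedCastDemo and its two helper methods are hypothetical names added for this illustration; hack itself is the function from above.

```java
class UncheckedCastDemo {
    @SuppressWarnings("unchecked")
    static <T> T[] hack(int n) {
        return (T[]) new Object[n]; // after erasure this cast checks nothing
    }

    // Returns true if assigning the result to Integer[] throws ClassCastException.
    static boolean assignmentFails() {
        try {
            Integer[] is = hack(5); // the compiler inserts a real cast to Integer[] here
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Returns true if hack completes when the caller never claims a specific array type.
    static boolean hackItselfSucceeds() {
        Object o = UncheckedCastDemo.<Integer>hack(5); // no cast is needed for Object
        return o instanceof Object[];
    }

    public static void main(String[] args) {
        System.out.println(hackItselfSucceeds()); // prints "true"
        System.out.println(assignmentFails());    // prints "true"
    }
}
```

This is why SmallLIFO is safe: its Object[] never reaches a place where the compiler would insert a cast to a concrete array type.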

CodePudding user response:

You seem to be saying that every new T() should simply be replaced with new Object(), which is a perfectly valid constructor call. That is true, but is it what new T() is meant to do?

The purpose of new T() is of course not to create a new Object instance, but to create a new instance of T, whatever that may be. And it is exactly because the JVM doesn't know what T is, that it is impossible to create an instance of T.

Suppose Java worked the way you describe and changed every new T() to new Object(), and you had:

public class Foo {
    private int x = 10;

    public Foo() { System.out.println("Hello"); }

    public static <T> T magicallyCreateT() { 
        return new T(); 
    }

    public int getX() { return x; }
}

What would a reasonable person expect if I did this?

Foo foo = Foo.magicallyCreateT();
System.out.println(foo.getX());

From a type-checking perspective, that snippet looks completely normal, doesn't it?

They would expect Hello to be printed and foo.getX() to return 10, wouldn't they? But since the Object constructor is called, not Foo's, no Hello is printed, and since magicallyCreateT returns an instance of Object, you wouldn't even be able to call getX on foo. There's no getX method in the Object class! I'd imagine the program would throw a ClassCastException at runtime.

So you see there are lots of problems if you just "create an Object", when you say "I want to create a T", so it is not allowed to do things like new T().
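If you really do need "a new T", one common way around the restriction is to let the caller say how to build it, for example with a java.util.function.Supplier. This is only a sketch of that idea, not something from the question: the names SupplierDemo and create are made up, and Foo is a trimmed copy of the class above.

```java
import java.util.function.Supplier;

class SupplierDemo {
    static class Foo {
        private int x = 10;

        Foo() { System.out.println("Hello"); }

        int getX() { return x; }
    }

    // The caller supplies the constructor, so no runtime knowledge of T is needed here.
    static <T> T create(Supplier<T> constructor) {
        return constructor.get();
    }

    public static void main(String[] args) {
        Foo foo = create(Foo::new);     // runs Foo's constructor, prints "Hello"
        System.out.println(foo.getX()); // prints "10"
    }
}
```

Here the constructor reference is resolved at the call site, where T is known to be Foo, so erasure inside create does no harm.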

For the case of (T[])new Object[], it is different. You are explicitly saying that you are creating an Object[], and you are casting it to T[]. In the same way, you can also do (T)new Object(). In both cases, you'd get a ClassCastException if something goes wrong later down the line, like the scenario above. In the same way that you can't do new T(), you can't do new T[] either!
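If you need an array whose runtime type really is T[], one option is to pass a Class<T> token and use java.lang.reflect.Array.newInstance. A minimal sketch, with the hypothetical names TypedArrayDemo and newArray:

```java
import java.lang.reflect.Array;

class TypedArrayDemo {
    // The Class<T> token carries the component type that erasure would otherwise drop.
    @SuppressWarnings("unchecked")
    static <T> T[] newArray(Class<T> componentType, int n) {
        // Array.newInstance returns Object, so an unchecked cast is still required,
        // but the array it creates has the requested component type.
        return (T[]) Array.newInstance(componentType, n);
    }

    public static void main(String[] args) {
        Integer[] is = newArray(Integer.class, 5); // no ClassCastException this time
        System.out.println(is.length);             // prints "5"
    }
}
```

The cast is still unchecked as far as the compiler is concerned, but this time the "trust me" is justified because the array was created with the right component type.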

Whenever you're casting with a type parameter like this, you're basically telling the compiler, "Trust me, I know what I'm doing."
