JAVA: How an Object's methods are used and stored

Time:03-26

My question is for Java.
Question: Let's say that in my main method I have the line LinkedList<E> myLinkedList1 = new LinkedList<>(), so I now have a reference/pointer variable named myLinkedList1 to an object (the constructor lives in its own LinkedList class, in a different .java file from the one that contains the main method). Now I create another reference/pointer variable named myLinkedList2. I call the method addLast(E newElement) (which is defined in the LinkedList class, of course), but only on myLinkedList1, i.e. myLinkedList1.addLast(newElement). How does the JVM know to apply this method only to myLinkedList1 and not to myLinkedList2? Are the object's methods stored on the heap with it? I thought they were put on the stack.
If I need to make any clarifications please let me know!

CodePudding user response:

You can think of the object you're calling a method on (i.e. the thing before the .) as an extra argument to the function. So, conceptually, you can think of myLinkedList1.addLast(elt) as being somewhat like

LinkedList.addLast(myLinkedList1, elt)

so the "invocant" is an additional piece of information passed to the method. Some languages make this explicit. For instance, in Lua, foo:bar(1) is exactly equivalent to foo.bar(foo, 1), and in Python foo.bar(1) is roughly equivalent to Foo.bar(foo, 1), where Foo is the class of foo. In Java this all happens in the background and the details are a bit more complicated, but conceptually it's the same idea.
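To make that concrete, here is a minimal sketch using java.util.LinkedList. The static addLast helper and the class name ReceiverDemo are made up for illustration; they are not part of the JDK. The helper just spells out the receiver as an ordinary first parameter.

```java
import java.util.LinkedList;

public class ReceiverDemo {
    // Hypothetical helper: the receiver is written out as an explicit first parameter.
    static <E> void addLast(LinkedList<E> receiver, E element) {
        receiver.addLast(element);
    }

    public static void main(String[] args) {
        LinkedList<String> myLinkedList1 = new LinkedList<>();
        LinkedList<String> myLinkedList2 = new LinkedList<>();

        myLinkedList1.addLast("a");   // receiver passed implicitly (the thing before the dot)
        addLast(myLinkedList1, "b");  // receiver passed explicitly

        System.out.println(myLinkedList1); // [a, b]
        System.out.println(myLinkedList2); // []
    }
}
```

Either way, the method only ever sees the list it was handed, so myLinkedList2 is never touched.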

CodePudding user response:

An object, in memory, contains the following information:

  • A pointer that points at the actual class that this object is an instance of.
  • Enough room for all the fields. Given that Java is reference based, every field holds either a primitive or a reference, so each one is at most 64 bits - they're all fixed size, so this is not complicated.
  • Other stuff which isn't relevant for your question.

Crucially they do not contain any methods, at all.

"I thought they were put on the stack."

Methods? On the stack? That makes no sense. You must be misinformed. Methods aren't on the stack, and they aren't stored on the heap with your objects either. They live as singletons in the class definition, which is loaded only once for any class. Where exactly that definition lives is a JVM implementation detail: on HotSpot it is a separate area for class metadata (PermGen in older versions, Metaspace in native memory since Java 8), and, crucially, it is nowhere near the heap space dedicated to storing your objects. That area stores the definition of the class (the bytecode, plus whatever optimized form the JIT compiler has produced from it). No matter how many LinkedList instances you make, there's only one LinkedList class, so 1 million LinkedList instances still means you only have the actual body of the addLast method stored once in memory. Yes, addLast is an instance method. There's still only one copy of it in memory (unlike instance fields; each instance has its own copy of each non-static field).

Any given class is loaded at most once per class loader, which in an ordinary application means once for the entire JVM (why load it more than once? These things are constant, it'd be a waste of memory). A class contains all its methods (instance and static).
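You can observe the "one class, many instances" relationship from plain Java. A small sketch (the class name SharedClassDemo and the helper sameClassObject are made up for this example):

```java
import java.util.LinkedList;

public class SharedClassDemo {
    // True when both objects point at the very same loaded Class object.
    static boolean sameClassObject(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        LinkedList<String> myLinkedList1 = new LinkedList<>();
        LinkedList<String> myLinkedList2 = new LinkedList<>();

        // Two distinct objects on the heap...
        System.out.println(myLinkedList1 == myLinkedList2);                // false
        // ...that share a single class definition, and with it a single addLast.
        System.out.println(sameClassObject(myLinkedList1, myLinkedList2)); // true
    }
}
```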

In fact, as far as the JVM is concerned, there is very little difference between static and non-static methods. An instance method simply takes its 'receiver' as an implicit first argument (the main difference is that the call is dispatched on the receiver's actual class, as described below) - for example, String's toLowerCase() method is effectively a method that takes 1 argument, of type String. There is very little difference between:

public String toLowerCase() {
  return this.doTheThing();
}

and

public static String toLowerCase(String in) {
  return in.doTheThing();
}
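Java's method references expose this equivalence directly: an unbound reference to an instance method is typed as a function whose first parameter is the receiver. A minimal sketch (the class name MethodRefDemo and the helper lowerViaFunction are just for illustration):

```java
import java.util.LinkedList;
import java.util.function.BiConsumer;
import java.util.function.Function;

public class MethodRefDemo {
    // Calls String.toLowerCase() through a function that takes the receiver as its argument.
    static String lowerViaFunction(String input) {
        Function<String, String> lower = String::toLowerCase;
        return lower.apply(input);
    }

    public static void main(String[] args) {
        System.out.println(lowerViaFunction("ABC")); // abc

        // addLast(E) becomes a two-argument function: (receiver, element).
        BiConsumer<LinkedList<String>, String> adder = LinkedList::addLast;
        LinkedList<String> list = new LinkedList<>();
        adder.accept(list, "x");
        System.out.println(list); // [x]
    }
}
```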

So, when you write, in Java, foo.bar();, you get 2 separate steps: First, javac turns it into bytecode, which is stored in a class file. Then, 5 days later on a completely different machine, someone runs your class file, and the JVM sees the bytecode and runs it.

javac first tries to figure out which precise bar() you are calling there, by checking what the type of foo is. Once javac figures that out, you end up with the bytecode:

INVOKEVIRTUAL com.pkg.FullTypeOfWhateverFooIsThere :: bar :: ()V

That third bit is the 'signature' (strictly speaking the method descriptor: the parameter types and the return type, which at the bytecode level are an inherent part of a method's identity; ()V means no parameters, returning void). That's it - the arguments to every call are on the stack. This particular method has one argument (the receiver - something that is an instance of com.pkg.FullTypeOfWhateverFooIsThere), which will have to be on the stack. javac ensures that it is. The JVM checks the bytecode, and if it can't confirm this, it will reject the class file with a VerifyError (this normally cannot happen unless you manually mess with the bytecode, or have a corrupt disk).

Then, the JVM 'follows the pointer' and checks what that first argument's actual type is, and will then find the loaded class that represents that precise type. It then checks that class (and not com.pkg.FullTypeOfWhateverFooIsThere - at least, not if the actual class is a subclass) for a method named bar with signature ()V. If it finds it, it runs it. If it doesn't, it goes up one class in the hierarchy and keeps looking for bar::()V until it finds it (which it will, otherwise your code wouldn't have compiled in the first place).
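That lookup can be imitated with reflection. To be clear, this is only a conceptual sketch: real JVMs precompute dispatch tables rather than searching on every call, and the names DispatchDemo, Base, Derived, baz and findDeclaringClass are invented for the example.

```java
public class DispatchDemo {
    static class Base {
        public String bar() { return "Base.bar"; }
        public String baz() { return "Base.baz"; }
    }

    static class Derived extends Base {
        @Override
        public String bar() { return "Derived.bar"; }
    }

    // Start at the receiver's actual class and walk up the hierarchy until a
    // class that declares a no-argument method with this name is found.
    static Class<?> findDeclaringClass(Object receiver, String methodName) {
        for (Class<?> c = receiver.getClass(); c != null; c = c.getSuperclass()) {
            try {
                c.getDeclaredMethod(methodName);
                return c;
            } catch (NoSuchMethodException e) {
                // Not declared here; keep looking in the superclass.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Base foo = new Derived(); // static type Base, actual type Derived

        System.out.println(foo.bar());                                       // Derived.bar
        System.out.println(findDeclaringClass(foo, "bar").getSimpleName());  // Derived
        System.out.println(findDeclaringClass(foo, "baz").getSimpleName());  // Base
    }
}
```

The static type of foo only matters to javac; at run time the search starts from whatever class the object actually is.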

When the code for addLast runs, there are 2 things on the stack as that method begins execution:

  1. An instance of LinkedList or some subclass thereof.
  2. A new element of type Object. (at the JVM level, generics are erased).

The method can do its work just fine with that; LinkedLists have fields that store this data, and the code of addLast will interact with these fields in order to do what its javadoc says it should do. As an example, take a simple singly linked implementation (such as one you might write yourself): the list has a 'head' field that points at a node that contains a reference to the object (that'll be the first object in the list) and a pointer to another node. The addLast code keeps looping, grabbing the 'next pointer', until the next pointer is null. At that point it makes a new Node object, sets its 'value' to that second thing on the stack, and then updates the 'next' pointer of the last visited node to point at this newly created one, and then it is done. (java.util.LinkedList itself is doubly linked and also keeps a reference to its last node, so it doesn't need the loop, but the principle is the same.)
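A minimal sketch of such a singly linked list, assuming it is a class you wrote yourself (SimpleLinkedList and its Node class are made-up names; this is not how java.util.LinkedList is implemented):

```java
public class SimpleLinkedList<E> {
    // One node per element: the value plus a pointer to the next node.
    private static final class Node<E> {
        final E value;
        Node<E> next;

        Node(E value) {
            this.value = value;
        }
    }

    private Node<E> head; // per-instance field: each list object has its own

    public void addLast(E newElement) {
        Node<E> node = new Node<>(newElement);
        if (head == null) {           // empty list: the new node becomes the head
            head = node;
            return;
        }
        Node<E> current = head;
        while (current.next != null) { // follow 'next' until the last node
            current = current.next;
        }
        current.next = node;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("[");
        for (Node<E> n = head; n != null; n = n.next) {
            sb.append(n.value);
            if (n.next != null) {
                sb.append(", ");
            }
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        SimpleLinkedList<String> myLinkedList1 = new SimpleLinkedList<>();
        SimpleLinkedList<String> myLinkedList2 = new SimpleLinkedList<>();
        myLinkedList1.addLast("a");
        myLinkedList1.addLast("b");
        System.out.println(myLinkedList1); // [a, b]
        System.out.println(myLinkedList2); // []
    }
}
```

The code of addLast exists once, but head is read through this, so each call works on the fields of whichever list was the receiver.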

Hence:

  • All that the JVM needs to 'find' the addLast code is to know which method was intended (which is in the bytecode), and a pointer to the singleton 'loaded class' in memory for the actual type of the receiver, which sits on the stack just below the parameters when the INVOKEVIRTUAL instruction is executed. That is easy, as all objects have a pointer to their class, so it's just a matter of looking it up. Thus, the JVM can execute addLast.

  • All that the addLast code needs in order to know which list to operate on is... the list. The receiver is passed as first parameter: foo.addLast(elem) ends up being called with the stack having first foo and then elem on it. This'd be no different from a static method with the signature addLast(LinkedList<E> list, E elem) - any invocation of such a method would have 2 things on the stack as well when it begins execution (static methods don't have a receiver, they just have their params on the stack).
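The java.lang.invoke API lets you see this shape from within Java. In the sketch below (HandleDemo and addLastHandle are names made up for the example), looking up LinkedList.addLast as a virtual method yields a handle that takes 2 arguments, the receiver first and then the (erased) element:

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.LinkedList;

public class HandleDemo {
    // Looks up LinkedList.addLast; at the JVM level its descriptor is (Object)void.
    static MethodHandle addLastHandle() {
        try {
            return MethodHandles.lookup().findVirtual(
                    LinkedList.class, "addLast",
                    MethodType.methodType(void.class, Object.class));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Throwable {
        MethodHandle addLast = addLastHandle();

        // The handle's type has the receiver prepended as its first parameter.
        System.out.println(addLast.type().parameterCount()); // 2

        LinkedList<String> myLinkedList1 = new LinkedList<>();
        LinkedList<String> myLinkedList2 = new LinkedList<>();
        addLast.invoke(myLinkedList1, "x"); // receiver first, then the element

        System.out.println(myLinkedList1); // [x]
        System.out.println(myLinkedList2); // []
    }
}
```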
