Consider the following:
interface ISomething
{
    void Call(string arg);
}

sealed class A : ISomething
{
    public void Call(string arg) => Console.WriteLine($"A, {arg}");
}

sealed class Caller<T> where T : ISomething
{
    private readonly T _something;
    public Caller(T something) => _something = something;
    public void Call() => _something.Call("test");
}

new Caller<A>(new A()).Call();
Both the call to Caller<A>.Call and its nested call to A.Call are emitted as callvirt instructions.
But why? Both types are exactly known. Unless I'm misunderstanding something, shouldn't it be possible to use call rather than callvirt here?
If so - why is this not done? Is that merely an optimisation not done by the compiler, or is there any specific reason behind this?
CodePudding user response:
You're missing two things.
The first is that callvirt does a null-check on the receiver, whereas call does not. This means that using callvirt on a null receiver will raise a NullReferenceException, whereas call will happily call the method and pass null as the first parameter, meaning that the method will get a this parameter which is null.
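The difference can be observed by emitting both instructions by hand. The following is a minimal sketch, assuming a runtime where System.Reflection.Emit is available (for example, not Native AOT). Greeter, NullReceiverDemo and Build are hypothetical names made up for this illustration.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public sealed class Greeter
{
    // Non-virtual and never touches instance state, so a null 'this' goes unnoticed.
    public string Hello() => "hello";
}

public static class NullReceiverDemo
{
    // Builds a tiny method: ldarg.0; <op> Greeter::Hello; ret
    public static Func<Greeter, string> Build(OpCode op)
    {
        MethodInfo target = typeof(Greeter).GetMethod(nameof(Greeter.Hello));
        var method = new DynamicMethod(
            "Invoke_" + op.Name, typeof(string), new[] { typeof(Greeter) }, typeof(Greeter).Module);
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);   // push the (possibly null) receiver
        il.Emit(op, target);        // either 'call' or 'callvirt'
        il.Emit(OpCodes.Ret);
        return (Func<Greeter, string>)method.CreateDelegate(typeof(Func<Greeter, string>));
    }

    public static void Main()
    {
        Func<Greeter, string> viaCall = Build(OpCodes.Call);
        Func<Greeter, string> viaCallvirt = Build(OpCodes.Callvirt);

        // 'call' performs no null-check: the method body runs with a null 'this'.
        Console.WriteLine(viaCall(null));

        // 'callvirt' null-checks the receiver before dispatching.
        try { viaCallvirt(null); }
        catch (NullReferenceException) { Console.WriteLine("callvirt threw NullReferenceException"); }
    }
}
```

Hello is deliberately written so that it never dereferences this. A method that reads a field would fail inside its own body instead, which is the confusing behaviour described next.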
Sound surprising? It is. IIRC in very early .NET versions call was used in the way you suggest, and people got very confused about how this could be null inside a method. The compiler switched to callvirt to force the runtime to do a null-check upfront.
There are only a handful of places where the compiler will emit a call:
- Static methods.
- Non-virtual struct methods.
- Calling a base method or base constructor (where we know the receiver is not null, and we also explicitly do not want to make a virtual call).
- Where the compiler is certain that the receiver is not null, e.g. foo?.Method() where Method is non-virtual.
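The cases in the list above can be put side by side in source form. This is only an illustrative sketch: the types (Base, Derived, Point, EmitSites) are hypothetical, and the comments state what the C# compiler is expected to emit according to the list, so it is worth confirming with a disassembler for your compiler version.

```csharp
using System;

public class Base
{
    public static void StaticMethod() { }
    public void NonVirtual() { }
    public virtual void Virtual() { }
}

public sealed class Derived : Base
{
    // Base call: 'this' is known to be non-null and virtual dispatch is unwanted -> call
    public override void Virtual() => base.Virtual();
}

public struct Point
{
    public int X;
    public int DoubleX() => X * 2;   // non-virtual struct method
}

public static class EmitSites
{
    public static int Run(Base receiver, Base maybeNull)
    {
        Base.StaticMethod();          // static method                 -> call
        var p = new Point { X = 21 };
        int doubled = p.DoubleX();    // non-virtual struct method     -> call
        maybeNull?.NonVirtual();      // already null-checked by ?.    -> call
        receiver.NonVirtual();        // ordinary instance call        -> callvirt (for the null-check)
        receiver.Virtual();           // virtual instance call         -> callvirt
        return doubled;
    }
}
```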
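That last point in particular means that making a method virtual is a binary-breaking change: callers already compiled with call keep binding non-virtually until they are recompiled, so they bypass any override.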
Just for fun, see this check for this == null in String.Equals.
The second thing is that _something.Call("test"); is not a virtual call, it's a constrained virtual call. There's a constrained prefix opcode (written constrained. in IL) which appears immediately before the callvirt.
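One way to check for the prefix without a disassembler is to look at the raw IL bytes through reflection. The sketch below is a deliberately naive scan, assuming MethodBase.GetMethodBody is usable on your runtime (it may not be under trimming or AOT); ConstrainedScan and HasConstrainedCallvirt are hypothetical names for this illustration. It looks for the two-byte constrained. opcode, a 4-byte type token, and then callvirt.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public interface ISomething { void Call(string arg); }

public sealed class Caller<T> where T : ISomething
{
    private readonly T _something;
    public Caller(T something) => _something = something;
    public void Call() => _something.Call("test");
}

public static class ConstrainedScan
{
    // Naive pattern match: constrained. <4-byte token> callvirt
    // It does not decode the instruction stream, so operand bytes could in theory match by accident.
    public static bool HasConstrainedCallvirt(MethodInfo method)
    {
        byte[] il = method.GetMethodBody()?.GetILAsByteArray() ?? Array.Empty<byte>();
        ushort prefix = unchecked((ushort)OpCodes.Constrained.Value); // two-byte opcode
        byte first = (byte)(prefix >> 8);
        byte second = (byte)(prefix & 0xFF);
        byte callvirt = (byte)OpCodes.Callvirt.Value;

        for (int i = 0; i + 6 < il.Length; i++)
        {
            if (il[i] == first && il[i + 1] == second && il[i + 6] == callvirt)
                return true;
        }
        return false;
    }

    public static void Main()
    {
        MethodInfo call = typeof(Caller<>).GetMethod("Call");
        Console.WriteLine(HasConstrainedCallvirt(call));
    }
}
```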
Constrained virtual calls were introduced with generics. The problem is that method calls on classes and on structs are a bit different:
- For classes, you load the class reference (e.g. with ldloc), then use call/callvirt.
- For structs, you load the address of the struct (e.g. with ldloca), then use call.
- To call an interface method on a struct, or a method defined on object, you need to load the struct value (e.g. with ldloc), box it, then use call/callvirt.
If a generic type is unconstrained (i.e. it could be a class or a struct), the compiler doesn't know what to do: should it use ldloc or ldloca? Should it box or not? call or callvirt?
Constrained virtual calls move this responsibility to the runtime. To quote the doc above:
When a callvirt method instruction has been prefixed by constrained thisType, the instruction is executed as follows:
- If thisType is a reference type (as opposed to a value type) then ptr is dereferenced and passed as the 'this' pointer to the callvirt of method.
- If thisType is a value type and thisType implements method then ptr is passed unmodified as the 'this' pointer to a call method instruction, for the implementation of method by thisType.
- If thisType is a value type and thisType does not implement method then ptr is dereferenced, boxed, and passed as the 'this' pointer to the callvirt method instruction.
This last case can occur only when method was defined on System.Object, System.ValueType, or System.Enum and not overridden by thisType. In this case, the boxing causes a copy of the original object to be made. However, because none of the methods of System.Object, System.ValueType, and System.Enum modify the state of the object, this fact cannot be detected.
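The second case in the quote (a value type that implements the method itself) is observable from C#, because a mutating interface method then acts on the original struct rather than on a boxed copy. A small sketch, with hypothetical names (ICounter, Counter, BoxingDemo):

```csharp
using System;

public interface ICounter
{
    void Increment();
    int Value { get; }
}

public struct Counter : ICounter
{
    private int _count;
    public void Increment() => _count++;
    public int Value => _count;
}

public static class BoxingDemo
{
    // The receiver has generic type T, so this is a constrained call:
    // for a struct that implements the method, the address is passed through and nothing is boxed.
    public static void IncrementGeneric<T>(ref T counter) where T : ICounter => counter.Increment();

    // The parameter is an interface reference: a struct argument is boxed at the call site,
    // so Increment mutates the boxed copy only.
    public static void IncrementBoxed(ICounter counter) => counter.Increment();

    public static void Main()
    {
        var c = new Counter();
        IncrementGeneric(ref c);
        Console.WriteLine(c.Value); // 1: mutated in place
        IncrementBoxed(c);
        Console.WriteLine(c.Value); // still 1: only the boxed copy changed
    }
}
```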