Why is callvirt used to call a method on a readonly field of generic type

Consider the following:

using System;

// Top-level statements must come before any type declarations.
new Caller<A>(new A()).Call();

interface ISomething
{
    void Call(string arg);
}

sealed class A : ISomething
{
    public void Call(string arg) => Console.WriteLine($"A, {arg}");
}

sealed class Caller<T> where T : ISomething
{
    private readonly T _something;
    public Caller(T something) => _something = something;
    public void Call() => _something.Call("test");
}

Both the call to Caller<A>.Call and its nested call to A.Call are emitted as callvirt instructions.

But why? Both types are exactly known. Unless I'm misunderstanding something, shouldn't it be possible to use call rather than callvirt here?

If so - why is this not done? Is that merely an optimisation not done by the compiler, or is there any specific reason behind this?

Answer:

You're missing two things.

The first is that callvirt does a null-check on the receiver, whereas call does not. This means that using callvirt on a null receiver will raise a NullReferenceException, whereas call will happily call the method and pass null as the first parameter, meaning that the method will get a this parameter which is null.

Sound surprising? It is. IIRC in very early .NET versions call was used in the way you suggest, and people got very confused about how this could be null inside a method. The compiler switched to callvirt to force the runtime to do a null-check upfront.
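The difference can be observed without the C# compiler getting in the way. The following is a minimal sketch, assuming a runtime where System.Reflection.Emit.DynamicMethod is available; Greeter, Describe and BuildInvoker are hypothetical names introduced only for illustration. It emits the same instance call twice, once with call and once with callvirt, and invokes both with a null receiver.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public sealed class Greeter
{
    // Hypothetical method that never touches 'this', so its body can
    // complete even when the receiver is null.
    public string Describe() => "Describe ran";
}

public static class NullReceiverDemo
{
    // Builds a delegate whose body is: ldarg.0; <callOp> Greeter::Describe; ret
    public static Func<Greeter, string> BuildInvoker(OpCode callOp)
    {
        MethodInfo target = typeof(Greeter).GetMethod(nameof(Greeter.Describe));
        var method = new DynamicMethod(
            "Invoke", typeof(string), new[] { typeof(Greeter) }, typeof(Greeter));
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(callOp, target);
        il.Emit(OpCodes.Ret);
        return (Func<Greeter, string>)method.CreateDelegate(typeof(Func<Greeter, string>));
    }

    public static void Main()
    {
        Func<Greeter, string> viaCall = BuildInvoker(OpCodes.Call);
        Func<Greeter, string> viaCallvirt = BuildInvoker(OpCodes.Callvirt);

        // 'call' performs no null check, so the method body runs with a null 'this'.
        Console.WriteLine(viaCall(null));

        try
        {
            // 'callvirt' checks the receiver first and throws before the body runs.
            viaCallvirt(null);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("callvirt threw NullReferenceException");
        }
    }
}
```

If Describe read a field, the call variant would also fail, but only at the point inside the method where 'this' is dereferenced, which is exactly the confusing behaviour described above.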

There are only a handful of places where the compiler will emit a call:

Static methods.
Non-virtual struct methods.
Calling a base method or base constructor (where we know the receiver is not null, and we also explicitly do not want to make a virtual call).
Where the compiler is certain that the receiver is not null, e.g. foo?.Method() where Method is non-virtual.

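That last point in particular means that making a non-virtual method virtual is a binary-breaking change: callers that were already compiled with call keep invoking the original method non-virtually, bypassing any override, until they are recompiled.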
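As a rough illustration of the list above, the following sketch annotates a few call sites with the instruction the compiler is described as emitting. All type and member names are hypothetical, and the opcode comments simply restate the claims in this answer; the exact IL can vary with compiler version, so it is worth confirming with a disassembler.

```csharp
using System;

public class Base
{
    public virtual string Name() => "Base";
    public string Plain() => "plain";
}

public sealed class Derived : Base
{
    // base.Name(): call, because a virtual dispatch here would recurse into this override.
    public override string Name() => "Derived:" + base.Name();
}

public struct Point
{
    public int X;
    public int DoubleX() => X * 2;
}

public static class CallSites
{
    public static int Twice(int x) => x * 2;

    public static string Run(Base receiver)
    {
        int a = Twice(21);             // static method: call
        var p = new Point { X = 4 };
        int b = p.DoubleX();           // non-virtual struct method: call
        string c = receiver.Name();    // virtual method: callvirt
        string d = receiver.Plain();   // non-virtual class method: callvirt, for the null check
        string e = receiver?.Plain();  // receiver already null-checked by ?. : call
        return $"{a} {b} {c} {d} {e}";
    }

    public static void Main()
    {
        Console.WriteLine(Run(new Derived()));
    }
}
```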

Just for fun, see this check for this == null in String.Equals.

The second thing is that _something.Call("test"); is not a plain virtual call, it's a constrained virtual call: a constrained. prefix appears immediately before the callvirt.

Constrained virtual calls were introduced with generics. The problem is that method calls on classes and on structs are a bit different:

For classes, you load the class reference (e.g. with ldloc), then use call / callvirt.
For structs, you load the address of the struct (e.g. with ldloca), then use call.
To call an interface method on a struct, or a method defined on object, you need to load the struct value (e.g. with ldloc), box it, then use call / callvirt.

If a generic type parameter is not constrained to be specifically a class or a struct (i.e. it could be either), the compiler doesn't know what to do: should it use ldloc or ldloca? Should it box or not? call or callvirt?

Constrained virtual calls move this responsibility to the runtime. To quote the documentation for the constrained prefix (OpCodes.Constrained):

When a callvirt method instruction has been prefixed by constrained thisType, the instruction is executed as follows:

If thisType is a reference type (as opposed to a value type) then ptr is dereferenced and passed as the 'this' pointer to the callvirt of method.

If thisType is a value type and thisType implements method then ptr is passed unmodified as the 'this' pointer to a call method instruction, for the implementation of method by thisType.

If thisType is a value type and thisType does not implement method then ptr is dereferenced, boxed, and passed as the 'this' pointer to the callvirt method instruction.

This last case can occur only when method was defined on System.Object, System.ValueType, or System.Enum and not overridden by thisType. In this case, the boxing causes a copy of the original object to be made. However, because none of the methods of System.Object, System.ValueType, and System.Enum modify the state of the object, this fact cannot be detected.
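The practical effect of the rules quoted above can be seen from C#. This is a minimal sketch with hypothetical types (ICounter, StructCounter, ClassCounter): the same generic method works for both a struct and a class, and for the struct it operates on the original value rather than on a boxed copy.

```csharp
using System;

public interface ICounter
{
    void Increment();
    int Count { get; }
}

public struct StructCounter : ICounter
{
    private int _count;
    public void Increment() => _count++;
    public int Count => _count;
}

public sealed class ClassCounter : ICounter
{
    private int _count;
    public void Increment() => _count++;
    public int Count => _count;
}

public static class ConstrainedDemo
{
    // One method body for any T; the interface call goes through a constrained callvirt.
    public static void BumpGeneric<T>(ref T counter) where T : ICounter => counter.Increment();

    // Takes the interface type, so passing a struct boxes a copy at the call site.
    public static void BumpBoxed(ICounter counter) => counter.Increment();

    public static void Main()
    {
        var s = new StructCounter();
        BumpGeneric(ref s);          // mutates s in place, no boxing
        BumpBoxed(s);                // mutates a boxed copy, s is unchanged
        Console.WriteLine(s.Count);  // prints 1

        var c = new ClassCounter();
        BumpGeneric(ref c);          // same generic code, ordinary virtual dispatch
        BumpBoxed(c);                // reference type: no copy is made
        Console.WriteLine(c.Count);  // prints 2
    }
}
```

The struct case corresponds to the second rule in the quote (the pointer is passed unmodified to the struct's own implementation), and the class case to the first.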