Why does passing elements of an array by reference explicitly cause assignment operations in IL?

Time:11-11

I created the following SSCCE:

Imports System.Collections

Module Module1

    Sub Main()
        Dim oList As ArrayList = New ArrayList()
        oList.Add(New Object())
        For Each o As Object In oList
            subA(oList)
        Next
    End Sub

    Private Sub subA(ByRef oList As ArrayList)
        subB(oList(0))
    End Sub

    Private Sub subB(ByRef oObj As Object)
        oObj.ToString()
    End Sub

End Module

This code compiles down to the following IL (shown here decompiled to C#):

[StandardModule]
internal sealed class Module1
{
    [STAThread]
    public static void Main()
    {
        ArrayList oList = new ArrayList();
        oList.Add(RuntimeHelpers.GetObjectValue(new object()));
        IEnumerator enumerator = default(IEnumerator);
        try
        {
            enumerator = oList.GetEnumerator();
            while (enumerator.MoveNext())
            {
                object o = RuntimeHelpers.GetObjectValue(enumerator.Current);
                subA(ref oList);
            }
        }
        finally
        {
            if (enumerator is IDisposable)
            {
                (enumerator as IDisposable).Dispose();
            }
        }
    }

    private static void subA(ref ArrayList oList)
    {
        ArrayList obj = oList;
        object oObj = RuntimeHelpers.GetObjectValue(obj[0]);
        subB(ref oObj);
        obj[0] = RuntimeHelpers.GetObjectValue(oObj);
    }

    private static void subB(ref object oObj)
    {
        oObj.ToString();
    }
}

Take note of the assignment to obj[0] at the end of subA(ArrayList).

I ask because a fellow developer asked me to look at an error they were getting in a workflow involving custom code: a collection was being modified while it was being iterated, even though the source code appeared to perform only get operations on it. I determined that the error was introduced by the explicit use of ByRef. Indeed, if I remove the ByRef keyword from subB's signature, the generated IL looks like this:

[StandardModule]
internal sealed class Module1
{
    [STAThread]
    public static void Main()
    {
        ArrayList oList = new ArrayList();
        oList.Add(RuntimeHelpers.GetObjectValue(new object()));
        IEnumerator enumerator = default(IEnumerator);
        try
        {
            enumerator = oList.GetEnumerator();
            while (enumerator.MoveNext())
            {
                object o = RuntimeHelpers.GetObjectValue(enumerator.Current);
                subA(ref oList);
            }
        }
        finally
        {
            if (enumerator is IDisposable)
            {
                (enumerator as IDisposable).Dispose();
            }
        }
    }

    private static void subA(ref ArrayList oList)
    {
        subB(RuntimeHelpers.GetObjectValue(oList[0]));
    }

    private static void subB(object oObj)
    {
        oObj.ToString();
    }
}

Note that there is now no assignment. I don't entirely understand this behavior, but it seems like a painful gotcha for developers, and it clearly was in my case. Could someone explain why the IL is generated this way? Shouldn't these two variants of the source compile to identical IL, given that I am passing reference types around exclusively? Aren't they all passed by reference anyway? Any information that helps me understand the mechanism(s) at play here would be appreciated.

CodePudding user response:

Let's take a look at the VB.Net specification to see what's going on:

9.2.5.2 Reference Parameters

Reference parameters act in two modes, either as aliases or through copy-in copy-back.

Aliases. A reference parameter is used when the parameter acts as an alias for a caller-provided argument. A reference parameter does not itself define a variable, but rather refers to the variable of the corresponding argument. Modifications of a reference parameter directly and immediately impact the corresponding argument.

Copy-in copy-back. If the type of the variable being passed to a reference parameter is not compatible with the reference parameter's type, or if a non-variable (e.g. a property) is passed as an argument to a reference parameter, or if the invocation is late-bound, then a temporary variable is allocated and passed to the reference parameter. The value being passed in will be copied into this temporary variable before the method is invoked and will be copied back to the original variable (if there is one and if it's writable) when the method returns. Thus, a reference parameter may not necessarily contain a reference to the exact storage of the variable being passed in, and any changes to the reference parameter may not be reflected in the variable until the method exits.
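
The two modes can be told apart by looking at the caller's storage while the called method is still running. The following is only a minimal sketch; Holder, Field, Value, SetterCalls, Bump and CopyBackDemo are hypothetical names made up for illustration and do not appear in the question.

```vb
Imports System

Module CopyBackDemo

    ' Hypothetical holder: Field is a variable, Value is a property.
    Public Class Holder
        Public Field As Integer = 1
        Public SetterCalls As Integer = 0

        Private _value As Integer = 1

        Public Property Value As Integer
            Get
                Return _value
            End Get
            Set(newValue As Integer)
                SetterCalls += 1
                _value = newValue
            End Set
        End Property
    End Class

    Public h As New Holder()

    ' Writes through the ByRef parameter, then reports what the caller's
    ' storage holds while the method is still running.
    Private Function Bump(ByRef n As Integer) As String
        n += 10
        Return "Field=" & h.Field & ", Value=" & h.Value
    End Function

    Sub Main()
        ' A field is a variable, so n aliases it directly.
        Console.WriteLine("field, during call:    " & Bump(h.Field))
        ' A property is not a variable: get into a temporary, call, then set.
        Console.WriteLine("property, during call: " & Bump(h.Value))
        Console.WriteLine("after both calls:      Field=" & h.Field & ", Value=" & h.Value)
        Console.WriteLine("setter calls:          " & h.SetterCalls)
    End Sub

End Module
```

If the behavior matches the specification text above, the field should already read 11 inside Bump, while the property should still read 1 during the call and only become 11 once Bump has returned, with the setter running exactly once.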

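So, because oList(0) is not a variable but a call to ArrayList's default Item property, it falls under the "non-variable (e.g. a property)" case: there is no storage location that the compiler could pass straight through as a ref object.

So it uses copy-in copy-back instead: it reads the element into a temporary, passes the temporary by reference, and writes the temporary back through the property setter after the call. That write is the assignment you see in subA, and it is what counts as modifying the collection during enumeration.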
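
To make the consequence concrete, here is a hedged sketch that reruns the question's scenario three ways. Touch plays the role of subB; EnumerationDemo, Completes, PassDirect, PassParenthesized and PassLocal are hypothetical names introduced only for this illustration. It relies on two things: ArrayList's enumerator throws InvalidOperationException once the list has been written to, and VB treats an argument wrapped in an extra pair of parentheses as a value, so nothing is copied back.

```vb
Imports System
Imports System.Collections

Module EnumerationDemo

    ' Same shape as subB in the question: ByRef, but never assigns.
    Private Sub Touch(ByRef oObj As Object)
        oObj.ToString()
    End Sub

    ' Property passed ByRef: get, call, then set (copy-in copy-back).
    Public Sub PassDirect(oList As ArrayList)
        Touch(oList(0))
    End Sub

    ' Extra parentheses turn the argument into a value: no copy-back.
    Public Sub PassParenthesized(oList As ArrayList)
        Touch((oList(0)))
    End Sub

    ' A local is a real variable, so it is passed as an alias.
    Public Sub PassLocal(oList As ArrayList)
        Dim tmp As Object = oList(0)
        Touch(tmp)
    End Sub

    ' Enumerates a one-element list, calling pass on each iteration.
    ' Returns False if the enumerator reports a modification.
    Public Function Completes(pass As Action(Of ArrayList)) As Boolean
        Dim oList As New ArrayList()
        oList.Add(New Object())
        Try
            For Each o As Object In oList
                pass(oList)
            Next
            Return True
        Catch ex As InvalidOperationException
            Return False
        End Try
    End Function

    Sub Main()
        Console.WriteLine("direct:        " & Completes(AddressOf PassDirect))
        Console.WriteLine("parenthesized: " & Completes(AddressOf PassParenthesized))
        Console.WriteLine("local copy:    " & Completes(AddressOf PassLocal))
    End Sub

End Module
```

Only the direct variant is expected to fail. The other two are possible workarounds when subB's signature cannot be changed to ByVal.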

What happens when the method returns?

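If the new object is not compatible with the type of the original variable, you get an exception after the method completes. The specification says this about an example of its own (not reproduced here), in which a variable d of type Derived is passed to a method F: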

When returning from F (previous example), the value in the temporary variable is cast back to the type of the variable, Derived, and assigned to d. Since the value being passed back cannot be cast to Derived, an exception is thrown at run time.
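
The names F, Derived and d come from the quoted passage. The rest of the sketch below (the class Base, the body of F, CastBackDemo and CopyBackFails) is an assumption about what such an example could look like, not the specification's own listing. It is written with Option Strict Off, on the assumption that Option Strict On rejects the implicit narrowing conversion needed for the copy-back.

```vb
Option Strict Off

Imports System

Module CastBackDemo

    Public Class Base
    End Class

    Public Class Derived
        Inherits Base
    End Class

    ' Replaces the caller's value with a plain Base instance.
    Public Sub F(ByRef b As Base)
        b = New Base()
    End Sub

    ' Returns True if the copy-back after F fails with InvalidCastException.
    Public Function CopyBackFails() As Boolean
        Dim d As Derived = New Derived()
        Try
            ' d is a Derived variable but the parameter is ByRef Base, so the
            ' compiler passes a temporary and casts it back to Derived afterwards.
            F(d)
            Return False
        Catch ex As InvalidCastException
            Return True
        End Try
    End Function

    Sub Main()
        Console.WriteLine("InvalidCastException on copy-back: " & CopyBackFails())
    End Sub

End Module
```

F itself runs to completion. The failure is raised in the caller, at the point where the temporary is cast back to Derived.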


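For clarity, C# will not allow this at all, as you can see in this fiddle: a ref argument must be a variable of exactly the parameter's type, so passing the result of a property or indexer by ref is a compile-time error. This is a VB-specific problem, because VB allows ByRef arguments to go through copy-in copy-back conversions.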
