Why is my System.Linq.Expressions code-gen slow at runtime and how can I make it faster?

Time:09-24

I am trying to implement a function that returns functions that calculate the scalar product of two vectors. It should be generic, but that seems possible only by generating code at run time. I read several docs about code generation with expression trees, and this is what I have written so far:

public static Func<T[], T[], T> GetVectorMultiplyFunction<T>()
    where T : struct
{
    ParameterExpression first  = Expression.Parameter(typeof(T[]), "first" );
    ParameterExpression second = Expression.Parameter(typeof(T[]), "second");
    ParameterExpression result = Expression.Parameter(typeof(T)  , "result");
    ParameterExpression index  = Expression.Parameter(typeof(int), "index" );

    LabelTarget label = Expression.Label(typeof(T));

    BlockExpression block = Expression.Block(
        new[] { result, index },
        Expression.Assign( result, Expression.Constant(0) ),
        Expression.Assign( index , Expression.Constant(0) ),

        Expression.Loop(
            Expression.IfThenElse(
                Expression.LessThan( index, Expression.ArrayLength( first ) ),
                Expression.Block(
                    Expression.AddAssign( result, Expression.Multiply( Expression.ArrayIndex( first, index ), Expression.ArrayIndex( second, index ) ) ),
                    Expression.Increment( index )
                ),
                Expression.Break( label, result )
            ),
            label
        )
    );

    return Expression
        .Lambda<Func<T[], T[], T>>( block, first, second )
        .Compile();
} 

This builds without problems, but the tests take forever to run. I am having a hard time wrapping my head around the subject, so I don't know what exactly went wrong.

Here is one of the tests that uses this method:

[Test]
public void GetVectorMultiplyFunctionReturnsFunctionForLong()
{
    var first = new long[] { 1L, 2L, 3L };
    var second = new long[] { 2L, 2L, 2L };
    var expected = 1L * 2L + 2L * 2L + 3L * 2L;
    var func = CodeGeneration.GetVectorMultiplyFunction<long>();
    var actual = func(first, second);
    Assert.AreEqual(expected, actual);
}

CodePudding user response:

After some debugging in LINQPad: the problem isn't that your dynamic method is "slow" (it isn't); the problem is that the method contains an infinite loop that never exits.


From what I can tell, your GetVectorMultiplyFunction method is meant to do something like this, if it were written in C# directly:

static T VectorMultiply<T>( T[] first, T[] second )
    where T : struct
{
    T     result = default(T);
    Int32 index  = 0;
    
    while( true )
    {
        if( index < first.Length )
        {
            result += first[index] * second[index]; // illustrative only: C# has no * or += for an unconstrained T
            index++;
        }
        else
        {
            return result;
        }
    }
}

...however, there are a few major bugs in your code:

  • ParameterExpression result = Expression.Parameter(typeof(T)  , "result");
    ParameterExpression index  = Expression.Parameter(typeof(int), "index" );
    
    • These two lines should use Expression.Variable, not Expression.Parameter as result and index are not method parameters, but method locals.
  • Expression.Assign( result, Expression.Constant(0) )
    
    • This doesn't work because result is typed as T, but Expression.Constant(0) is typed as Int32 (because the 0 literal is an Int32 literal).
    • Change it to use default(T), like so:
      Expression.Assign( result, Expression.Constant( default(T) ) ),
      
  • LabelTarget label = Expression.Label(typeof(T));
    
    • Change the above to this:
      LabelTarget breakLabel = Expression.Label("break");
      
  • Here's the main bug and the cause of the infinite loop:

    Expression.Increment( index )
    

    The above does increment index, but it doesn't reassign the incremented value back to the index local, so it's the same thing as doing this in C#:

    while( true ) {
        if( index < first.Length )
        {
            result += first[index] * second[index];
            index + 1;       // <-- *DANGER, WILL ROBINSON!*
        }
        else
        {
            break;
        }
    }
    
    • See the problem? Doing index + 1 never actually increases index, so index < first.Length is always true and the while loop never stops.
    • The fix is to change it to index += 1 (or index++ or ++index), like so:
      Expression.PostIncrementAssign( index )
      
  • The last issue is that your outermost Expression.Block's last expression should be the result local, which is equivalent to doing return result; in C#.

    • So immediately after the Expression.Loop() call inside your Expression.Block( variables, expressions ) call-site, just add result as another argument.
    • I'll confess that I still don't yet fully understand how the breakLabel = Expression.Label("break"); expression works or what it even does.
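
To make the Increment point concrete, here is a minimal sketch that runs a single step against a local initialised to 0 and returns the local afterwards. The IncrementDemo class and its RunStep helper are hypothetical names made up for this illustration; only the System.Linq.Expressions calls are real.

```csharp
using System;
using System.Linq.Expressions;

public static class IncrementDemo
{
    // Hypothetical helper: builds and runs { int i = 0; <step>; i } and returns i.
    public static int RunStep(Func<ParameterExpression, Expression> step)
    {
        ParameterExpression i = Expression.Variable(typeof(int), "i");

        BlockExpression body = Expression.Block(
            new[] { i },
            Expression.Assign(i, Expression.Constant(0)),
            step(i), // the expression under test; its own value is discarded
            i        // the block evaluates to the local's final value
        );

        return Expression.Lambda<Func<int>>(body).Compile()();
    }

    public static void Main()
    {
        // Increment only computes i + 1; nothing is written back to i.
        Console.WriteLine(RunStep(i => Expression.Increment(i)));           // prints 0

        // PostIncrementAssign stores i + 1 back into i.
        Console.WriteLine(RunStep(i => Expression.PostIncrementAssign(i))); // prints 1
    }
}
```

Inside a loop, the first form leaves the loop condition unchanged on every pass, which is why the generated function never returns.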

This code works for me in .NET 6:

public static Func<T[], T[], T> GetVectorMultiplyFunction<T>()
    where T : struct
{
    var writeLineMethod = typeof(Console).GetMethod( nameof(Console.WriteLine), new[] { typeof(String), typeof(Object) })!; // print-style debugging ugh // `public static void WriteLine(string format, object? arg0)`
    
    ParameterExpression first  = Expression.Parameter( type: typeof(T[])  , name: "first"  );
    ParameterExpression second = Expression.Parameter( type: typeof(T[])  , name: "second" );
    ParameterExpression result = Expression.Variable ( type: typeof(T)    , name: "result" );
    ParameterExpression index  = Expression.Variable ( type: typeof(Int32), name: "index"  );
    
    LabelTarget breakLabel = Expression.Label("break");
    
    BlockExpression block = Expression.Block(
        variables  : new[] { result, index },
        expressions: new Expression[]
        {
            Expression.Assign( result, Expression.Constant( default(T) ) ),
            Expression.Assign( index , Expression.Constant(          0 ) ),
            
            Expression.Loop(
                body: Expression.Block(
                    Expression.IfThenElse(
                        test  : Expression.LessThan( index, Expression.ArrayLength( first ) ),
                        ifTrue: Expression.Block(
                            Expression.AddAssign( result, Expression.Multiply( Expression.ArrayIndex( first, index ), Expression.ArrayIndex( second, index) ) ),
                            Expression.PostIncrementAssign( index ),
                            
                            Expression.Call( writeLineMethod, Expression.Constant( "result: {0}" ), Expression.Convert( result, typeof(Object) ) ),
                            Expression.Call( writeLineMethod, Expression.Constant( "index : {0}" ), Expression.Convert( index , typeof(Object) ) )
                        ),
                        ifFalse: Expression.Break( breakLabel )
                    )
                ),
                @break: breakLabel
            ),
            result
        }
    );
    
    Func<T[],T[],T> f = Expression
        .Lambda< Func<T[],T[],T> >( block, first, second )
        .Compile();
    
    return f;
}

Here's a screenshot of the func returning the correct expected result, as well as the Console.WriteLine output with the logged values of result and index. The method runs instantly (and the Expression.Lambda<>(...).Compile() call only took 0.5ms on my machine too):
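
As a variation on the code above, here is a sketch without the debug printing. It rests on two assumptions that go beyond this answer: it keeps the typed label from the question, so that Expression.Break( label, result ) hands the value to the loop and the loop itself becomes the block's result, and it uses Expression.Default( typeof(T) ) instead of a constant for the initial value. The DotProductSketch and Build names are made up for this example.

```csharp
using System;
using System.Linq.Expressions;

public static class DotProductSketch
{
    public static Func<T[], T[], T> Build<T>()
        where T : struct
    {
        ParameterExpression first  = Expression.Parameter( typeof(T[]), "first"  );
        ParameterExpression second = Expression.Parameter( typeof(T[]), "second" );
        ParameterExpression result = Expression.Variable ( typeof(T)  , "result" );
        ParameterExpression index  = Expression.Variable ( typeof(int), "index"  );

        // A typed label lets Break carry a value; that value becomes the value of the Loop.
        LabelTarget done = Expression.Label( typeof(T), "done" );

        BlockExpression block = Expression.Block(
            new[] { result, index },
            Expression.Assign( result, Expression.Default( typeof(T) ) ),
            Expression.Assign( index , Expression.Constant( 0 ) ),
            Expression.Loop(
                Expression.IfThenElse(
                    Expression.LessThan( index, Expression.ArrayLength( first ) ),
                    Expression.Block(
                        Expression.AddAssign( result, Expression.Multiply(
                            Expression.ArrayIndex( first , index ),
                            Expression.ArrayIndex( second, index ) ) ),
                        Expression.PostIncrementAssign( index )
                    ),
                    Expression.Break( done, result ) // leave the loop, yielding result
                ),
                done
            )
        );

        return Expression
            .Lambda<Func<T[], T[], T>>( block, first, second )
            .Compile();
    }

    public static void Main()
    {
        Func<long[], long[], long> dot = Build<long>();
        Console.WriteLine( dot( new long[] { 1, 2, 3 }, new long[] { 2, 2, 2 } ) ); // prints 12
    }
}
```

This only works for element types that Expression.Multiply and Expression.AddAssign can handle; for other structs the tree construction fails before anything is compiled. Compiling once and caching the returned delegate keeps the Compile() cost out of any hot path.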
