Home > Enterprise >  Execute custom SQL before running FromSqlRaw in Entity Framework Core 6 or above in SQL Server
Execute custom SQL before running FromSqlRaw in Entity Framework Core 6 or above in SQL Server

Time:08-03

I only need it to work for SQL Server. This is an example. The question is about a general approach.

There is a nice extension method from https://entityframework-extensions.net called WhereBulkContains. It is, sort of, great, except that the code of the methods in this library is obfuscated and they do not produce valid SQL when .ToQueryString() is called on IQueryable<T> with these extension methods applied.

Subsequently, I can't use such methods in production code as I am not "allowed" to trust such code due to business reasons. Sure, I can write tons of tests to ensure that WhereBulkContains works as expected, except that there are some complicated cases where the performance of WhereBulkContains is well below stellar, whereas properly written SQL works in a blink of an eye. And (read above), since the code of this library is obfuscated, there is no way to figure out what's wrong there without spending a large amount of time. We would've bought the license (as this is not a freeware) if the library weren't obfuscated. All together that basically kills the library for our purposes.

This is where it gets interesting. I can easily create and populate a temporary table, e.g. (I have a table called EFAgents with an int PK called AgentId in the database):

private string GetTmpAgentSql(IEnumerable<int> agentIds) => @$"
drop table if exists #tmp_Agents;
create table #tmp_Agents (AgentId int not null, primary key clustered (AgentId asc));
{(agentIds
    .Chunk(1_000)
    .Select(e => $@"
insert into #tmp_Agents (AgentId)
values
({e.JoinStrings("), (")});
")
    .JoinStrings(""))}

select 0 as Result
";

private const string AgentSql = @"
select a.* from EFAgents a inner join #tmp_Agents t on a.AgentID = t.AgentId";    

where GetContext returns EF Core database context and JoinStrings comes from Unity.Interception.Utilities and then use it as follows:

private async Task<List<EFAgent>> GetAgents(List<int> agentIds)
{
    var tmpSql = GetTmpAgentSql(agentIds);

    using var ctx = GetContext();

    // This creates a temporary table and populates it with the ids.
    // This is a proprietary port of EF SqlQuery code, but I can post the whole thing if necessary. 
    var _ = await ctx.GetDatabase().SqlQuery<int>(tmpSql).FirstOrDefaultAsync();

    // There is a DbSet<EFAgent> called Agents.
    var query = ctx.Agents
        .FromSqlRaw(AgentSql)
        .Join(ctx.Agents, t => t.AgentId, a => a.AgentId, (t, a) => a);

    var sql = query.ToQueryString()   Environment.NewLine;

    // This should provide a valid SQL; https://entityframework-extensions.net does NOT!
    // WriteLine - writes to console or as requested. This is irrelevant to the question. 
    WriteLine(sql);

    var result = await query.ToListAsync();
    return result;
}

So, basically, I can do what I need in two steps:

using var ctx = GetContext();

// 1. Create a temp table and populate it - call GetTmpAgentSql.
...

// 2. Build the join starting from `FromSqlRaw` as in example above.

This is doable, half-manual, and it is going to work.

The question is how to do that in one step, e.g., call:

.WhereMyBulkContains(aListOfIdConstraints, whateverElseIsneeded, ...)

and that's all.

I am fine if I need to pass more than one parameter in each case in order to specify the constraints.

To clarify the reasons why do I need to go into all these troubles. We have to interact with a third party database. We don't have any control of the schema and data there. The database is large and poorly designed. That resulted in some ugly EFC LINQ queries. To remedy that, some of that ugliness was encapsulated into a method, which takes IQueryable<T> (and some more parameters) and returns IQueryable<T>. Under the hood this method calls WhereBulkContains. I need to replace this WhereBulkContains by, call it, WhereMyBulkContains, which would be able to provide correct ToQueryString representation (for debugging purposes) and be performant. The latter means that SQL should not contain in clause with hundreds (and even sometimes thousands) of elements. Using inner join with a [temp] table with a PK and having an index on the FK field seem to do the trick if I do that in pure SQL. But, ... I need to do that in C# and effectively in between two LINQ method calls. Refactoring everything is also not an option because that method is used in many places.

Thanks a lot!

CodePudding user response:

I think you really want to use a Table Valued Parameter.

Creating an SqlParameter from an enumeration is a little fiddly, but not too difficult to get right;

CREATE TYPE [IntValue] AS TABLE (
    Id int NULL
)
private IEnumerable<SqlDataRecord> FromValues(IEnumerable<int> values)
{
    var meta = new SqlMetaData(
        "Id",
        SqlDbType.Int
    );
    foreach(var value in values)
    {
        var record  = new SqlDataRecord(
            meta
        );
        record.SetInt32(0, value);
        yield return record;
    }
}
public SqlParameter ToIntTVP(IEnumerable<int> values){
    return new SqlParameter()
    {
        TypeName = "IntValue",
        SqlDbType = SqlDbType.Structured,
        Value = FromValues(values)
    };
}

Personally I would define a query type in EF Core to represent the TVP. Then you can use raw sql to return an IQueryable.

public class IntValue
{
    public int Id { get; set; }
}
modelBuilder.Entity<IntValue>(e =>
{
    e.HasNoKey();
    e.ToView("IntValue");
});

IQueryable<IntValue> ToIntQueryable(DbContext ctx, IEnumerable<int> values)
{
    return ctx.Set<IntValue>()
        .FromSqlInterpolated($"select * from {ToIntTVP(values)}");
}

Now you can compose the rest of your query using Linq.

var ids = ToIntQueryable(ctx, agentIds);
var query = ctx.Agents
    .Where(a => ids.Any(i => i.Id == a.Id));

CodePudding user response:

I would propose to use linq2db.EntityFrameworkCore (note that I'm one of the creators). It has built-in temporary tables support.

We can create simple and reusable function which filters records of any type:

public static class HelperMethods
{
    private class KeyHolder<T>
    {
        [PrimaryKey]
        public T Key { get; set; } = default!;
    }

    public static async Task<List<TEntity>> GetRecordsByIds<TEntity, TKey>(this IQueryable<TEntity> query, IEnumerable<TKey> ids, Expression<Func<TEntity, TKey>> keyFunc)
    {
        var ctx = LinqToDBForEFTools.GetCurrentContext(query) ??
                        throw new InvalidOperationException("Query should be EF Core query");

        // based on DbContext options, extension retrieves connection information
        using var db = ctx.CreateLinqToDbConnection();

        // create temporary table and BulkCopy records into that table
        using var tempTable = await db.CreateTempTableAsync(ids.Select(id => new KeyHolder<TKey> { Key = id }), tableName: "temporaryIds");

        var resultQuery = query.Join(tempTable, keyFunc, t => t.Key, (q, t) => q);

        // we use ToListAsyncLinqToDB to avoid collission with EF Core async methods.
        return await resultQuery.ToListAsyncLinqToDB();
    }
}

Then we can rewrite your function GetAgents to the following:

private async Task<List<EFAgent>> GetAgents(List<int> agentIds)
{
    using var ctx = GetContext();

    var result = await ctx.Agents.GetRecordsByIds(agentIds, a => a.AgentId);

    return result;
}
  • Related