Home > database >  Entity Framework Core: performance issue
Entity Framework Core: performance issue

Time:08-08

I have some problems with EF Core. Every time when I write some linq in C# for getting data from the database, it adds a useless select * from statement. I can't figure out why it does this.

The raw SQL query works pretty quickly - 100ms vs 300ms using linq

This is the method in C#**:

return (from pr in _db.ex_DocumentExt1_PR
        from doc in _db.ex_Document.Where(doc => doc.DOCID == pr.DOCID).DefaultIfEmpty()
        from docAc in _db.ex_DOCAction.Where(docAc => docAc.DOCID == pr.DOCID).DefaultIfEmpty()
        from st in _db.ex_Status.Where(st => st.STATUS_ID == doc.DOC_STATUS).DefaultIfEmpty()
        from dep in _db.SSO_Entities.Where(dep => dep.Type == SSO_EntityTypes.COMPANY_STRUCTURE && dep.EntityCode == pr.RequestedForDepartamentId.ToString()).DefaultIfEmpty()
        where docAc.ISPERFORMED == 1
              && docAc.ACTOR_ID == uid
              && doc.DOC_NUMBER != "YENI"
              && doc.DOC_NUMBER.Contains(searchText)
        group new { doc, st, dep, docAc } 
           by new { doc.DOCID, doc.DOC_NUMBER, st.SHORT_NAME, dep.DisplayName, docAc.ACTION_PERFORMED } into g1
        orderby g1.Key.ACTION_PERFORMED descending
        select new LastActiveDocumentViewModel
                   {
                        DocId = g1.Key.DOCID,
                        DocNumber = g1.Key.DOC_NUMBER,
                        DocStatus = g1.Key.SHORT_NAME,
                        DocType = DocumentType.PR.ToString(),
                        Supplier = g1.Key.DisplayName,
                        Date = g1.Max(g => g.docAc.ACTION_PERFORMED)
                   });

This is SQL query generated by EF Core:

SELECT TOP (50) 
[Project1].[C2] AS [C1], 
[Project1].[DOCID] AS [DOCID], 
[Project1].[DOC_NUMBER] AS [DOC_NUMBER], 
[Project1].[SHORT_NAME] AS [SHORT_NAME], 
[Project1].[C3] AS [C2], 
[Project1].[DisplayName] AS [DisplayName], 
[Project1].[C1] AS [C3]
FROM ( SELECT 
    [GroupBy1].[A1] AS [C1], 
    [GroupBy1].[K1] AS [DOCID], 
    [GroupBy1].[K2] AS [DOC_NUMBER], 
    [GroupBy1].[K3] AS [ACTION_PERFORMED], 
    [GroupBy1].[K4] AS [SHORT_NAME], 
    [GroupBy1].[K5] AS [DisplayName], 
    1 AS [C2], 
    N'PR' AS [C3]
    FROM ( SELECT 
        [Filter1].[DOCID1] AS [K1], 
        [Filter1].[DOC_NUMBER] AS [K2], 
        [Filter1].[ACTION_PERFORMED] AS [K3], 
        [Filter1].[SHORT_NAME] AS [K4], 
        [Extent5].[DisplayName] AS [K5], 
        MAX([Filter1].[ACTION_PERFORMED]) AS [A1]
        FROM   (SELECT [Extent1].[RequestedForDepartamentId] AS [RequestedForDepartamentId], [Extent2].[DOCID] AS [DOCID1], [Extent2].[DOC_NUMBER] AS [DOC_NUMBER], [Extent3].[ACTOR_ID] AS [ACTOR_ID], [Extent3].[ACTION_PERFORMED] AS [ACTION_PERFORMED], [Extent4].[SHORT_NAME] AS [SHORT_NAME]
            FROM    [dbo].[ex_DocumentExt1_PR] AS [Extent1]
            LEFT OUTER JOIN [dbo].[ex_Document] AS [Extent2] ON [Extent2].[DOCID] = [Extent1].[DOCID]
            INNER JOIN [dbo].[ex_DOCAction] AS [Extent3] ON [Extent3].[DOCID] =  CAST( [Extent1].[DOCID] AS bigint)
            LEFT OUTER JOIN [dbo].[ex_Status] AS [Extent4] ON [Extent4].[STATUS_ID] = [Extent2].[DOC_STATUS]
            WHERE ( NOT (('YENI' = [Extent2].[DOC_NUMBER]) AND ([Extent2].[DOC_NUMBER] IS NOT NULL))) AND (1 = [Extent3].[ISPERFORMED]) ) AS [Filter1]
        LEFT OUTER JOIN [dbo].[SSO_Entities] AS [Extent5] ON ('COMPANY_STRUCTURE' = [Extent5].[Type]) AND (([Extent5].[EntityCode] = (CASE WHEN ([Filter1].[RequestedForDepartamentId] IS NULL) THEN N'' ELSE  CAST( [Filter1].[RequestedForDepartamentId] AS nvarchar(max)) END)) OR (([Extent5].[EntityCode] IS NULL) AND (CASE WHEN ([Filter1].[RequestedForDepartamentId] IS NULL) THEN N'' ELSE  CAST( [Filter1].[RequestedForDepartamentId] AS nvarchar(max)) END IS NULL)))
        WHERE ([Filter1].[ACTOR_ID] = 1018) AND ([Filter1].[DOC_NUMBER] LIKE '%%' ESCAPE '~')
        GROUP BY [Filter1].[DOCID1], [Filter1].[DOC_NUMBER], [Filter1].[ACTION_PERFORMED], [Filter1].[SHORT_NAME], [Extent5].[DisplayName]
    )  AS [GroupBy1]
)  AS [Project1]
ORDER BY [Project1].[ACTION_PERFORMED] DESC

This is the raw SQL query I wrote that does the same thing as the Linq query:

SELECT TOP(50)
    doc.DOCID,
    doc.DOC_NUMBER,
    'PR',
    st.SHORT_NAME,
    dep.DisplayName,
    MAX(docAc.ACTION_PERFORMED)
FROM ex_DocumentExt1_PR pr
LEFT JOIN ex_Document doc ON doc.DOCID = pr.DOCID
LEFT JOIN ex_DOCAction docAc ON docAc.DOCID = doc.DOCID
LEFT JOIN ex_Status st ON st.STATUS_ID = doc.DOC_STATUS
LEFT JOIN SSO_Entities dep ON dep.Type = 'COMPANY_STRUCTURE' AND dep.EntityCode = pr.RequestedForDepartamentId
WHERE docAc.ISPERFORMED = 1
  AND docAc.ACTOR_ID = 1018 
  AND doc.DOC_NUMBER != 'Yeni'
GROUP BY doc.DOCID, doc.DOC_NUMBER, st.SHORT_NAME, dep.DisplayName
ORDER BY MAX(docAc.ACTION_PERFORMED) DESC

CodePudding user response:

The queries don't look identical. For example, your query groups by 4 columns, whilst EF query groups by 5 - it has [Filter1].[ACTION_PERFORMED] in its group by clause, in addition to the other four. Depending on your testing data sample, they might behave similarly, but generally the results will differ.

As @allmhuran has noted in the comments, EF has a tendency to generate inefficient queries, especially when more than 2 tables are involved. Personally, when I find myself in such a situation, I create a database view, put the query there, add the view to the DbContext and select directly from it. In extreme scenarios, that might even be a stored procedure. But that's me, I know SQL much better than C#, always use Database First approach and have my database in an accompanying SSDT project.

If you use EF Code First and employ EF Migrations, adding a view might be a bit of a problem, but it should be possible. This question might be a good start.

CodePudding user response:

EF is not intended to be a wrapper for SQL. I don't see any "SELECT *" in the generated SQL, though what you will encounter is a range of inner SELECT statements that EF builds to allow it to join tables that normally don't have established references to one another. This is a necessary evil for EF to be able to query across data based on how you want to relate them.

EF's strength is simplifying data access when working with properly normalized data structures where those relationships can be resolved either through convention or configuration. I don't agree that EF doesn't handle multiple tables "well", it can handle them quite quickly provided they are properly related and indexed. The reality though is that many data systems out there in the wild do not follow proper normalization and you end up with the need to query across loosely related data. EF can do it, but it won't be the most efficient at it.

If this is a new project / database whether leveraging Code First or Schema First, my recommendation would be to establish properly nomalized relationships with FKs and indexes/constraints between the tables.

If this is an existing database where you don't have the option to modify the schema then I would recommend employing a View to bind a desired entity model from where you can employ a more directly optimized SQL expression to get the data you want. This would be a distinct set of entities as opposed to the per-table entities that you would use to update data. The goal being larger, open-ended read operations with loose relationships leading to expensive queries can be optimized down, where update operations which should be "touching" far fewer records at a time can be managed via the table-based entities.

  • Related