I have 2 related tables in my SQL Server Database : Transaction and TransactionHistory. It's a one-to-many relation. They are, basically, structured like this :
Transaction
TransactionID
Tries
(other fields)
TransactionHistory
TransactionHistoryID
TransactionID (a foreign key)
CreationDate
Status
ErrorType
(other fields)
In my EF Core object model, I have the corresponding classes :
public class Transaction
{
public int TransactionID { get; set; }
public int Tries {get; set; }
public ICollection<TransactionHistory> TransactionHistories{ get; set; }
/// Other fields
}
public class TransactionHistory
{
public int TransactionHistoryID { get; set; }
public int TransactionID { get; set; }
public DateTime CreationDate { get; set; }
public string Status{ get; set; }
public string ErrorType { get; set; }
public Transaction Transaction { get; set; }
/// Other fields
}
What I need is to fetch records from the Transaction table with theirs most recents TransactionHistory rows, and filter the transactions records according to values from this child row. For example, I have a LINQ query to fetch the transactions who are either in a "ready" state, or in a "error" state that can be tried again, which look like this :
var retriableErrorTypes = new List<string>(){"CONNECTION", "TECHNICAL"};
var MaxTries = 3;
DbContext.Transaction
.Include(t => t.TransactionHistories.OrderByDescending(h => h.CreationDate).Take(1))
.Where(t => t.TransactionHistories.OrderByDescending(h => h.CreationDate).First().Status == "READY"
|| (t.TransactionHistories.OrderByDescending(h => h.CreationDate).First().Status == "ERROR"
&& retriableErrorTypes.Contains(t.TransactionHistories.OrderByDescending(h => h.CreationDate).First().ErrorType)
&& t.Tries < MaxTries));
The results seems correct at first glance, although I have yet to fully test this. However, it's not great having to constantly repeat the subquery to get the most recent child row. In fact, this repetition end up in the generated SQL Query (according to LINQPad) :
SELECT [t].[TransactionID], [t].[Tries], [t2].[TransactionHistoryID], [t2].[CreationDate], [t2].[Status], [t2].[ErrorType], [t2].[TransactionID]
FROM [dbo].[Transaction] AS [t]
LEFT JOIN (
SELECT [t1].[TransactionHistoryID], [t1].[CreationDate], [t1].[Status], [t1].[ErrorType], [t1].[TransactionID]
FROM (
SELECT [t0].[TransactionHistoryID], [t0].[CreationDate], [t0].[Status], [t0].[ErrorType], [t0].[TransactionID], ROW_NUMBER() OVER(PARTITION BY [t0].[TransactionID] ORDER BY [t0].[CreationDate] DESC) AS [row]
FROM [dbo].[TransactionHistory] AS [t0]
) AS [t1]
WHERE [t1].[row] <= 1
) AS [t2] ON [t].[TransactionID] = [t2].[TransactionID]
WHERE ((
SELECT TOP(1) [t3].[Status]
FROM [dbo].[TransactionHistory] AS [t3]
WHERE [t].[TransactionID] = [t3].[TransactionID]
ORDER BY [t3].[CreationDate] DESC) = N'READY') OR ((((
SELECT TOP(1) [t4].[Status]
FROM [dbo].[TransactionHistory] AS [t4]
WHERE [t].[TransactionID] = [t4].[TransactionID]
ORDER BY [t4].[CreationDate] DESC) = N'ERROR') AND (
SELECT TOP(1) [t5].[ErrorType]
FROM [dbo].[TransactionHistory] AS [t5]
WHERE [t].[TransactionID] = [t5].[TransactionID]
ORDER BY [t5].[CreationDate] DESC) IN (N'CONNECTION', N'TECHNICAL')) AND ([t].[Tries] < @__MaxTries_1))
ORDER BY [t].[TransactionID], [t2].[TransactionID], [t2].[CreationDate] DESC, [t2].[TransactionHistoryID]
This isn't efficient at all. Those three subqueries in the WHERE clause and the one in the FROM clause all end up fetching the same rows. I would like to find a way to write the LINQ query to minimize this repetition as much as possible, but so far I got nothing better.
I tried using projection, with something like this :
DbContext.Transaction.Select(t => new {t, TransactionHistory = t.TransactionHistories.OrderByDescending(h => h.CreationDate).First()})
.Where(t => t.TransactionHistory.Status == "READY"
|| (t.TransactionHistory.Status == "ERROR"
&& retriableErrorTypes.Contains(t.TransactionHistory.ErrorType)
&& t.t.Tries < MaxTries));
However, while the LINQ query is more readable and doesn't repeat itself, it has no effect on the generated SQL query, who is pretty much identical (according to LINQPad, anyway). Also, it return the records in an anonymous type instead of the Transaction type, which is not what I want, but I could simply reconvert the results, so that's not really an issue.
My question is this : how can I efficiently fetch a specific child row and filter the parent records based on this child row, in a LINQ Query?
Note : As specified in the tags, this is for a .NET 5 application with Entity Framework Core, although it will very likely be upgraded to .NET 6. Also, this is a new system, and a new database, so it's possible to adjust the database structure if required.
CodePudding user response:
Try the following query, it should be effective:
var retriableErrorTypes = new List<string>(){"CONNECTION", "TECHNICAL"};
var MaxTries = 3;
var query =
from t in DbContext.Transaction
from h in t.TransactionHistories.OrderByDescending(h => h.CreationDate).Take(1)
where h.Status == "READY"
|| (h.Status == "ERROR" && retriableErrorTypes.Contains(h.ErrorType) && t.Tries < MaxTries)
select new
{
Transaction = t,
LastHistory = h
};