Linq / Entity Framework select latest recorded for a group

Time:03-29

I am trying to figure out how to write a LINQ / Entity Framework query to return the latest data available for each symbol in a table.

My database table looks like this:

ID    symbol    price_date    price
------------------------------------
1     AAPL      2022-02-28    174.50
2     MSFT      2022-02-28    307.20
3     AAPL      2021-03-01    172.23
4     MSFT      2021-03-01    304.15

Not every symbol has a record for every day, though. The ID key is sequential, so it is safe to rely on: the highest ID for a given symbol always holds the latest data.

If I were writing a SQL query, the following would return what I'm looking for:

select prices.*
from prices 
where id in (select max(id) from prices group by symbol)

In LINQ, I'm having trouble expressing this as a single query. So far I have split it into two queries:

var maxIds = from pp in ctx.Prices
             group pp by pp.Symbol
                 into maxIdBySymbol
             select maxIdBySymbol.Max(pp => pp.Id);

var latestPrices = ctx.Prices.Where(it => maxIds.Contains(it.Id)).ToList();

Is there a way to make this a single query in LINQ?

Thanks

CodePudding user response:

You can combine Where with a negated Any, so that a row is kept only if no other row with the same symbol has a higher Id:

ctx.Prices.Where(prices1 => !ctx.Prices.Any(prices2 => prices2.Id > prices1.Id && prices2.Symbol == prices1.Symbol))
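The logic of this pattern can be checked without a database by running the same Where / Any combination with LINQ to Objects. The following is a minimal sketch, not a definitive implementation: the Price record, the LatestPriceDemo class and its LatestById and SampleRows helpers are hypothetical names introduced for illustration, and the rows are the sample rows from the question. Against Entity Framework the same predicate would be applied to ctx.Prices; how the provider translates it to SQL depends on the EF version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shape, mirroring the table in the question.
public record Price(int Id, string Symbol, DateTime PriceDate, decimal Value);

public static class LatestPriceDemo
{
    // Keep each row for which no other row of the same symbol has a higher Id.
    public static List<Price> LatestById(IEnumerable<Price> prices)
    {
        var all = prices.ToList();
        return all
            .Where(p1 => !all.Any(p2 => p2.Symbol == p1.Symbol && p2.Id > p1.Id))
            .OrderBy(p => p.Symbol)
            .ToList();
    }

    // The sample rows from the question.
    public static List<Price> SampleRows() => new List<Price>
    {
        new Price(1, "AAPL", new DateTime(2022, 2, 28), 174.50m),
        new Price(2, "MSFT", new DateTime(2022, 2, 28), 307.20m),
        new Price(3, "AAPL", new DateTime(2021, 3, 1), 172.23m),
        new Price(4, "MSFT", new DateTime(2021, 3, 1), 304.15m),
    };
}
```

Calling LatestPriceDemo.LatestById(LatestPriceDemo.SampleRows()) returns the rows with Id 3 (AAPL) and Id 4 (MSFT), the highest Id for each symbol.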

CodePudding user response:

"I am trying ... query ... the latest data available for each symbol in a table."

So you will first make groups of records, where every group contains only the records for one specific symbol. You will have one group that contains the records for symbol AAPL, one group that contains the records for symbol MSFT, etc.

So, once you've got the groups, you select one element from each group. According to your requirement you select the newest element, which is the element with the highest value for PriceDate. As you said, you could also take the element with the highest value for property ID. Personally I wouldn't do that: if at some point in the future your IDs are no longer in ascending date order, for instance because you add a feature to edit PriceDate after an input error, the highest ID will no longer identify the newest record.

For this, I would use the overload of Queryable.GroupBy that has a parameter resultSelector. Use the resultSelector to select the one element of each group that you want.

var newestRecordPerSymbol = dbContext.PriceRecords

    // make groups of priceRecords with the same value for property Symbol
    .GroupBy(priceRecord => priceRecord.Symbol,

        // parameter resultSelector: for every symbol and all priceRecords
        // that have this symbol, take the newest one
        // = order by descending PriceDate and take the first one
        (symbol, priceRecordsWithThisSymbol) => priceRecordsWithThisSymbol
            .OrderByDescending(priceRecord => priceRecord.PriceDate)
            .FirstOrDefault());

In words: from the table of PriceRecords, make groups of PriceRecords that have the same value for property Symbol. For every Symbol, together with the PriceRecords that have this symbol, order those PriceRecords by descending PriceDate and keep only the first one.

Every group has at least one element, so First would work as well as FirstOrDefault. However, some versions of Entity Framework, or some database providers, have problems using First inside a query. If you run into this, use FirstOrDefault.
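To see the resultSelector overload at work without a database, the same grouping can be run with LINQ to Objects. This is only a sketch under stated assumptions: the PriceRecord record, the NewestPerSymbolDemo class and its NewestByDate and SampleRows helpers are hypothetical, and the rows are invented for illustration. Row 5 has the highest ID but an older PriceDate, which is the situation in which ordering by ID and ordering by PriceDate give different answers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shape; property names follow the answer above.
public record PriceRecord(int Id, string Symbol, DateTime PriceDate, decimal Price);

public static class NewestPerSymbolDemo
{
    // GroupBy with a resultSelector: one output element per symbol,
    // namely the record with the most recent PriceDate in that group.
    public static List<PriceRecord> NewestByDate(IEnumerable<PriceRecord> records) =>
        records
            .GroupBy(
                r => r.Symbol,
                // In memory a group is never empty, so First is safe here.
                (symbol, group) => group.OrderByDescending(r => r.PriceDate).First())
            .OrderBy(r => r.Symbol)
            .ToList();

    // Invented rows: Id 5 was inserted last but carries an older date.
    public static List<PriceRecord> SampleRows() => new List<PriceRecord>
    {
        new PriceRecord(1, "AAPL", new DateTime(2022, 2, 28), 100.00m),
        new PriceRecord(2, "MSFT", new DateTime(2022, 2, 28), 200.00m),
        new PriceRecord(3, "AAPL", new DateTime(2022, 3, 1), 101.00m),
        new PriceRecord(4, "MSFT", new DateTime(2022, 3, 1), 201.00m),
        new PriceRecord(5, "AAPL", new DateTime(2022, 2, 25), 99.00m),
    };
}
```

With these rows, NewestPerSymbolDemo.NewestByDate(NewestPerSymbolDemo.SampleRows()) returns the records with Id 3 (AAPL) and Id 4 (MSFT); picking the highest ID instead would return Id 5 for AAPL.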

If you still want to take the one with the highest ID:

  .OrderByDescending(priceRecord => priceRecord.ID)
  .FirstOrDefault());