I have a deserialized JSON object that I am trying to filter before processing. The data looks like this:
Company  Division  LastModDate  (lots of other columns/objects)
123      1         7/1/2021
123      1         8/1/2022
123      2         8/1/2022
How can I get all the information in the original object and get rid of records that are not the latest for each Company/Division group?
I tried this...
var filtered = origObject.GroupBy(g=> new {g.Company,g.Division})
I don't know where to go next.
If I were doing this in SQL, I would use ROW_NUMBER() partitioned by Company and Division, ordered by LastModDate descending, and keep only the rows numbered 1.
CodePudding user response:
You could try something like this:
var filtered = origObject
    .GroupBy(x => new { x.Company, x.Division })
    .Select(g => g.OrderByDescending(x => x.LastModDate).First());
This selects the single latest object from each Company/Division group.
Edit: Grouping by an anonymous object does work here. C# anonymous types override Equals and GetHashCode to compare the values of their properties, not references. A value tuple also has value equality, so this key is equivalent: .GroupBy(g => (g.Company, g.Division)). A string key such as $"{g.Company},{g.Division}" would work as well, at the cost of building a string for every element.
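For reference, here is a minimal self-contained sketch of the grouping approach on the sample data from the question. The `Row` record and its property types are assumptions made for illustration (the real object has many more members), and the syntax assumes C# 9 or later. It also checks that two anonymous-type keys with the same values compare equal, which is what makes the grouping work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample rows mirroring the data in the question.
var origObject = new List<Row>
{
    new(123, 1, new DateTime(2021, 7, 1)),
    new(123, 1, new DateTime(2022, 8, 1)),
    new(123, 2, new DateTime(2022, 8, 1)),
};

// Anonymous types compare by property values, so equal keys share a group.
var keyA = new { Company = 123, Division = 1 };
var keyB = new { Company = 123, Division = 1 };
Console.WriteLine(keyA.Equals(keyB)); // True

// Keep only the most recently modified row of each Company/Division group.
var filtered = origObject
    .GroupBy(x => new { x.Company, x.Division })
    .Select(g => g.OrderByDescending(x => x.LastModDate).First())
    .ToList();

foreach (var row in filtered)
    Console.WriteLine($"{row.Company} {row.Division} {row.LastModDate:yyyy-MM-dd}");

// Hypothetical row type standing in for the deserialized object.
record Row(int Company, int Division, DateTime LastModDate);
```

Because the query returns the original elements rather than a projection, every other column on the object is preserved.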
CodePudding user response:
A potentially more efficient way of doing this, available from .NET 6 onwards, is as follows:
var filtered = origObject
.OrderByDescending(w => w.LastModDate)
.DistinctBy(w => (w.Company, w.Division));
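This avoids the per-group collections that GroupBy builds, which are wasted work when you only care about one item from each group. Two caveats: OrderByDescending still sorts and buffers the whole sequence, so measure before assuming it is faster, and the approach relies on DistinctBy keeping the first element it sees for each key, which is how the current implementation behaves.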
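As a rough check of this approach, the sketch below runs it on the question's sample data and verifies that each surviving row carries the latest date for its key. The `Row` record is a hypothetical stand-in for the real object, and the code assumes .NET 6 or later, where `DistinctBy` is available.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample rows mirroring the data in the question.
var origObject = new List<Row>
{
    new(123, 1, new DateTime(2021, 7, 1)),
    new(123, 1, new DateTime(2022, 8, 1)),
    new(123, 2, new DateTime(2022, 8, 1)),
};

// Sort newest first, then keep the first row seen for each key.
var filtered = origObject
    .OrderByDescending(w => w.LastModDate)
    .DistinctBy(w => (w.Company, w.Division))
    .ToList();

Console.WriteLine(filtered.Count); // 2

// Sanity check: every kept row has the maximum date within its group.
bool allLatest = filtered.All(r =>
    r.LastModDate == origObject
        .Where(o => o.Company == r.Company && o.Division == r.Division)
        .Max(o => o.LastModDate));
Console.WriteLine(allLatest);

// Hypothetical row type standing in for the deserialized object.
record Row(int Company, int Division, DateTime LastModDate);
```

Note that the result comes back ordered by date rather than in the original order, which may or may not matter for later processing.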