Get only rows with the latest date for each name

Time:10-19

I'm trying to write a query that returns only those rows that contain the latest date for each name.

So for example, this data:

Name    Date Sold     More Columns...
Bob     2021-01-05
Mike    2021-01-18
Susan   2021-01-23
Bob     2021-02-04
Susan   2021-02-16
Mike    2021-03-02

Would produce this result:

Name    Date Sold     More Columns...
Bob     2021-02-04
Susan   2021-02-16
Mike    2021-03-02

It's sort of like a GROUP BY, but I'm not aggregating anything. I only want to filter the original rows.

How could I write such a query?

NOTE: In the end, this will be a SQL Server query but I need to write it using Entity Framework.

UPDATE: In reality, this is part of a much more complex query. It would be extremely difficult for me to implement this as a raw SQL query. If at all possible, I need to implement using Entity Framework.

CodePudding user response:

Two options

Select top 1 with ties *
 From  YourTable 
 Order by row_number() over (partition by Name order by Sold_Date desc)

or slightly more performant

with cte as (
Select *
      ,RN = row_number() over (partition by Name order by Sold_Date desc)
 From  YourTable
)
Select *
 From  cte 
 Where RN=1

CodePudding user response:

Try this (YourTable stands in for the real table name, since Table is a reserved word in SQL Server):

;with Groups as
(
    Select [Name], max([Date Sold]) as [Date Sold]
    From YourTable
    Group By [Name]
)
Select t.*
From Groups g
Inner Join YourTable t on t.[Name] = g.[Name] And t.[Date Sold] = g.[Date Sold]

CodePudding user response:

Adapted from "Error while flattening the IQueryable<T> after GroupBy()":

var names = _context.Items.Select(row => row.Name).Distinct();
var items =
  from name in names
  from item in _context.Items
    .Where(row => row.Name == name)
    .OrderByDescending(row => row.DateSold)
    .Take(1)
  select item;

var results = await items.ToArrayAsync();

Let's break this down:

First, a query that establishes the keys for the next query. It will eventually be run as a subquery.

var names = _context.Items.Select(row => row.Name).Distinct();

Another query, starting with the keys...

var items =
  from name in names

... and for each key, let's find the matching row ...

  from item in _context.Items
    .Where(row => row.Name == name)
    .OrderByDescending(row => row.DateSold)
    .Take(1)

... and we want that row.

  select item;

Run the combined query.

var results = await items.ToArrayAsync();
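To see what this query shape returns, here is a minimal sketch that runs the same LINQ expression with LINQ to Objects on the sample rows from the question. The Item record and the in-memory list are assumptions standing in for the real entity and for _context.Items; the sketch shows the logic only and says nothing about the SQL that Entity Framework would generate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for _context.Items (assumption: the real source is a DbSet).
var itemsSource = new List<Item>
{
    new Item("Bob",   new DateTime(2021, 1, 5)),
    new Item("Mike",  new DateTime(2021, 1, 18)),
    new Item("Susan", new DateTime(2021, 1, 23)),
    new Item("Bob",   new DateTime(2021, 2, 4)),
    new Item("Susan", new DateTime(2021, 2, 16)),
    new Item("Mike",  new DateTime(2021, 3, 2)),
};

// Same shape as the query above: distinct keys first,
// then the newest row for each key.
var names = itemsSource.Select(row => row.Name).Distinct();
var latest =
    (from name in names
     from item in itemsSource
         .Where(row => row.Name == name)
         .OrderByDescending(row => row.DateSold)
         .Take(1)
     select item)
    .ToArray();

foreach (var item in latest)
    Console.WriteLine($"{item.Name} {item.DateSold:yyyy-MM-dd}");

// Hypothetical entity type; the real one has more columns.
record Item(string Name, DateTime DateSold);
```

On this data the result contains one row per name: Bob with 2021-02-04, Susan with 2021-02-16 and Mike with 2021-03-02.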

CodePudding user response:

How would you solve this, if you had to do this yourself without a computer?

Well, you would take the first row with Name "Bob". Then you would read all the other rows with Name "Bob", and for every row you would check whether its DateSold is newer than the one you already had. If it is newer, you would replace the remembered row with the current row. If not, you would continue with the next "Bob" row.

And of course you have to do this with every Name you see.

So let's make a procedure that does exactly this. To make it efficient, we enumerate your input sequence only once.

I'll make it an extension method, so you can use it like any other LINQ method. If you are not familiar with extension methods, consider reading Extension Methods Demystified.

If you only need to solve this one problem, make an extension method for this class only. If you think you will reuse the method for other classes, make it a generic solution.

I don't know the type of the elements in your sequence, so let's call it Sale. Every Sale has the Name of the person who sold it, the date on which it was sold (DateSold), and some other columns.

For Sales only: not reusable

public static IEnumerable<Sale> GetLatestSales(this IEnumerable<Sale> sales)
{
    // TODO: implement
}

Usage will be:

IEnumerable<Sale> sales = ...

int currentMonth = DateTime.Today.Month;
var latestSalesThisMonth = sales.Where(sale => sale.DateSold.Month == currentMonth)
    .GetLatestSales()
    .ToList();

Well if this is what you want, let's implement!

public static IEnumerable<Sale> GetLatestSales(this IEnumerable<Sale> sales)
{
    Dictionary<string, Sale> latestSales = new Dictionary<string, Sale>();

    foreach (Sale sale in sales)
    {
        if (latestSales.TryGetValue(sale.Name, out Sale latestSale))
        {
            // There is already a Sale for this Name in the dictionary.
            // is sale newer than latestSale?
            if (sale.DateSold > latestSale.DateSold)
            {
                // sale is newer, this will be the latestSale for this Name
                latestSales[sale.Name] = sale;
            }
            // else: sale is not newer than latestSale; do nothing
        }
        else
        {
            // first Sale seen for this Name: remember it
            latestSales.Add(sale.Name, sale);
        }
    }
    return latestSales.Values;
}

Generic solution: per group, keep only the element with the largest value of one of its properties

Input parameters:

  • the source sequence

  • a property selector to be used as the common value (in Sales this was the Name)

  • a property selector for the value that decides which element is the "latest"

  • an EqualityComparer to compare the "Name"

  • a Comparer to find the "latest"

  • a ResultSelector, to specify the returned values

// TODO: invent a proper name
public static IEnumerable<T> GetLargest<T, TKey, TValue>(
    this IEnumerable<T> source,
    Func<T, TKey> keySelector,
    Func<T, TValue> valueSelector)
{
    // no result selector: return the source elements themselves
    return GetLargest(source, keySelector, valueSelector, element => element, null, null);
}

public static IEnumerable<TResult> GetLargest<T, TKey, TValue, TResult>(
    this IEnumerable<T> source,
    Func<T, TKey> keySelector,
    Func<T, TValue> valueSelector,
    Func<T, TResult> resultSelector)
{
    return GetLargest(source, keySelector, valueSelector, resultSelector, null, null);
}

public static IEnumerable<TResult> GetLargest<T, TKey, TValue, TResult>(
    this IEnumerable<T> source,
    Func<T, TKey> keySelector,
    Func<T, TValue> valueSelector,
    Func<T, TResult> resultSelector,
    IEqualityComparer<TKey> keyComparer,
    IComparer<TValue> valueComparer)
{
    // TODO: check the other input values in the same way
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (resultSelector == null) throw new ArgumentNullException(nameof(resultSelector));

    // if a comparer is null, use the default comparer
    if (keyComparer == null) keyComparer = EqualityComparer<TKey>.Default;
    if (valueComparer == null) valueComparer = Comparer<TValue>.Default;

    // code is similar to above:
    Dictionary<TKey, T> dictionary = new Dictionary<TKey, T>(keyComparer);

    foreach (T sourceElement in source)
    {
        TKey key = keySelector(sourceElement);
        if (dictionary.TryGetValue(key, out T dictionaryElement))
        {
            TValue sourceValue = valueSelector(sourceElement);
            TValue dictionaryValue = valueSelector(dictionaryElement);
            if (valueComparer.Compare(sourceValue, dictionaryValue) > 0)
            {
                // sourceValue is larger, so sourceElement replaces the stored element
                dictionary[key] = sourceElement;
            }
        }
        else
        {
            // first element seen for this key: remember it
            dictionary.Add(key, sourceElement);
        }
    }

    return dictionary.Values.Select(resultSelector);
}
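A usage sketch for the generic method, assuming a three-parameter overload of the shape GetLargest(source, keySelector, valueSelector). So that the snippet compiles on its own, it carries a condensed stand-in for that overload. The Sale record and the class name SaleExtensions are hypothetical; the sample rows are the ones from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var sales = new List<Sale>
{
    new Sale("Bob",   new DateTime(2021, 1, 5)),
    new Sale("Mike",  new DateTime(2021, 1, 18)),
    new Sale("Susan", new DateTime(2021, 1, 23)),
    new Sale("Bob",   new DateTime(2021, 2, 4)),
    new Sale("Susan", new DateTime(2021, 2, 16)),
    new Sale("Mike",  new DateTime(2021, 3, 2)),
};

// One row per Name, keeping the row with the largest DateSold.
var latestSales = sales
    .GetLargest(sale => sale.Name, sale => sale.DateSold)
    .ToList();

foreach (var sale in latestSales)
    Console.WriteLine($"{sale.Name} {sale.DateSold:yyyy-MM-dd}");

// Hypothetical element type; the real one has more columns.
record Sale(string Name, DateTime DateSold);

static class SaleExtensions
{
    // Condensed stand-in for the generic method: default comparers only,
    // no result selector, a single pass over the source.
    public static IEnumerable<T> GetLargest<T, TKey, TValue>(
        this IEnumerable<T> source,
        Func<T, TKey> keySelector,
        Func<T, TValue> valueSelector)
        where TKey : notnull
    {
        var comparer = Comparer<TValue>.Default;
        var largest = new Dictionary<TKey, T>();

        foreach (T element in source)
        {
            TKey key = keySelector(element);

            // Store the element if its key is new, or if its value
            // is larger than the value of the element stored so far.
            if (!largest.TryGetValue(key, out var current)
                || comparer.Compare(valueSelector(element), valueSelector(current)) > 0)
            {
                largest[key] = element;
            }
        }
        return largest.Values;
    }
}
```

Like GetLatestSales, this runs in memory on an IEnumerable, so the rows have to be fetched from the database before the method is applied.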
