Home > Blockchain >  T-SQL/SMSS - Highest Value in Date-Range by Location
T-SQL/SMSS - Highest Value in Date-Range by Location

Time:12-29

I am trying to find out the highest-grossing day per location (date and gross sales amount) in my dataset. I understand that Rank ordering may help here, but im a little stuck on how to go about that. Here is the script so far:

SELECT 
    DateKey,
    SUM(GrossSales),
    LocationKey    
FROM 
    dataset
GROUP BY
    DateKey, LocationKey
ORDER BY
    DateKey, LocationKey

I also tried this script too below, but it only returns the highest grossing day of all locations, not by each and every location:

SELECT 
    DateKey,
    LocationKey,
    SUM(GrossSales)
FROM 
    dataset
GROUP BY 
    DateKey, LocationKey
HAVING 
    SUM(GrossSales) = (SELECT MAX(GrossSales) 
                       FROM 
                           (SELECT SUM(GrossSales) AS GrossSales 
                            FROM dataset 
                            GROUP BY Datekey, LocationKey) a )

This is how is how I would like it to look like, as an example:

enter image description here

Is there an easy work around here, that I just cant see?

Thank you!

CodePudding user response:

One approach is to perform the initial GROUP BY calculation in a subquery and include a ROW_NUMBER() or RANK() window function to assign sequence numbers. The outer query can then filter for sequence = 1.

Something like:

SELECT
    DateKey,
    GrossSales,
    LocationKey
FROM (
    SELECT 
        DateKey,
        SUM(GrossSales) AS GrossSales,
        LocationKey,
        ROW_NUMBER() OVER(PARTITION BY LocationKey ORDER BY SUM(GrossSales) DESC) AS RN
    FROM 
        dataset
    GROUP BY
        DateKey, LocationKey
) A
WHERE RN = 1
ORDER BY
    LocationKey

Replacing ROW_NUMBER() with RANK() would include all tied values.

See this db<>fiddle for a demo.

CodePudding user response:

Similar to T N's answer above... Full test script

DROP TABLE IF EXISTS #tblTable1;
CREATE TABLE #tblTable1(DateKey date, LocationKey int, GrossSales decimal(28,2));

INSERT INTO #tblTable1(DateKey,LocationKey,GrossSales)
VALUES('2022-12-23', 1, 184611.714)
    ,('2021-12-27', 2, 150715.34)
    ,('2020-12-26', 3, 145581.95)
    ,('2018-12-26', 4, 132299.35)
    ,('2019-12-23', 5, 132241.59)
    ,('2017-12-26', 6, 131782.52)
    ,('2022-12-17', 7, 129764.68)
    ,('2022-12-29', 7, 9999999.68) --Added
    ,('2021-12-23', 8, 129029.79)
    ,('2022-12-21', 9, 126959.52)
    ,('2019-12-26', 10, 124160.79);

WITH cte_Stuff(LocationKey, DateKey, GrossSales, RowId)
AS (
    SELECT TOP 100 PERCENT LocationKey, DateKey, GrossSales, DENSE_RANK() OVER(PARTITION BY LocationKey ORDER BY LocationKey, GrossSales DESC) AS RowId --TOP 100 PERCENTS is Lazy Mans way of trying to enfore sort order within the CTE
    FROM #tblTable1 t1
    ORDER BY LocationKey, GrossSales DESC
    )
SELECT LocationKey, DateKey, GrossSales
FROM cte_Stuff
WHERE RowId = 1
  • Related