Home > database >  Using the SUM OVER clause, how to check sum over period only when output is not greater than a certa
Using the SUM OVER clause, how to check sum over period only when output is not greater than a certa

Time:10-12

Sample data:

select date, agent, sales
from agentsales

date                    agent   sales
2021-01-03 00:00:00.000 Agent A 10
2021-02-05 00:00:00.000 Agent A 15
2021-03-10 00:00:00.000 Agent A 10
2021-01-05 00:00:00.000 Agent B 5
2021-02-06 00:00:00.000 Agent B 28
2021-03-10 00:00:00.000 Agent B 5
2021-01-02 00:00:00.000 Agent C 35
2021-02-04 00:00:00.000 Agent C 25
2021-03-08 00:00:00.000 Agent C 15
2021-01-01 00:00:00.000 Agent D 5
2021-02-02 00:00:00.000 Agent D 35
2021-03-10 00:00:00.000 Agent D 31

I want to get the counts of agents who have crossed 30 sales, such that if they have never crossed a total of 30 sales then consider sum over current and previous months, otherwise only current month.

Expected output:

YrMon  Count_Agent_more_than_30_sales
Jan21  1
Feb21  2
Mar21  2

Logic:

Jan21 - 1 since only C has crossed 30 sales
Feb21 - 2 since B and D have crossed 30 sales
Mar21 - 2 since A and D have crossed 30 sales

My query to calculate sum over period:

;WITH CTE AS (SELECT CAST(YEAR([DATE]) AS VARCHAR) ' ' CAST(MONTH([DATE]) AS VARCHAR) YRMON, AGENT, SUM(SALES) SALES
  FROM AgentSales
  GROUP BY  CAST(YEAR([DATE]) AS VARCHAR) ' ' CAST(MONTH([DATE]) AS VARCHAR), AGENT
  )
  SELECT *, SUM(SALES) OVER(PARTITION BY AGENT ORDER BY YRMON) SUMOVERPERIOD FROM CTE
  ORDER BY 2,1

YRMON   AGENT   SALES   SUMOVERPERIOD
2021 1  Agent A 10      10
2021 2  Agent A 15      25
2021 3  Agent A 10      35
2021 1  Agent B 5       5
2021 2  Agent B 28      33
2021 3  Agent B 5       38
2021 1  Agent C 35      35
2021 2  Agent C 25      60
2021 3  Agent C 15      75
2021 1  Agent D 5       5
2021 2  Agent D 35      40
2021 3  Agent D 31      71

Now I am trying to apply the logic on the calculated sum:

   ;WITH CTE AS (SELECT CAST(YEAR([DATE]) AS VARCHAR) ' ' CAST(MONTH([DATE]) AS VARCHAR) YRMON, AGENT, SUM(SALES) SALES
  FROM AgentSales
  GROUP BY  CAST(YEAR([DATE]) AS VARCHAR) ' ' CAST(MONTH([DATE]) AS VARCHAR), AGENT
  )
  SELECT *, SUM(SALES) OVER(PARTITION BY AGENT ORDER BY YRMON) SUMOVERPERIOD,
  CASE WHEN SUM(SALES) OVER(PARTITION BY AGENT ORDER BY YRMON)>30 THEN 1 ELSE 0 END AS CALC
  FROM CTE
  ORDER BY 2,1

YRMON   AGENT   SALES   SUMOVERPERIOD   CALC
2021 1  Agent A 10      10              0
2021 2  Agent A 15      25              0
2021 3  Agent A 10      35              1
2021 1  Agent B 5       5               0
2021 2  Agent B 28      33              1
2021 3  Agent B 5       38              1
2021 1  Agent C 35      35              1
2021 2  Agent C 25      60              1
2021 3  Agent C 15      75              1
2021 1  Agent D 5       5               0
2021 2  Agent D 35      40              1
2021 3  Agent D 31      71              1

This query is always considering sum over current and previous period.

How to check whether the sales has previously crossed the 30 sales mark and for such cases to exclude doing the sum over period? For example can we apply LAG on the result of the SUM OVER column?

CodePudding user response:

Looks like this should work for you

  • You need to pre-aggregate the sales per agent and month, then get a running sum of that aggregate
  • Then simply check if each row has crossed over in this month by comparing the current data with the running sum
SELECT
  YrMon = FORMAT(Month, 'yyyy MM'),
  Count_Agent_more_than_30_sales = COUNT(CASE WHEN SumOverPeriod >= 30 AND SumOverPeriod - sales < 30 THEN 1 END)
FROM (
    SELECT
      Month = EOMONTH(date),
      agent,
      sales = SUM(sales),
      SumOverPeriod = SUM(SUM(sales)) OVER (PARTITION BY agent ORDER BY EOMONTH(date)
          ROWS UNBOUNDED PRECEDING)
    FROM AgentSales
    GROUP BY EOMONTH(date), agent
) sales
GROUP BY Month;

db<>fiddle

CodePudding user response:

Please check if one of these fits your needs (I think the description confusion)

Option 1

-- If you want to count only the first time [agent] crossed 30 sales
;With MyCTE01 as (
    SELECT 
        [date] = EOMONTH([date], -1),
        [agent],[sales], 
        S = SUM([sales]) OVER (PARTITION BY [agent] ORDER BY [date] ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW)
    FROM [AgentSales]
),
MyCTE02 as (
    SELECT [date],[agent],[sales], S
    FROM MyCTE01
    -- The idea of using "and S - [sales] < 30" instead of ROW_NUMBER came from @Charlieface, but it is better to do the work on DATE data type and not on string
    WHERE S > 30 and S - [sales] < 30
)
SELECT DATENAME(month,[Date]), YEAR([Date]), COUNT(*) 
FROM MyCTE02
GROUP BY [date]
GO

Option 2

-- If you want to count all the [agent] crossed 30 sales till now
;With MyCTE01 as (
    SELECT 
        [date] = DATEADD(DAY, 1, EOMONTH([date], -1)),
        [agent],[sales], 
        S = SUM([sales]) OVER (PARTITION BY [agent] ORDER BY [date] ROWS BETWEEN UNBOUNDED PRECEDING and CURRENT ROW)
    FROM [AgentSales]
)
,MyCTE02 as (
    SELECT [date],[agent],[sales], S
    FROM MyCTE01
    WHERE S > 30
)
SELECT DATENAME(month,[Date]), YEAR([Date]), COUNT(*) 
FROM MyCTE02
GROUP BY [date]
GO

DDL DML

USE tempdb
GO

DROP TABLE IF EXISTS [AgentSales]
GO
CREATE TABLE [AgentSales](id INT IDENTITY(1,1), [date] DATE, agent VARCHAR(100), sales INT)
GO
INSERT [AgentSales]([date],[agent],[sales]) VALUES
('2021-01-03 00:00:00.000','Agent A', 10),
('2021-02-05 00:00:00.000','Agent A', 15),
('2021-03-10 00:00:00.000','Agent A',10),
('2021-01-05 00:00:00.000','Agent B',5 ),
('2021-02-06 00:00:00.000','Agent B',28),
('2021-03-10 00:00:00.000','Agent B',5 ),
('2021-01-02 00:00:00.000','Agent C',35),
('2021-02-04 00:00:00.000','Agent C',25),
('2021-03-08 00:00:00.000','Agent C',15),
('2021-01-01 00:00:00.000','Agent D',5 ),
('2021-02-02 00:00:00.000','Agent D',35),
('2021-03-10 00:00:00.000','Agent D',31)
GO

SELECT [id],[date],[agent],[sales]
FROM [AgentSales]
GO
  • Related