SQL - Return count of consecutive days where value was unchanged

Time:01-19

I have a table like

date          ticker  Action
'2022-03-01'  AAPL    BUY
'2022-03-02'  AAPL    SELL
'2022-03-03'  AAPL    BUY
'2022-03-01'  CMG     SELL
'2022-03-02'  CMG     HOLD
'2022-03-03'  CMG     HOLD
'2022-03-01'  GPS     SELL
'2022-03-02'  GPS     SELL
'2022-03-03'  GPS     SELL

I want to group by ticker and, for each ticker, count how many consecutive days immediately before the last date (here 2022-03-03) had the same Action as the last date. For this example table the result would be:

ticker NumSequentialDaysAction
AAPL 0
CMG 1
GPS 2

It's fine to pass in 2022-03-03 as a value; it doesn't need to be worked out on the fly.
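To make the expected result concrete, here is a minimal Python sketch of the same rule stated procedurally. The function name `trailing_run_lengths` and the in-memory `rows` list are illustrative assumptions, not part of the actual table; it is only meant as a reference for checking what a query returns.

```python
from itertools import groupby

# Sample rows copied from the table above: (date, ticker, action).
rows = [
    ("2022-03-01", "AAPL", "BUY"),
    ("2022-03-02", "AAPL", "SELL"),
    ("2022-03-03", "AAPL", "BUY"),
    ("2022-03-01", "CMG", "SELL"),
    ("2022-03-02", "CMG", "HOLD"),
    ("2022-03-03", "CMG", "HOLD"),
    ("2022-03-01", "GPS", "SELL"),
    ("2022-03-02", "GPS", "SELL"),
    ("2022-03-03", "GPS", "SELL"),
]


def trailing_run_lengths(rows, as_of):
    """Per ticker, count the earlier consecutive days whose action
    equals the action on the latest date not after `as_of`."""
    result = {}
    # ISO-formatted date strings sort correctly as plain text.
    eligible = sorted((r for r in rows if r[0] <= as_of),
                      key=lambda r: (r[1], r[0]))
    for ticker, group in groupby(eligible, key=lambda r: r[1]):
        actions = [action for _, _, action in group]
        last = actions[-1]
        run = 0
        # Walk backwards from the day before the last date.
        for action in reversed(actions[:-1]):
            if action != last:
                break
            run += 1
        result[ticker] = run
    return result


print(trailing_run_lengths(rows, "2022-03-03"))
# {'AAPL': 0, 'CMG': 1, 'GPS': 2}
```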

Tried something like this

---Table Creation---
CREATE TABLE UserTable
    ([Date] DATETIME2, [Ticker] varchar(5), [Action] varchar(5))
;
    
INSERT INTO UserTable
    ([Date], [Ticker], [Action])
VALUES
    ('2022-03-01' , 'AAPL' , 'BUY'),
    ('2022-03-02' , 'AAPL' , 'SELL'),
    ('2022-03-03' , 'AAPL' , 'BUY'),
    ('2022-03-01' , 'CMG' , 'SELL'),
    ('2022-03-02' , 'CMG' , 'HOLD'),
    ('2022-03-03' , 'CMG' , 'HOLD'),
    ('2022-03-01' , 'GPS' , 'SELL'),
    ('2022-03-02' , 'GPS' , 'SELL'),
    ('2022-03-03' , 'GPS' , 'SELL')
;

---Attempted Solution---
I'm thinking that I need a subquery to get the last value, joined back to the table to get the matching values, and then a window function ordered by date to check that the preceding values are sequential.

WITH CTE AS (SELECT Date, Ticker, Action,
                    ROW_NUMBER() OVER (PARTITION BY Ticker, Action ORDER BY Date) as row_num
             FROM UserTable)
SELECT Ticker, COUNT(DISTINCT Date) as count_of_days
FROM CTE
WHERE row_num = 1
GROUP BY Ticker;

WITH CTE AS (SELECT Date, Ticker, Action,
                    DENSE_RANK() OVER (PARTITION BY Ticker ORDER BY Action,Date) as rank
             FROM table)
SELECT Ticker, COUNT(DISTINCT Date) as count_of_days
FROM CTE
WHERE rank = 1
GROUP BY Ticker;

CodePudding user response:

You can do this with the help of the LEAD function like so. You didn't specify which RDBMS you're using. This solution works in PostgreSQL:

WITH "withSequential" AS (
  SELECT
    ticker,
    (LEAD("Action") OVER (PARTITION BY ticker ORDER BY date ASC) = "Action") AS "nextDayIsSameAction"
  FROM UserTable
)
SELECT
  ticker,
  SUM(
    CASE
      WHEN "nextDayIsSameAction" IS TRUE THEN 1
      ELSE 0
    END
  ) AS "NumSequentialDaysAction"
FROM "withSequential"
GROUP BY ticker
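One thing worth checking before relying on this: the query sums every adjacent pair of days with the same action, not only the run that ends on the last date. The sketch below is a rough check under stated assumptions, not the PostgreSQL setup from the answer: it runs an equivalent query on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), with a hypothetical ticker XYZ and lower-case column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserTable (date TEXT, ticker TEXT, action TEXT)")
# Hypothetical data: an early BUY/BUY run, then SELL, then a single BUY.
conn.executemany("INSERT INTO UserTable VALUES (?, ?, ?)", [
    ("2022-03-01", "XYZ", "BUY"),
    ("2022-03-02", "XYZ", "BUY"),
    ("2022-03-03", "XYZ", "SELL"),
    ("2022-03-04", "XYZ", "BUY"),
])

query = """
WITH with_sequential AS (
  SELECT ticker,
         (LEAD(action) OVER (PARTITION BY ticker ORDER BY date) = action) AS next_same
  FROM UserTable
)
SELECT ticker,
       SUM(CASE WHEN next_same = 1 THEN 1 ELSE 0 END) AS n
FROM with_sequential
GROUP BY ticker
"""
result = conn.execute(query).fetchall()
print(result)  # [('XYZ', 1)]
```

The last action (BUY on 2022-03-04) has no matching day directly before it, so the desired answer for XYZ would be 0, while the earlier BUY/BUY pair still contributes 1. On the sample data in the question the two happen to agree because each ticker has at most one repeated run.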

CodePudding user response:

Here is a way to do this using a gaps-and-islands approach.

Thanks for sharing the create and insert scripts; they make it much quicker to build a solution.

dbfiddle link. https://dbfiddle.uk/rZLDTrNR

with data
  as (
select date
      ,ticker
      ,action
       -- marker = 1 when the action differs from the previous day's action
       ,case when lag(action) over(partition by ticker order by date) <> action then 
                  1
             else 0 
         end as marker
  from usertable
      )
  ,interim_data
   as (
 select *
        -- the running sum of markers gives each run of identical actions its own group number
       ,sum(marker) over(partition by ticker order by date) as grp_val
  from data
      )
  ,interim_data2
  as ( 
 select *
        -- run length minus 1, so that the last date itself is not counted
       ,count(*) over(partition by ticker,grp_val) - 1 as NumSequentialDaysAction
  from interim_data
      )
select ticker,NumSequentialDaysAction
  from interim_data2
where date='2022-03-03'
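To see how the marker and the running sum form the groups, here is a small sketch that prints the intermediate columns for one ticker. It is an illustration under assumptions, not the fiddle above: it uses SQLite through Python's sqlite3 module (SQLite 3.25 or later for window functions) and stores dates as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usertable (date TEXT, ticker TEXT, action TEXT)")
# The CMG rows from the question.
conn.executemany("INSERT INTO usertable VALUES (?, ?, ?)", [
    ("2022-03-01", "CMG", "SELL"),
    ("2022-03-02", "CMG", "HOLD"),
    ("2022-03-03", "CMG", "HOLD"),
])

query = """
WITH data AS (
  SELECT date, ticker, action,
         CASE WHEN LAG(action) OVER (PARTITION BY ticker ORDER BY date) <> action
              THEN 1 ELSE 0 END AS marker
  FROM usertable
)
SELECT date, action, marker,
       SUM(marker) OVER (PARTITION BY ticker ORDER BY date) AS grp_val
FROM data
ORDER BY date
"""
steps = conn.execute(query).fetchall()
for row in steps:
    print(row)
# ('2022-03-01', 'SELL', 0, 0)
# ('2022-03-02', 'HOLD', 1, 1)
# ('2022-03-03', 'HOLD', 0, 1)
```

The first row gets marker 0 because LAG returns NULL there and the comparison is not true. Both HOLD rows end up with grp_val 1, so counting rows per (ticker, grp_val) gives the length of that run.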

CodePudding user response:

Another option is the difference-between-two-row_numbers approach:

select [Ticker], count(*)-1 NumSequentialDaysAction -- run length minus the last date itself
from
(
  select *,
   row_number() over (partition by [Ticker] order by [Date]) -
   row_number() over (partition by [Ticker], [Action]  order by [Date]) grp
  from UserTable 
  where [date] <= '2022-03-03'
) RN_Groups
/* get only rows where [Action] = last date [Action] */
where [Action] = (select top 1 [Action] from UserTable T 
                  where T.[Ticker] = RN_Groups.[Ticker] and [date] <= '2022-03-03' 
                  order by [Date] desc)
group by [Ticker], [Action], grp
/* keep only the run that ends on the ticker's last date, so earlier runs of the same [Action] are not returned */
having max([Date]) = (select max(T.[Date]) from UserTable T 
                      where T.[Ticker] = RN_Groups.[Ticker] and T.[Date] <= '2022-03-03')
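To illustrate why the difference of the two row numbers identifies each run, here is a small sketch that prints the `grp` value per row. It rests on assumptions that are not in the answer: it runs on SQLite through Python's sqlite3 module (SQLite 3.25 or later) rather than SQL Server, and uses a hypothetical ticker XYZ with two separate BUY runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserTable (date TEXT, ticker TEXT, action TEXT)")
# Hypothetical data: BUY, BUY, SELL, BUY.
conn.executemany("INSERT INTO UserTable VALUES (?, ?, ?)", [
    ("2022-03-01", "XYZ", "BUY"),
    ("2022-03-02", "XYZ", "BUY"),
    ("2022-03-03", "XYZ", "SELL"),
    ("2022-03-04", "XYZ", "BUY"),
])

query = """
SELECT date, action,
       ROW_NUMBER() OVER (PARTITION BY ticker ORDER BY date) -
       ROW_NUMBER() OVER (PARTITION BY ticker, action ORDER BY date) AS grp
FROM UserTable
ORDER BY date
"""
groups = conn.execute(query).fetchall()
for row in groups:
    print(row)
# ('2022-03-01', 'BUY', 0)
# ('2022-03-02', 'BUY', 0)
# ('2022-03-03', 'SELL', 2)
# ('2022-03-04', 'BUY', 1)
```

Within a run both row numbers increase together, so their difference stays constant; once another action interrupts, the per-ticker number moves ahead and the next run of the same action gets a different `grp`. That is why the two BUY runs here come out as separate (action, grp) groups.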
