I have a table with historical stocks prices for hundreds of stocks. I need to extract only those stocks that reached $10 or greater for the first time.
Stock | Price | Date |
---|---|---|
AAA | 9 | 2021-10-01 |
AAA | 10 | 2021-10-02 |
AAA | 8 | 2021-10-03 |
AAA | 10 | 2021-10-04 |
BBB | 9 | 2021-10-01 |
BBB | 11 | 2021-10-02 |
BBB | 12 | 2021-10-03 |
Is there a way to count how many times each stock hit >= 10 in order to pull only those where count = 1 (in this case it would be stock BBB considering it never reached 10 in the past)?
Since I couldn't figure how to create count I've tried the below manipulations with min/max dates but this looks like a bit awkward approach. Any idea of a simpler solution?
with query1 as (
select Stock, min(date) as min_greater10_dt
from t
where Price >= 10
group by Stock
), query2 as (
select Stock, max(date) as max_greater10_dt
from t
where Price >= 10
group by Stock
)
select Stock
from t a
join query1 b on b.Stock = a.Stock
join query2 c on c.Stock = a.Stock
where not(a.Price < 10 and a.Date between b.min_greater10_dt and c.max_greater10_dt)
CodePudding user response:
This is a type of gaps-and-islands problem which can be solved as follows:
- detect the change from < 10 to >= 10 using a lagged price
- count the number of such changes
- filter in only stock where this has happened exactly once
- and take the first row since you only want the stock (you could group by here but a row number allows you to select the entire row should you wish to).
declare @Table table (Stock varchar(3), Price money, [Date] date);
insert into @Table (Stock, Price, [Date])
values
('AAA', 9, '2021-10-01'),
('AAA', 10, '2021-10-02'),
('AAA', 8, '2021-10-03'),
('AAA', 10, '2021-10-04'),
('BBB', 9, '2021-10-01'),
('BBB', 11, '2021-10-02'),
('BBB', 12, '2021-10-03');
with cte1 as (
select Stock, Price, [Date]
, row_number() over (partition by Stock, case when Price >= 10 then 1 else 0 end order by [Date] asc) rn
, lag(Price,1,0) over (partition by Stock order by [Date] asc) LaggedStock
from @Table
), cte2 as (
select Stock, Price, [Date], rn, LaggedStock
, sum(case when Price >= 10 and LaggedStock < 10 then 1 else 0 end) over (partition by Stock) StockOver10
from cte1
)
select Stock
--, Price, [Date], rn, LaggedStock, StockOver10 -- debug
from cte2
where Price >= 10
and StockOver10 = 1 and rn = 1;
Returns:
Stock |
---|
BBB |
Note: providing DDL DML as show above makes it much easier of people to assist.