Home > Mobile >  Getting a relative max for each value in a time series
Getting a relative max for each value in a time series

Time:01-29

So I have a table called optiontracking with ConID/TimeLog/Ask/Bid table that looks like the following:

ConID TimeLog Ask Bid
1 2020-06-03 10:18:43 0.52 0.55
1 2020-06-03 10:20:41 0.55 0.42
1 2020-06-03 10:23:54 0.60 0.58
1 2020-06-03 10:26:26 0.52 0.51
1 2020-06-03 10:28:57 0.51 0.50
1 2020-06-04 10:25:57 0.65 0.64
1 2020-06-04 10:26:57 0.80 0.79
2 2020-06-03 10:18:43 1.36 1.33
2 2020-06-03 10:20:41 1.38 1.35
2 2020-06-03 10:23:54 1.33 1.32
2 2020-06-03 10:26:26 1.25 1.22
2 2020-06-03 10:28:57 1.20 1.19

What I am trying to get is for each ConID/TimeLog/Ask entry in this table, chronologically next Max(Bid), TimeLog of Max(Bid) while matching the same day and ConID.

I expect to repeat this for Max Ask (but I figure that will need 1 subquery for each group).

So far I have been able to extract the max Bid chronologically after each TimeLog/ask in the dataset using the query below. I want to be able to pull the corresponding timestamp where the MAX(Bid) was retrieved from, but I am hitting a wall without using a second subquery (which I am trying to avoid due to data size).

Here is the query I am using to get the current max Bid per each Ask Chronologically:

SELECT DISTINCT OT.[ConID], OT.[TimeLog], OT.[Ask], MAX(MaxBid.[Bid]) as SellPrice from 
(
SELECT * from algotrading.[optiontracking] WHERE YEAR([TimeLog]) <= 2020 and Month([TimeLog]) <= 6
) as OT

LEFT JOIN 
(
SELECT [ConID], [Bid], [TimeLog], CONVERT(date, [TimeLog]) as CombDate from algotrading.[optiontracking] WHERE YEAR([TimeLog]) <= 2020 and Month([TimeLog]) <= 6
) as MaxBid 
on OT.[ConID] = MaxBid.[ConID] and MaxBid.[CombDate] = CONVERT(date, OT.[TimeLog]) and DATEDIFF(second, OT.[TimeLog], MaxBid.[TimeLog]) > 0

GROUP BY OT.[ConID], OT.[TimeLog], OT.[Ask], CONVERT(date, OT.[TimeLog])

I know there is a way to get the corresponding [TimeLog] for each Max([Bid]), but I can't quite seem to figure it out. The Year/Month filtering on each query is because the dataset is too large and I'm trying to do testing first.

Here is what I would expect as an output from the table above (including the TimeLog for each max bid listed as SellPriceTime below):

ConID TimeLog Ask SellPrice SellPriceTime
1 2020-06-03 10:18:43 0.52 0.58 2020-06-03 10:23:54
1 2020-06-03 10:20:41 0.55 0.58 2020-06-03 10:23:54
1 2020-06-03 10:23:54 0.60 0.51 2020-06-03 10:26:26
1 2020-06-03 10:26:26 0.52 0.50 2020-06-03 10:23:54
1 2020-06-03 10:28:57 0.51 NULL NULL
1 2020-06-04 10:25:57 0.65 0.79 2020-06-04 10:26:57
1 2020-06-04 10:26:57 0.80 NULL NULL
2 2020-06-03 10:18:43 1.36 1.35 2020-06-03 10:20:41
2 2020-06-03 10:20:41 1.38 1.32 2020-06-03 10:23:54
2 2020-06-03 10:23:54 1.33 1.22 2020-06-03 10:26:26
2 2020-06-03 10:26:26 1.25 1.19 2020-06-03 10:28:57
2 2020-06-03 10:28:57 1.20 NULL NULL

CodePudding user response:

You can use an OUTER APPLY(SELECT TOP 1 ...) to select the max bid row for each ask.

An OUTER APPLY is like a left join to a subselect, where that subselect can apply WHERE conditions, ORDER BY and in this case TOP 1 to get your desired bid. It also allows you to select extract multiple values (Bid and TimeLog in this case), which is an advantage over a simple MAX() function.

I adjusted the "same day" logic to use a condition that checks for TimeLog < start of the next day. The overall date condition in the outer where clause has also been adjusted to be a simple compare. The original logic would have selected the first 6 months in each year 2020 and earlier.

For performance, make sure that you have an index on algotrading.optiontracking(ConID, TimeLog).

SELECT OT.ConID, OT.TimeLog, OT.Ask, MaxBid.SellPrice, MaxBid.SellPriceTime
FROM algotrading.optiontracking as OT
OUTER APPLY (
    SELECT TOP 1 OT2.Bid AS SellPrice, OT2.TimeLog AS SellPriceTime
    FROM algotrading.optiontracking OT2
    WHERE OT2.ConID = OT.ConID
    AND OT2.TimeLog > OT.TimeLog -- Later
    AND OT2.TimeLog < DATEADD(day, 1, CONVERT(DATE, OT.TimeLog)) -- Before start of next day
    ORDER BY OT2.Bid DESC -- Max Bid
) MaxBid
WHERE OT.TimeLog < '2020-07-01' -- Is this what was intended?

Results:

ConID TimeLog Ask SellPrice SellPriceTime
1 2020-06-03 10:18:43.000 0.52 0.58 2020-06-03 10:23:54.000
1 2020-06-03 10:20:41.000 0.55 0.58 2020-06-03 10:23:54.000
1 2020-06-03 10:23:54.000 0.60 0.51 2020-06-03 10:26:26.000
1 2020-06-03 10:26:26.000 0.52 0.50 2020-06-03 10:28:57.000
1 2020-06-03 10:28:57.000 0.51 null null
1 2020-06-04 10:25:57.000 0.65 0.79 2020-06-04 10:26:57.000
1 2020-06-04 10:26:57.000 0.80 null null
2 2020-06-03 10:18:43.000 1.36 1.35 2020-06-03 10:20:41.000
2 2020-06-03 10:20:41.000 1.38 1.32 2020-06-03 10:23:54.000
2 2020-06-03 10:23:54.000 1.33 1.22 2020-06-03 10:26:26.000
2 2020-06-03 10:26:26.000 1.25 1.19 2020-06-03 10:28:57.000
2 2020-06-03 10:28:57.000 1.20 null null

See this db<>fiddle.

CodePudding user response:

I don't fully understand what you're trying to do but this seems like a case for RANK() and PARTITION BY.

This is not a full answer but with

SELECT ConnId, TimeLog, LastPrice, Ask, Bid,
    ,Max(LastPrice) OVER (PARTITION BY ConnId) AS MaxLastPrice  
    ,RANK() OVER (PARTITION BY ConnId ORDER BY Bid DESC) AS Rank  
from ##Test

you'd get:

ConnId  TimeLog     LastPrice   Ask     Bid     MaxLastPrice    Rank
1   2020-06-04 10:26:57.0000000     10.48   0.80    0.79    10.48   1
1   2020-06-04 10:25:57.0000000     10.48   0.65    0.64    10.48   2
1   2020-06-03 10:23:54.0000000     10.45   0.60    0.58    10.48   3
1   2020-06-03 10:18:43.0000000     10.40   0.52    0.55    10.48   4
1   2020-06-03 10:26:26.0000000     10.42   0.52    0.51    10.48   5
1   2020-06-03 10:28:57.0000000     10.48   0.51    0.50    10.48   6
1   2020-06-03 10:20:41.0000000     10.30   0.55    0.42    10.48   7
2   2020-06-03 10:20:41.0000000     12.05   1.38    1.35    12.15   1
2   2020-06-03 10:18:43.0000000     12.10   1.36    1.33    12.15   2
2   2020-06-03 10:23:54.0000000     12.03   1.33    1.32    12.15   3
2   2020-06-03 10:26:26.0000000     12.13   1.25    1.22    12.15   4
2   2020-06-03 10:28:57.0000000     12.15   1.20    1.19    12.15   5

You can now order by whichever column you wish and pick the records with rank 1, for example.


Here's my test data:

CREATE TABLE ##test (
    ConnId INT,
    TimeLog DATETIME2,
    LastPrice Numeric(5,2),
  Ask Numeric(5,2),
  Bid Numeric(5,2)
);


Insert into ##test values
(1, '2020-06-03 10:18:43 ', 10.40, 0.52, 0.55),
(1, '2020-06-03 10:20:41 ', 10.30, 0.55, 0.42),
(1, '2020-06-03 10:23:54 ', 10.45, 0.60, 0.58),
(1, '2020-06-03 10:26:26 ', 10.42, 0.52, 0.51),
(1, '2020-06-03 10:28:57 ', 10.48, 0.51, 0.50),
(1, '2020-06-04 10:25:57 ', 10.48, 0.65, 0.64),
(1, '2020-06-04 10:26:57 ', 10.48, 0.80, 0.79),
(2, '2020-06-03 10:18:43 ', 12.10, 1.36, 1.33),
(2, '2020-06-03 10:20:41 ', 12.05, 1.38, 1.35),
(2, '2020-06-03 10:23:54 ', 12.03, 1.33, 1.32),
(2, '2020-06-03 10:26:26 ', 12.13, 1.25, 1.22),
(2, '2020-06-03 10:28:57 ', 12.15, 1.20, 1.19)

CodePudding user response:

I was able to solve this with the way I knew how, and it is faster than the current "better" solution. I would much rather use @T N's answer which is easier to understand and seems way more flexible, but this is query is considerably faster (3.5 mins vs. 8 mins).

For some extra information:

  • There is roughly 8 million records in this table for data from 6/1/2020 to 7/1/2020
  • There were no indexes created yet to help with any of these queries.

Any guidance as to why this is a significantly faster query than @T N's answer?

SELECT Prof.[ConID], Prof.[PurchaseTime], Prof.[Ask], Prof.[MaxPriceAfterPurchase], Prof.[MaxPriceAfterPurchase]-Prof.[Ask] as MaxProfit, MIN(SellTimeMax.TimeLog) as SellAtMaxTime
FROM 
(
  SELECT DISTINCT OT.[ConID], OT.[TimeLog] as [PurchaseTime], OT.[Ask], MAX(MaxBid.[Bid]) as MaxPriceAfterPurchase
  from (
  SELECT * from algotrading.[optiontracking] WHERE TimeLog < '2020-07-01'
  ) OT

  LEFT JOIN (
    SELECT [ConID], [Bid], [TimeLog], CONVERT(date, [TimeLog]) as CombDate from algotrading.[optiontracking] WHERE TimeLog < '2020-07-01'
    ) MaxBid 
    on OT.[ConID] = MaxBid.[ConID] and MaxBid.[CombDate] = CONVERT(date, OT.[TimeLog]) and DATEDIFF(second, OT.[TimeLog], MaxBid.[TimeLog]) > 0

  GROUP BY OT.[ConID], OT.[TimeLog], OT.[Ask]
) Prof
LEFT JOIN 
(
SELECT [ConID], [Bid], [TimeLog], CONVERT(date, [TimeLog]) as CombDate from algotrading.[optiontracking] WHERE TimeLog < '2020-07-01'
) SellTimeMax
ON SellTimeMax.[ConID] = Prof.ConID and SellTimeMax.CombDate = CONVERT(date, Prof.[PurchaseTime]) and SellTimeMax.[Bid] = Prof.[MaxPriceAfterPurchase] and DATEDIFF(second, Prof.[PurchaseTime], SellTimeMax.[TimeLog]) > 0
  • Related