So I have a table called optiontracking with ConID/TimeLog/Ask/Bid table that looks like the following:
ConID | TimeLog | Ask | Bid |
---|---|---|---|
1 | 2020-06-03 10:18:43 | 0.52 | 0.55 |
1 | 2020-06-03 10:20:41 | 0.55 | 0.42 |
1 | 2020-06-03 10:23:54 | 0.60 | 0.58 |
1 | 2020-06-03 10:26:26 | 0.52 | 0.51 |
1 | 2020-06-03 10:28:57 | 0.51 | 0.50 |
1 | 2020-06-04 10:25:57 | 0.65 | 0.64 |
1 | 2020-06-04 10:26:57 | 0.80 | 0.79 |
2 | 2020-06-03 10:18:43 | 1.36 | 1.33 |
2 | 2020-06-03 10:20:41 | 1.38 | 1.35 |
2 | 2020-06-03 10:23:54 | 1.33 | 1.32 |
2 | 2020-06-03 10:26:26 | 1.25 | 1.22 |
2 | 2020-06-03 10:28:57 | 1.20 | 1.19 |
What I am trying to get is for each ConID/TimeLog/Ask entry in this table, chronologically next Max(Bid), TimeLog of Max(Bid) while matching the same day and ConID.
I expect to repeat this for Max Ask (but I figure that will need 1 subquery for each group).
So far I have been able to extract the max Bid chronologically after each TimeLog/ask in the dataset using the query below. I want to be able to pull the corresponding timestamp where the MAX(Bid) was retrieved from, but I am hitting a wall without using a second subquery (which I am trying to avoid due to data size).
Here is the query I am using to get the current max Bid per each Ask Chronologically:
SELECT DISTINCT OT.[ConID], OT.[TimeLog], OT.[Ask], MAX(MaxBid.[Bid]) as SellPrice from
(
SELECT * from algotrading.[optiontracking] WHERE YEAR([TimeLog]) <= 2020 and Month([TimeLog]) <= 6
) as OT
LEFT JOIN
(
SELECT [ConID], [Bid], [TimeLog], CONVERT(date, [TimeLog]) as CombDate from algotrading.[optiontracking] WHERE YEAR([TimeLog]) <= 2020 and Month([TimeLog]) <= 6
) as MaxBid
on OT.[ConID] = MaxBid.[ConID] and MaxBid.[CombDate] = CONVERT(date, OT.[TimeLog]) and DATEDIFF(second, OT.[TimeLog], MaxBid.[TimeLog]) > 0
GROUP BY OT.[ConID], OT.[TimeLog], OT.[Ask], CONVERT(date, OT.[TimeLog])
I know there is a way to get the corresponding [TimeLog] for each Max([Bid]), but I can't quite seem to figure it out. The Year/Month filtering on each query is because the dataset is too large and I'm trying to do testing first.
Here is what I would expect as an output from the table above (including the TimeLog for each max bid listed as SellPriceTime below):
ConID | TimeLog | Ask | SellPrice | SellPriceTime |
---|---|---|---|---|
1 | 2020-06-03 10:18:43 | 0.52 | 0.58 | 2020-06-03 10:23:54 |
1 | 2020-06-03 10:20:41 | 0.55 | 0.58 | 2020-06-03 10:23:54 |
1 | 2020-06-03 10:23:54 | 0.60 | 0.51 | 2020-06-03 10:26:26 |
1 | 2020-06-03 10:26:26 | 0.52 | 0.50 | 2020-06-03 10:23:54 |
1 | 2020-06-03 10:28:57 | 0.51 | NULL | NULL |
1 | 2020-06-04 10:25:57 | 0.65 | 0.79 | 2020-06-04 10:26:57 |
1 | 2020-06-04 10:26:57 | 0.80 | NULL | NULL |
2 | 2020-06-03 10:18:43 | 1.36 | 1.35 | 2020-06-03 10:20:41 |
2 | 2020-06-03 10:20:41 | 1.38 | 1.32 | 2020-06-03 10:23:54 |
2 | 2020-06-03 10:23:54 | 1.33 | 1.22 | 2020-06-03 10:26:26 |
2 | 2020-06-03 10:26:26 | 1.25 | 1.19 | 2020-06-03 10:28:57 |
2 | 2020-06-03 10:28:57 | 1.20 | NULL | NULL |
CodePudding user response:
You can use an OUTER APPLY(SELECT TOP 1 ...)
to select the max bid row for each ask.
An OUTER APPLY
is like a left join to a subselect, where that subselect can apply WHERE
conditions, ORDER BY
and in this case TOP 1
to get your desired bid. It also allows you to select extract multiple values (Bid
and TimeLog
in this case), which is an advantage over a simple MAX()
function.
I adjusted the "same day" logic to use a condition that checks for TimeLog < start of the next day. The overall date condition in the outer where clause has also been adjusted to be a simple compare. The original logic would have selected the first 6 months in each year 2020 and earlier.
For performance, make sure that you have an index on algotrading.optiontracking(ConID, TimeLog)
.
SELECT OT.ConID, OT.TimeLog, OT.Ask, MaxBid.SellPrice, MaxBid.SellPriceTime
FROM algotrading.optiontracking as OT
OUTER APPLY (
SELECT TOP 1 OT2.Bid AS SellPrice, OT2.TimeLog AS SellPriceTime
FROM algotrading.optiontracking OT2
WHERE OT2.ConID = OT.ConID
AND OT2.TimeLog > OT.TimeLog -- Later
AND OT2.TimeLog < DATEADD(day, 1, CONVERT(DATE, OT.TimeLog)) -- Before start of next day
ORDER BY OT2.Bid DESC -- Max Bid
) MaxBid
WHERE OT.TimeLog < '2020-07-01' -- Is this what was intended?
Results:
ConID | TimeLog | Ask | SellPrice | SellPriceTime |
---|---|---|---|---|
1 | 2020-06-03 10:18:43.000 | 0.52 | 0.58 | 2020-06-03 10:23:54.000 |
1 | 2020-06-03 10:20:41.000 | 0.55 | 0.58 | 2020-06-03 10:23:54.000 |
1 | 2020-06-03 10:23:54.000 | 0.60 | 0.51 | 2020-06-03 10:26:26.000 |
1 | 2020-06-03 10:26:26.000 | 0.52 | 0.50 | 2020-06-03 10:28:57.000 |
1 | 2020-06-03 10:28:57.000 | 0.51 | null | null |
1 | 2020-06-04 10:25:57.000 | 0.65 | 0.79 | 2020-06-04 10:26:57.000 |
1 | 2020-06-04 10:26:57.000 | 0.80 | null | null |
2 | 2020-06-03 10:18:43.000 | 1.36 | 1.35 | 2020-06-03 10:20:41.000 |
2 | 2020-06-03 10:20:41.000 | 1.38 | 1.32 | 2020-06-03 10:23:54.000 |
2 | 2020-06-03 10:23:54.000 | 1.33 | 1.22 | 2020-06-03 10:26:26.000 |
2 | 2020-06-03 10:26:26.000 | 1.25 | 1.19 | 2020-06-03 10:28:57.000 |
2 | 2020-06-03 10:28:57.000 | 1.20 | null | null |
See this db<>fiddle.
CodePudding user response:
I don't fully understand what you're trying to do but this seems like a case for RANK() and PARTITION BY
.
This is not a full answer but with
SELECT ConnId, TimeLog, LastPrice, Ask, Bid,
,Max(LastPrice) OVER (PARTITION BY ConnId) AS MaxLastPrice
,RANK() OVER (PARTITION BY ConnId ORDER BY Bid DESC) AS Rank
from ##Test
you'd get:
ConnId TimeLog LastPrice Ask Bid MaxLastPrice Rank
1 2020-06-04 10:26:57.0000000 10.48 0.80 0.79 10.48 1
1 2020-06-04 10:25:57.0000000 10.48 0.65 0.64 10.48 2
1 2020-06-03 10:23:54.0000000 10.45 0.60 0.58 10.48 3
1 2020-06-03 10:18:43.0000000 10.40 0.52 0.55 10.48 4
1 2020-06-03 10:26:26.0000000 10.42 0.52 0.51 10.48 5
1 2020-06-03 10:28:57.0000000 10.48 0.51 0.50 10.48 6
1 2020-06-03 10:20:41.0000000 10.30 0.55 0.42 10.48 7
2 2020-06-03 10:20:41.0000000 12.05 1.38 1.35 12.15 1
2 2020-06-03 10:18:43.0000000 12.10 1.36 1.33 12.15 2
2 2020-06-03 10:23:54.0000000 12.03 1.33 1.32 12.15 3
2 2020-06-03 10:26:26.0000000 12.13 1.25 1.22 12.15 4
2 2020-06-03 10:28:57.0000000 12.15 1.20 1.19 12.15 5
You can now order by whichever column you wish and pick the records with rank 1, for example.
Here's my test data:
CREATE TABLE ##test (
ConnId INT,
TimeLog DATETIME2,
LastPrice Numeric(5,2),
Ask Numeric(5,2),
Bid Numeric(5,2)
);
Insert into ##test values
(1, '2020-06-03 10:18:43 ', 10.40, 0.52, 0.55),
(1, '2020-06-03 10:20:41 ', 10.30, 0.55, 0.42),
(1, '2020-06-03 10:23:54 ', 10.45, 0.60, 0.58),
(1, '2020-06-03 10:26:26 ', 10.42, 0.52, 0.51),
(1, '2020-06-03 10:28:57 ', 10.48, 0.51, 0.50),
(1, '2020-06-04 10:25:57 ', 10.48, 0.65, 0.64),
(1, '2020-06-04 10:26:57 ', 10.48, 0.80, 0.79),
(2, '2020-06-03 10:18:43 ', 12.10, 1.36, 1.33),
(2, '2020-06-03 10:20:41 ', 12.05, 1.38, 1.35),
(2, '2020-06-03 10:23:54 ', 12.03, 1.33, 1.32),
(2, '2020-06-03 10:26:26 ', 12.13, 1.25, 1.22),
(2, '2020-06-03 10:28:57 ', 12.15, 1.20, 1.19)
CodePudding user response:
I was able to solve this with the way I knew how, and it is faster than the current "better" solution. I would much rather use @T N's answer which is easier to understand and seems way more flexible, but this is query is considerably faster (3.5 mins vs. 8 mins).
For some extra information:
- There is roughly 8 million records in this table for data from 6/1/2020 to 7/1/2020
- There were no indexes created yet to help with any of these queries.
Any guidance as to why this is a significantly faster query than @T N's answer?
SELECT Prof.[ConID], Prof.[PurchaseTime], Prof.[Ask], Prof.[MaxPriceAfterPurchase], Prof.[MaxPriceAfterPurchase]-Prof.[Ask] as MaxProfit, MIN(SellTimeMax.TimeLog) as SellAtMaxTime
FROM
(
SELECT DISTINCT OT.[ConID], OT.[TimeLog] as [PurchaseTime], OT.[Ask], MAX(MaxBid.[Bid]) as MaxPriceAfterPurchase
from (
SELECT * from algotrading.[optiontracking] WHERE TimeLog < '2020-07-01'
) OT
LEFT JOIN (
SELECT [ConID], [Bid], [TimeLog], CONVERT(date, [TimeLog]) as CombDate from algotrading.[optiontracking] WHERE TimeLog < '2020-07-01'
) MaxBid
on OT.[ConID] = MaxBid.[ConID] and MaxBid.[CombDate] = CONVERT(date, OT.[TimeLog]) and DATEDIFF(second, OT.[TimeLog], MaxBid.[TimeLog]) > 0
GROUP BY OT.[ConID], OT.[TimeLog], OT.[Ask]
) Prof
LEFT JOIN
(
SELECT [ConID], [Bid], [TimeLog], CONVERT(date, [TimeLog]) as CombDate from algotrading.[optiontracking] WHERE TimeLog < '2020-07-01'
) SellTimeMax
ON SellTimeMax.[ConID] = Prof.ConID and SellTimeMax.CombDate = CONVERT(date, Prof.[PurchaseTime]) and SellTimeMax.[Bid] = Prof.[MaxPriceAfterPurchase] and DATEDIFF(second, Prof.[PurchaseTime], SellTimeMax.[TimeLog]) > 0