I have a SQLite
table of security prices that have timestamps from when each price was recorded. I'm trying to write a query were I can pass an arbitrary price_datetime
input, say 2022-08-10 19:000:00.000000
, and pull the latest price
for each security_id
that is <=
the inputed price_datetime
.
Below is a sample table:
┌─────────────┬───────┬────────────────────────────┐
│ security_id │ price │ price_datetime │
├─────────────┼───────┼────────────────────────────┤
│ 4 │ 90.0 │ 2022-08-11 20:00:00.000000 │
│ 1 │ 100.0 │ 2022-08-10 20:00:00.000000 │
│ 1 │ 90.0 │ 2022-08-10 19:00:00.000000 │
│ 2 │ 95.0 │ 2022-08-10 18:00:00.000000 │
│ 2 │ 90.0 │ 2022-08-09 20:00:00.000000 │
│ 1 │ 90.0 │ 2022-08-09 20:00:00.000000 │
│ 3 │ 90.0 │ 2022-08-08 20:00:00.000000 │
└─────────────┴───────┴────────────────────────────┘
The input 2022-08-10 19:000:00.000000
would return below:
┌─────────────┬───────┬────────────────────────────┐
│ security_id │ price │ price_datetime │
├─────────────┼───────┼────────────────────────────┤
│ 4 │ null │ null │
│ 1 │ 90.0 │ 2022-08-10 19:00:00.000000 │
│ 2 │ 95.0 │ 2022-08-10 18:00:00.000000 │
│ 3 │ 90.0 │ 2022-08-08 20:00:00.000000 │
└─────────────┴───────┴────────────────────────────┘
security_id
4 returns null
since there is no price record with a price_datetime
<= 2022-08-10 19:000:00.000000
. The query should also have the flexibility to return the last 1, 2, etc. records <=
the inputted price_datetime
in addition to just the last.
Code to generate the sample table:
CREATE TABLE prices (
security_id INTEGER,
price FLOAT NOT NULL,
price_datetime DATETIME NOT NULL
);
CREATE INDEX ix_prices_price_datetime on "prices" (price_datetime);
INSERT INTO prices VALUES(1,100,'2022-08-10 20:00:00.000000'),(1,90,'2022-08-10 19:00:00.000000'),(1,90,'2022-08-09 20:00:00.000000'),(2,90,'2022-08-09 20:00:00.000000'),(2,95,'2022-08-10 18:00:00.000000'),(3,90,'2022-08-08 20:00:00.000000'),(4,90,'2022-08-11 20:00:00.000000');
SELECT * FROM prices ORDER BY price_datetime DESC;
CodePudding user response:
WITH
query_input(price_datetime, row_count) AS (
VALUES (
'2022-08-10 19:000:00.000000', -- @price_datetime,
2 -- @row_count
)
),
-- Remove this CTE when running against your db containing the actual "prices" table.
prices(security_id, price, price_datetime) AS (
VALUES
(1, 99, '2022-08-10 20:00:00.000000'),
(1, 90, '2022-08-10 19:00:00.000000'),
(1, 90, '2022-08-09 20:00:00.000000'),
(2, 90, '2022-08-09 20:00:00.000000'),
(2, 95, '2022-08-10 18:00:00.000000'),
(3, 90, '2022-08-08 20:00:00.000000'),
(4, 90, '2022-08-11 20:00:00.000000')
),
-- As you requested to show every security, you need to collect all id's.
-- I assume you define table "securities" containing column "id", so that
-- prices.security_id => securities.id. Remove the following query when
-- running against your db and adjust the "ids" CTE below if necessary.
securities(id) AS (VALUES (1), (2), (3), (4)),
ids(id) AS (
-- Adjust "id" and "securities" to match your schema
SELECT id FROM securities
),
filtered_prices AS (
SELECT prices.*
FROM prices, query_input
WHERE prices.price_datetime <= query_input.price_datetime
),
label_prices AS (
SELECT
fp.*,
row_number() OVER (PARTITION BY fp.security_id ORDER BY fp.price_datetime DESC) AS group_id
FROM filtered_prices AS fp
),
last_n_existing_prices AS (
SELECT lp.*
FROM label_prices AS lp, query_input AS qi
WHERE lp.group_id <= qi.row_count
),
last_n_prices AS (
SELECT ids.id AS security_id, lp.price, lp.price_datetime
FROM ids
LEFT JOIN last_n_existing_prices AS lp
ON ids.id = lp.security_id
ORDER BY ids.id, lp.price_datetime DESC
)
SELECT * FROM last_n_prices;
CodePudding user response:
You need a LEFT
join of the distinct security_id
s to the table.
For the last price of each security_id
the simplest way to get it is with aggregation, using SQLite's feature of bare columns:
SELECT s.security_id, p.price, MAX(p.price_datetime) price_datetime
FROM (SELECT DISTINCT security_id FROM prices) s
LEFT JOIN prices p ON p.security_id = s.security_id AND p.price_datetime <= '2022-08-10 19:000:00.000000'
GROUP BY s.security_id;
For the last n rows use ROW_NUMBER()
window function:
WITH cte AS (
SELECT s.security_id, p.price, p.price_datetime,
ROW_NUMBER() OVER (PARTITION BY s.security_id ORDER BY p.price_datetime DESC) rn
FROM (SELECT DISTINCT security_id FROM prices) s
LEFT JOIN prices p ON p.security_id = s.security_id AND p.price_datetime <= '2022-08-10 19:000:00.000000'
)
SELECT security_id, price, price_datetime
FROM cte
WHERE rn <= n; -- replace n with the number that you want
See the demo.