I have a table with telemetry and timestamps from a couple of devices (aka time series table) in PostgreSQL. (The table holds about 1M records.) I want to select the latest value of any parameter for any sensor (location_id
) with a single query.
My sample schema and data:
CREATE TABLE telemetry (
location_id VARCHAR(255) NOT NULL,
param VARCHAR(32) NOT NULL,
value NUMERIC NOT NULL,
timestamp TIMESTAMP NOT NULL DEFAULT NOW()
);
INSERT INTO telemetry (location_id, param, value, timestamp) VALUES
('001', 'temperature', 0, '2003-04-12 04:05:06'),
('001', 'humidity', 12, '2003-04-12 04:05:06'),
('002', 'temperature', 10, '2003-04-12 04:05:06'),
('002', 'humidity', 13, '2003-04-12 04:05:06'),
('001', 'temperature', 20, '2003-04-12 04:05:08'),
('001', 'humidity', 14, '2003-04-12 04:05:08'),
('002', 'temperature', 30, '2003-04-12 04:05:08')
My expected output:
location_id | temperature | humidity
001 | 20 | 14
002 | 30 | 13
I've played a lot with SELECT DISTINCT ON
, but after all I'm not sure if it's even possible.
CodePudding user response:
Determine the latest values first using ROW_NUMBER
combined by PARTITION BY
and ORDER BY
, use it to filter and then pivot the data:
WITH latest_values AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION by location_id, param ORDER BY timestamp DESC) row_num
FROM telemetry)
SELECT
location_id,
MAX(CASE WHEN param = 'temperature' THEN value END) AS temperature,
MAX(CASE WHEN param = 'humidity' THEN value END) AS humidity
FROM latest_values
WHERE row_num = 1
GROUP BY location_id