I have a table in a PostgreSQL database with data like:
id customer_id item value timestamp
1 001 price 1000 11/1/2021
2 001 price 1500 11/2/2021
3 001 condition good 11/3/2021
4 002 condition bad 11/4/2021
5 002 condition good 11/5/2021
6 002 price 1000 11/6/2021
7 001 condition good 11/7/2021
8 001 price 1400 11/8/2021
9 002 price 1500 11/9/2021
10 001 condition ok 11/10/2021
11 002 price 1600 11/11/2021
12 002 price 1550 11/12/2021
From this table, I want to query the latest value by date and convert it to a table as below.
customer_id price condition
001 1400 ok
002 1550 good
To get this kind of table I tried with this below query, but it did not work well when there are too much data. (Operation like Min and Max to text and number)
I tested this in pgAdmin 4:
SELECT customer_id,
MAX (Case WHEN item='price' THEN value END) price,
MAX (Case WHEN item='condition' THEN value END) condition
FROM table_name GROUP BY customer_id
I want to query the value by the latest date updated data.
CodePudding user response:
Your relational design might be improved. It's an anti-pattern to mix different types of data in the same column.
While stuck with the given setup, two subqueries with DISTINCT ON
and a FULL OUTER JOIN
do the job:
SELECT customer_id, p.value AS price, c.value AS condition
FROM (
SELECT DISTINCT ON (customer_id)
customer_id, value
FROM tbl
WHERE item = 'condition'
ORDER BY customer_id, timestamp DESC
) c
FULL JOIN (
SELECT DISTINCT ON (customer_id)
customer_id, value
FROM tbl
WHERE item = 'price'
ORDER BY customer_id, timestamp DESC
) p USING (customer_id)
db<>fiddle here
See:
This assumes timestamp
to be defined NOT NULL
, or you'll want NULLS LAST
.
Depending on undisclosed cardinalities and value distribution, there may be (much) faster query variants.
If there is a customer
table with distinct customer_id
, (much) faster query styles become possible.
These partial, multicolumn indexes would be perfect to make it fast in any case:
CREATE INDEX tbl_condition_special_idx ON tbl (customer_id, timestamp DESC, value) WHERE item = 'condition';
CREATE INDEX tbl_price_special_idx ON tbl (customer_id, timestamp DESC, value) WHERE item = 'price';
See:
CodePudding user response:
SELECT X.CUSTOMER_ID,X.PRICE,X.CONDITION
FROM
(
SELECT A.CUSTOMER_ID,A.PRICE,A.CONDITION,
ROW_NUMBER()OVER(PARTITION BY A.CUSTOMER_ID ORDER BY A.TIMESTAMP DESC)XCOL
FROM YOUR_TABLE A
)X WHERE X.XCOL=1
Could you please try if the above is suitable for you