Get the most recent and not null row-CodePudding

I have the following table:

id	timestamp	attribute-1	attribute-2
1	8/12		A
1	8/13	B
2	8/12		A
2	8/13	C	B
2	8/14	B
3	8/12	B
3	8/14	C	C

And would like to create a new table with the most updated attributes for each id, which means the most recent row for each id as long as it has both att-1 and att-2, if it doesn't I want to take the att from a previous row. should be like that:

id	attribute-1	attribute-2
1	B	A
2	B	B
3	C	C

The timestamp is a real psql timestamp

CodePudding user response：

You can use FIRST_VALUE() window function:

SELECT DISTINCT id, 
       FIRST_VALUE(attribute1) OVER (PARTITION BY id ORDER BY attribute1 IS NULL, timestamp DESC) attribute1,
       FIRST_VALUE(attribute2) OVER (PARTITION BY id ORDER BY attribute2 IS NULL, timestamp DESC) attribute2
FROM tablename;

See the demo.

CodePudding user response：

I created a table with following data as per your input

CREATE TABLE data (
  "id" INTEGER,
  "timestamp" DATE,
  "attribute1" VARCHAR(1),
  "attribute2" VARCHAR(1)
);

INSERT INTO data
  ("id", "timestamp", "attribute1", "attribute2")
VALUES
  ('1', '2021-08-12', null, 'A'),
  ('1', '2021-08-13', 'B', null),
  ('2', '2021-08-12', null, 'A'),
  ('2', '2021-08-13', 'C', 'B'),
  ('2', '2021-08-14', 'B', null),
  ('3', '2021-08-12', 'B', null),
  ('3', '2021-08-14', 'C', 'C');

I think you can achieve your result by aggregating data and picking the first result in that:

SELECT 
    id, MAX(timestamp) AS timestamp_max,
    (array_remove(array_agg(attribute1 ORDER BY timestamp DESC), NULL))[1] AS attribute1_agg,
    (array_remove(array_agg(attribute2 ORDER BY timestamp DESC), NULL))[1] AS attribute1_agg
FROM data
GROUP BY id
ORDER BY id ASC;

which gives this output:

id	timestamp_max	attribute1_agg	attribute1_agg
1	2021-08-13T00:00:00.000Z	B	A
2	2021-08-14T00:00:00.000Z	B	B
3	2021-08-14T00:00:00.000Z	C	C