Home > Software design >  Get the most recent and not null row
Get the most recent and not null row

Time:11-15

I have the following table:

id timestamp attribute-1 attribute-2
1 8/12 A
1 8/13 B
2 8/12 A
2 8/13 C B
2 8/14 B
3 8/12 B
3 8/14 C C

And would like to create a new table with the most updated attributes for each id, which means the most recent row for each id as long as it has both att-1 and att-2, if it doesn't I want to take the att from a previous row. should be like that:

id attribute-1 attribute-2
1 B A
2 B B
3 C C
  • The timestamp is a real psql timestamp

CodePudding user response:

You can use FIRST_VALUE() window function:

SELECT DISTINCT id, 
       FIRST_VALUE(attribute1) OVER (PARTITION BY id ORDER BY attribute1 IS NULL, timestamp DESC) attribute1,
       FIRST_VALUE(attribute2) OVER (PARTITION BY id ORDER BY attribute2 IS NULL, timestamp DESC) attribute2
FROM tablename;

See the demo.

CodePudding user response:

I created a table with following data as per your input

CREATE TABLE data (
  "id" INTEGER,
  "timestamp" DATE,
  "attribute1" VARCHAR(1),
  "attribute2" VARCHAR(1)
);

INSERT INTO data
  ("id", "timestamp", "attribute1", "attribute2")
VALUES
  ('1', '2021-08-12', null, 'A'),
  ('1', '2021-08-13', 'B', null),
  ('2', '2021-08-12', null, 'A'),
  ('2', '2021-08-13', 'C', 'B'),
  ('2', '2021-08-14', 'B', null),
  ('3', '2021-08-12', 'B', null),
  ('3', '2021-08-14', 'C', 'C');

I think you can achieve your result by aggregating data and picking the first result in that:

SELECT 
    id, MAX(timestamp) AS timestamp_max,
    (array_remove(array_agg(attribute1 ORDER BY timestamp DESC), NULL))[1] AS attribute1_agg,
    (array_remove(array_agg(attribute2 ORDER BY timestamp DESC), NULL))[1] AS attribute1_agg
FROM data
GROUP BY id
ORDER BY id ASC;

which gives this output:

id timestamp_max attribute1_agg attribute1_agg
1 2021-08-13T00:00:00.000Z B A
2 2021-08-14T00:00:00.000Z B B
3 2021-08-14T00:00:00.000Z C C
  • Related