I am trying to outer join multiple time series tables in PostgreSQL on multiple conditions - which include the date column and several other identifier columns.
However the tables do not have continuous time series i.e. some dates are missing for some of the join conditions. Furthermore I don't want "duplicate" table specific new columns to be added for a row when there is not match
I have tried COALESCE() on the dates to fill in missing dates, which is fine. However it is the subsequent joins that are causing me problems. I also can't assume that one of the tables will have rows for all the dates required.
I thought perhaps to use generate series for a date range, with empty columns (? if possible) and then join all the tables on to that?
Please see example below:
I want to join Table A and Table B on columns date, identifier_1 and identifier_2 as an outer join. However where a value is not matched I do not want new columns to be added e.g. table_b_identifier_1.
Table A
id1 and id2 are missing rows on the 03/07 and 04/07, and id1 is also missing a row for the 05/07.
Table B
id2 is missing a row on the 02/07
Desired Output:
Essentially it is a conditional join. If there is a row in both tables for identifier_1 and
CodePudding user response:
It's not clear what is wrong with your attempt to use COALESCE
to fill columns data, but it works well as intended in such a query
SELECT
COALESCE(a.date, b.date) AS date,
COALESCE(a.identifier_1, b.identifier_1) AS identifier_1,
COALESCE(a.identifier_2, b.identifier_2) AS identifier_2,
a.value_a,
b.value_b
FROM table_a a
FULL JOIN table_b b ON a.date = b.date
AND a.identifier_1 = b.identifier_1
AND a.identifier_2 = b.identifier_2
Please, check a demo