Home > Net >  PostgreSQL CROSS JOIN but do not fill in "missing" values
PostgreSQL CROSS JOIN but do not fill in "missing" values

Time:03-23

Is there a Postgres join command that joins every single row from the left table to each individual row in the right table, but doesn't fill in "missing values"? Something that feels kind of like a cross between a LEFT JOIN and a CROSS JOIN...

Here's a minimum working example since it's a bit challenging to explain:

DROP TABLE temp_conditions;  -- if required
DROP TABLE temp_names;  -- if required

CREATE TABLE temp_conditions (
    tc_id serial,
    well_id int,
    condition_id int,
    raw_value varchar
);

CREATE TABLE temp_names (
    tn_id serial,
    name varchar
);

INSERT INTO
    temp_conditions (well_id, condition_id, raw_value)
VALUES
    (100, 1, '11'), -- well 100
    (100, 2, '22'), -- well 100
    (200, 1, '33'),
    (200, 2, '44'),
    (300, 3, '55'),
    (300, 4, '66');

INSERT INTO
    temp_names (name)
VALUES
    ('cond_a'),
    ('cond_b'),
    ('cond_c'),
    ('cond_d');

-- CROSS JOIN --

SELECT *
FROM temp_names
CROSS JOIN temp_conditions;

-- LEFT JOIN --

SELECT *
FROM temp_names tn
LEFT JOIN temp_conditions tc on tc.tc_id = tn.tn_id
ORDER BY well_id;

You can see from the code snippet that "well 100" has 11 units of condition 1 (aka cond_a) and 22 units of condition 2 (aka cond_b). However, in the CROSS JOIN snippet below, you can see that well 100 is showing the same raw values for every experimental condition (1, 2, 3 and 4).

CROSS JOIN (TRUNCATED)

tn_id |  name  | tc_id | well_id | condition_id | raw_value 
------- -------- ------- --------- -------------- -----------
     1 | cond_a |     1 |     100 |            1 | 11
     1 | cond_a |     2 |     100 |            2 | 22
     2 | cond_b |     1 |     100 |            1 | 11
     2 | cond_b |     2 |     100 |            2 | 22
     3 | cond_c |     1 |     100 |            1 | 11
     3 | cond_c |     2 |     100 |            2 | 22
     4 | cond_d |     1 |     100 |            1 | 11
     4 | cond_d |     2 |     100 |            2 | 22

And in the LEFT JOIN, not every condition has a record for well 100:

LEFT JOIN (TRUNCATED)

tn_id |  name  | tc_id | well_id | condition_id | raw_value 
------- -------- ------- --------- -------------- -----------
     1 | cond_a |     1 |     100 |            1 | 11
     2 | cond_b |     2 |     100 |            2 | 22

What I'm instead hoping to achieve is:

tn_id  |  name  | tc_id | well_id | condition_id | raw_value 
-------- -------- ------- --------- -------------- -----------
      1 | cond_a |     1 |     100 |            1 | 11
      2 | cond_b |     2 |     100 |            2 | 22
      3 | cond_c |  null |     100 |         null | null
      4 | cond_d |  null |     100 |         null | null

CodePudding user response:

I was able to resolve this using a CTE, but am leaving it open in case there are any other answers.

WITH cross_join_sq AS (
    SELECT DISTINCT tn.tn_id, tc1.well_id
    FROM temp_conditions tc1
    CROSS JOIN temp_names tn)
SELECT cjs.well_id, cjs.tn_id, tc.raw_value
FROM cross_join_sq cjs
LEFT JOIN temp_conditions tc ON tc.condition_id = cjs.tn_id
AND tc.well_id = cjs.well_id
ORDER BY cjs.well_id;
 well_id | tn_id | raw_value 
--------- ------- -----------
     100 |     1 | 11
     100 |     2 | 22
     100 |     3 | 
     100 |     4 | 
     200 |     1 | 33
     200 |     2 | 44
     200 |     3 | 
     200 |     4 | 
     300 |     1 | 
     300 |     2 | 
     300 |     3 | 55
     300 |     4 | 66
(12 rows)

CodePudding user response:

I think you just want to add a condition to the ON of the LEFT JOIN. The rows filtered out by this condition still included as NULL-extended rows.

SELECT *                 
FROM temp_names tn
LEFT JOIN temp_conditions tc on tc.tc_id = tn.tn_id and well_id=100
ORDER BY well_id;
 tn_id |  name  | tc_id  | well_id | condition_id | raw_value 
------- -------- -------- --------- -------------- -----------
     1 | cond_a |      1 |     100 |            1 | 11
     2 | cond_b |      2 |     100 |            2 | 22
     3 | cond_c | (null) |  (null) |       (null) | (null)
     4 | cond_d | (null) |  (null) |       (null) | (null)

The well_id is NULL, not 100. You could fix it with a coalesce if you wanted, but why select it at all if you already specified what it must be?

  • Related