I have two tables that I'd like do a full outer join
where the resulting view separates the values table into two separate columns with one row for each name_id
. I have made one approach with a CASE
expression to select by type and then use it with pandas to fill in the values and return distinct name_ids.
Name Table
name_id | name |
---|---|
1 | foo |
2 | bar |
3 | doo |
4 | sue |
Values Table
name_id | value | type |
---|---|---|
1 | 90 | red |
2 | 95 | blue |
3 | 33 | red |
3 | 35 | blue |
4 | 60 | blue |
4 | 20 | red |
This is a condensed version. In my full table, I need to do this twice with two separate value tables sorted by type, red/blue and control/placebo.
Simple Join
SELECT names_table.name_id, name, value, type
FULL OUTER JOIN values_table
ON names_table.name_id = values_table.name_id
WHERE type in ('red', 'blue')
name_id | name | value | type |
---|---|---|---|
1 | foo | 90 | red |
2 | bar | 95 | blue |
3 | doo | 33 | red |
3 | doo | 35 | blue |
4 | sue | 60 | blue |
4 | sue | 20 | red |
Current work around result which I then fix with python and pandas
SELECT names_table.name_id, name, value, type
CASE
WHEN type = 'red' THEN value END red,
CASE
WHEN type = 'blue' THEN value END blue
FROM names_table
FULL OUTER JOIN values_table
ON names_table.name_id = values_table.name_id
name_id | name | blue | red |
---|---|---|---|
1 | foo | Null | 90 |
2 | bar | 95 | Null |
3 | doo | 35 | Null |
3 | doo | Null | 33 |
4 | sue | 60 | Null |
4 | sue | Null | 20 |
This is my desired output below, where I would have the types as columns and just rows for unique name_ids but with value tables 1 and 2.
Desired Output
name_id | name | blue | red |
---|---|---|---|
1 | foo | Null | 90 |
2 | bar | 95 | Null |
3 | doo | 35 | 33 |
4 | sue | 60 | 20 |
CodePudding user response:
I have two tables that I'd like do a
full outer join
...
Why would you? Better explain what you actually want to do instead of the assumed tool to implement it.
Simple pivoting with the aggregate FILTER
clause. See:
SELECT name_id, n.name, v.blue, v.red
FROM (
SELECT name_id
, min(value) FILTER (WHERE type = 'blue') AS blue
, min(value) FILTER (WHERE type = 'red') AS red
FROM values_table
GROUP BY 1
) v
LEFT JOIN names_table n USING (name_id);
Produces your desired result.
db<>fiddle here
The LEFT JOIN
includes result rows even if no name is found.
A FULL [OUTER] JOIN
would add names in the result that have no values at all. I think you really want a LEFT [OUTER] JOIN
or even a plain [INNER] JOIN
.
You can just switch the JOIN
type to adapt to your actual requirements. The identical column name "name_id" allows to join with a USING
clause. The unqualified name_id
in the outer SELECT
works for any join type.
Note how I aggregate first and join later. Typically substantially faster. See:
If there can be duplicate values for "red" or "blue", you'll have to define how to deal with those.
For more involved queries consider crosstab()
. See: