Today I've faced some unexplainable (for me) behavior in PostgreSQL — LEFT OUTER JOIN
does not return records for main table (with nulls for joined one fields) in case the joined table fields are used in WHERE
expression.
To make it easier to grasp the case details, I'll provide an example. So, let's say we have 2 tables: item
with some goods, and price
, referring item
, with prices for the goods in different years:
CREATE TABLE item(
id INTEGER PRIMARY KEY,
name VARCHAR(50)
);
CREATE TABLE price(
id INTEGER PRIMARY KEY,
item_id INTEGER NOT NULL,
year INTEGER NOT NULL,
value INTEGER NOT NULL,
CONSTRAINT goods_fk FOREIGN KEY (item_id) REFERENCES item(id)
);
The table item
has 2 records (TV set and VCR items), and the table price
has 3 records, a price for TV set in years 2000 and 2010, and a price for VCR for year 2000 only:
INSERT INTO item(id, name)
VALUES
(1, 'TV set'),
(2, 'VCR');
INSERT INTO price(id, item_id, year, value)
VALUES
(1, 1, 2000, 290),
(2, 1, 2010, 270),
(3, 2, 2000, 770);
-- no price of VCR for 2010
Now let's make a LEFT OUTER JOIN
query, to get prices for all items for year 2010:
SELECT
i.*,
p.year,
p.value
FROM item i
LEFT OUTER JOIN price p ON i.id = p.item_id
WHERE p.year = 2010 OR p.year IS NULL;
For some reason, this query will return a results only for TV set, which has a price for this year. Record for VCR is absent in results:
id | name | year | value
---- -------- ------ -------
1 | TV set | 2010 | 270
(1 row)
After some experimenting, I've found a way to make the query to return results I need (all records for item
table, with nulls in the fields of joined table in case there are no mathing records for the year. It was achieved by moving year filtering into a JOIN
condition:
SELECT
i.*,
p.year,
p.value
FROM item i
LEFT OUTER JOIN (
SELECT * FROM price
WHERE year = 2010 -- <= here I filter a year
) p ON i.id = p.item_id;
And now the result is:
id | name | year | value
---- -------- ------ -------
1 | TV set | 2010 | 270
2 | VCR | |
(2 rows)
My main question is — why the first query (with year filtering in WHERE
) does not work as expected, and turns instead into something like INNER JOIN
?
I'm severely blocked by this issue on my current project, so I'll be thankful about tips/hints on the next related questions too:
- Are there any other options to achieve the proper results?
- ... especially — easily translatable to Django's ORM queryset?
Upd: @astentx suggested to move filtering condition directly into JOIN
(and it works too):
SELECT
i.*,
p.year,
p.value
FROM item i
LEFT OUTER JOIN price p
ON
i.id = p.item_id
AND p.year = 2010;
Though, the same as my first solution, I don't see how to express it in terms of Django ORM querysets. Are there any other suggestions?
CodePudding user response:
The first query does not work as expected because expectation is wrong. It does not work as INNER JOIN as well. The query returns a record for VCR only if there is no price for VCR at all.
CodePudding user response:
SELECT
i.*,
y.year,
p.value
FROM item i
CROSS JOIN (SELECT 2010 AS year) y -- here could be a table
LEFT OUTER JOIN price p
ON (p.item_id = i.id
AND p.year = y.year);