Customer table:
customer_id | customer_name |
---|---|
101 | Alice |
102 | Bob |
103 | Charlie |
Orders table:
order_id | sale_date | order_cost | customer_id | seller_id |
---|---|---|---|---|
1 | 2020-03-01 | 1500 | 101 | 1 |
2 | 2020-05-25 | 2400 | 102 | 2 |
3 | 2019-05-25 | 800 | 101 | 3 |
4 | 2020-09-13 | 1000 | 103 | 2 |
5 | 2019-02-11 | 700 | 101 | 2 |
Seller table:
seller_id | seller_name |
---|---|
1 | Daniel |
2 | Elizabeth |
3 | Frank |
Write an SQL query to report the names of all sellers who did not make any sales in 2020
This question was asked to me in a data analyst interview and I provided the query below. But my query is partially incorrect because it also gives those sellers who made sales in another year along with 2020 like Elizabeth because she made sales in 2019 and 2020. So I want a query in SQL which only gives the name of all sellers who made sales in 2020 only.
select s.seller_name
from seller_table s
JOIN order_table o
ON s.seller_id = o.seller_id
where year(sale_date) <> 2020;
CodePudding user response:
switch to a not in or not exists.
select seller_name from seller where id not in
(select seller_id from order_table where year(sale_date) = 2020)
CodePudding user response:
Your query selects all sellers that have an order in a year that is not 2020. This is not what you want.
You want to select sellers for which not exists an order in year 2020.
select s.seller_name
from seller_table s
where not exists
(
select null
from order_table o
where o.seller_id = s.seller_id
and year(o.sale_date) = 2020
);
There are other ways to achieve the same, but this is the most straight-forward.
CodePudding user response:
Two equivalent approaches:
select s.seller_name
from seller_table s
where not exists (select 1 from order_table where o.seller_id=s.seller_id and year(sale_date)=2020)
and
select s.seller_name
from seller_table s
left join order_table o on o.seller_id=s.seller_id and year(o.sale_date)=2020
where o.seller_id is null
The former expresses the intent of the query better; the latter better represents how the query will actually be executed.