I have two simple tables:
Table #1 apples_consumption
report_date | apples_consumed |
---|---|
2022-01-01 | 5 |
2022-02-01 | 7 |
2022-03-01 | 2 |
Table #2 hotel_visitors
visitor_id | check_in_date | check_out_date |
---|---|---|
1 | 2021-12-01 | 2022-02-01 |
2 | 2022-01-01 | NULL |
3 | 2022-02-01 | NULL |
4 | 2022-03-01 | NULL |
My purpose is to get a table which shows the ratio between number of visitors in the hotel to number of apples consumed by that time.
For the example above the desired query output should look like this:
report_date | visitors_count | apples_consumed |
---|---|---|
2022-01-01 | 2 -->(visitors #1, #2) | 5 |
2022-02-01 | 3 -->(visitors #1, #2, #3) | 7 |
2022-03-01 | 3 -->(visitors #2, #3, #4) | 2 |
If I were to write a solution to this task using code I would go over each report_date
from the apples_consumption
table and count how many visitors have a lower/equal check_in_date
than that report_date
and also have a check_out_date
= NULL or check_out_date
greater/equal than that report_date
I came up with this query:
select
ac.report_date,
ac.apples_consumed,
(
select count(*)
from hotel_visitors hv
where
hv.check_in_date <= ac.report_date and
(hv.check_out_date is null or hv.check_out_date >= ac.report_date
) as visitors_count
from
apples_consumptions ac
order by
ac.report_date
The query above works but it is very inefficient (I can see its relatively long execution time for larger datasets and by the way its written [it runs the inner count(*) query for as many rows as the outer apples_consuptions
table has)
I am looking for a more efficient way to achieve this result and your help will be highly appreciated!
CodePudding user response:
It is very rarely a good idea to put a subselect in your select list.
Join your tables and then use an aggregate count:
select a.report_date, count(v.visitor_id) as visitors_count, a.apples_consumed
from apples_consumption a
left join hotel_visitors v
on a.report_date
between v.check_in_date
and coalesce(v.check_out_date, '9999-12-31')
group by a.report_date, a.apples_consumed
order by a.report_date;
db<>fiddle here