How to expand date ranges by taking into account overlapping days in PostgreSQL?-CodePudding

Given the following table created in a PostgreSQL database using the following SQL statements:

CREATE TABLE reservations (
    reservation_id INT PRIMARY KEY,
    check_in_date DATE NOT NULL,
    check_out_date DATE NOT NULL
);

INSERT INTO reservations VALUES
    (1,  '2021-01-04', '2021-01-05'),
    (2,  '2021-01-05', '2021-01-07'),
    (3,  '2021-01-06', '2021-01-07'),
    (4,  '2021-01-06', '2021-01-08'),
    (5,  '2021-01-08', '2021-01-09'),
    (6,  '2021-01-08', '2021-01-09'),
    (7,  '2021-01-08', '2021-01-10'),
    (8,  '2021-01-13', '2021-01-17'),
    (9,  '2021-01-13', '2021-01-14'),
    (10, '2021-01-14', '2021-01-15'),
    (11, '2021-01-15', '2021-01-16'),
    (12, '2021-01-16', '2021-01-17');

The table represents the list of reservations of a Bed&Breakfast with check-in date and check-out date.

It is requested to retrieve how many rooms should be set up and available for each week. In order to reduce maintenance expenses, it is required to minimise the number of offered rooms. Additionally a ,room can always be made available the same day as the check-out date. For example, if a room is checked out on the 2021-01-08 and another one is checked in on the same day, the count of rooms that needs to available for the 2021-01-08 will be only 1, and not 2.

The result should be the following:

week	number_of_rooms
1	3
2	2

My approach is the following:

WITH reservations_with_expanded_dates AS (
    SELECT
        reservation_id,
        Generate_series(check_in_date::DATE,
            check_out_date::DATE,
            '1day')::DATE AS stay
    FROM
        reservations
)
SELECT
    week,
    max(number_of_rooms)
FROM (
    SELECT
        Date_part('week', stay) AS week,
        stay,
        count(*) AS number_of_rooms
    FROM
        reservations_with_expanded_dates
    GROUP BY
        stay) AS reservations_per_day
GROUP BY
    week

Which gives the following result:

week	number_of_rooms
1	4
2	3

The result is not correct because my approach doesn't take into account the fact that a room can always be made available the same day as the check-out date. How can I improve my query to take this into account?

Moreover, is there a better way in terms of performance to achieve what I am trying to do?

CodePudding user response：

You can do:

select week, max(rooms_used) as max_rooms_used
from (
  select date_part('week', d) as week, count(*) as rooms_used
  from (
    select *, 
      generate_series(check_in_date,
                      check_out_date - 1,
                      interval '1 day')::date as d
    from reservations
  ) x
  group by d
) y
group by week

Result:

week  max_rooms_used
----  --------------
1     3
2     2

See running example at db<>fiddle.

CodePudding user response：

Something like this would work, it shows you per day how many rooms you need and for what reservation the room is. It's just an example that could be extended with late-check-out, different types of rooms, etc. etc.

CREATE TABLE reservations (
    reservation_id INT PRIMARY KEY,
    check_in_out tsrange
);

INSERT INTO reservations VALUES
    (1,  tsrange('2021-01-04 15:00', '2021-01-05 11:00','[)')),
    (2,  tsrange('2021-01-05 15:00', '2021-01-07 11:00','[)')),
    (3,  tsrange('2021-01-06 15:00', '2021-01-07 11:00','[)')),
    (4,  tsrange('2021-01-06 15:00', '2021-01-08 11:00','[)')),
    (5,  tsrange('2021-01-08 15:00', '2021-01-09 11:00','[)')),
    (6,  tsrange('2021-01-08 15:00', '2021-01-09 11:00','[)')),
    (7,  tsrange('2021-01-08 15:00', '2021-01-10 11:00','[)')),
    (8,  tsrange('2021-01-13 15:00', '2021-01-17 11:00','[)')),
    (9,  tsrange('2021-01-13 15:00', '2021-01-14 11:00','[)')),
    (10, tsrange('2021-01-14 15:00', '2021-01-15 11:00','[)')),
    (11, tsrange('2021-01-15 15:00', '2021-01-16 11:00','[)')),
    (12, tsrange('2021-01-16 15:00', '2021-01-17 11:00','[)'));
    
    
SELECT  s::date
    ,   EXTRACT(WEEK FROM s)
    ,   COUNT(DISTINCT reservation_id)
    ,   ARRAY_AGG(DISTINCT reservation_id ORDER BY reservation_id)
FROM    reservations
        RIGHT JOIN  (SELECT s FROM generate_series('2021-01-01 15:00'::timestamp, '2021-01-21'::timestamp, INTERVAL '1 DAY') g(s)) s ON s <@ check_in_out
GROUP BY 1,2
ORDER BY 1;

A late checkout will change the number of rooms needed:

UPDATE  reservations
SET     check_in_out = tsrange('2021-01-06 15:00', '2021-01-08 17:00','[)')
WHERE   reservation_id = 4;

Unless somebody else checks in late:

UPDATE  reservations
SET     check_in_out = tsrange('2021-01-08 18:00', '2021-01-10 11:00','[)')
WHERE   reservation_id = 7;

And you check for 18:00 hours how many rooms you need:

SELECT  s::date
    ,   EXTRACT(WEEK FROM s)
    ,   COUNT(DISTINCT reservation_id)
    ,   ARRAY_AGG(DISTINCT reservation_id ORDER BY reservation_id)
FROM    reservations
        RIGHT JOIN  (SELECT s FROM generate_series('2021-01-01 18:00'::timestamp, '2021-01-21'::timestamp, INTERVAL '1 DAY') g(s)) s ON s <@ check_in_out
GROUP BY 1,2
ORDER BY 1;