Home > OS >  How to show clients with 0 reservations in certain year? (SQL)
How to show clients with 0 reservations in certain year? (SQL)

Time:09-15

I have these tables:

CREATE TABLE tour 
(
    id  bigserial NOT NULL,
    end_date DATE,
    initial_price float8 NOT NULL,
    start_date DATE,
    destination_id int8,
    guide_id int8,

    PRIMARY KEY (id)
);

CREATE TABLE client_data 
(
    id  bigserial NOT NULL,
    name VARCHAR(255),
    passport_number VARCHAR(255),
    surname VARCHAR(255),
    user_data_id int8,

    PRIMARY KEY (id)
);
 
CREATE TABLE reservation 
(
    id bigserial not null,
    actual_price float8 not null,
    client_id int8,
    tour_id int8,

    PRIMARY KEY (id)
);

Where every reservation is connected to client_data and tour.

My goal is to show all clients that has not made any reservation in certain year eg. clients that have no reservations in 2022.

I tried something like this:

SELECT client_data.name, reservation.id, COUNT(reservation.id)
FROM client_data
LEFT OUTER JOIN reservation ON client_data.id = reservation.client_id
LEFT OUTER JOIN tour ON tour.id = reservation.tour_id
GROUP BY client_data.name, reservation.id
HAVING COUNT(reservation.id) = 0;

Or this:

SELECT client_data.name, reservation.id, COUNT(reservation.id)
FROM client_data
LEFT OUTER JOIN reservation ON client_data.id = reservation.client_id
LEFT OUTER JOIN tour ON tour.id = reservation.tour_id
WHERE reservation.id IS NULL
GROUP BY client_data.name, reservation.id;

These both ways work and show me clients that have no reservations IN GENERAL but I also need to show clients from certain year.

When I try to include

WHERE tour.start_date BETWEEN '2022-01-01' AND '2022-12-31'

the SQL statement returns 0 rows.

Any ideas how to do this?

EDIT: I'll add full data and schema i work with.

schema: https://pastebin.com/ETvrW1tQ

data: https://pastebin.com/h1WHT0zZ

CodePudding user response:

You've gotten it almost right. The reason why WHERE tour.start_date BETWEEN '2022-01-01' AND '2022-12-31' returns 0 rows is because it filters out all those clients who didn't make a reservation in that period as WHERE is applied to whole result set. So, instead of adding the date condition in the WHERE clause, I'd suggest adding it in the join condition for tour. Moreover I believe an OUTER JOIN wouldn't be required here either as you just want all the clients so, a LEFT JOIN should be sufficient. I think the following should work:

SELECT client_data.name, reservation.id, COUNT(reservation.id)
FROM client_data
LEFT JOIN reservation ON client_data.id = reservation.client_id
LEFT JOIN tour ON tour.id = reservation.tour_id and tour.start_date BETWEEN '2022-01-01' AND '2022-12-31'
WHERE reservation.id IS NULL
GROUP BY client_data.name, reservation.id;

Hope it helps

Edit

As OP mentioned the above query doesn't work as intended, I think we'll have to resort to using a subquery (or cte) here which I previously wanted to avoid due to performance reasons but maybe we're getting too ahead of ourselves on that. It's possible we can avoid it but I can't think of the correct way at the moment so here's a solution with subquery that will hopefully work.

select * from client_data where id not in (
    select distinct client_id from reservation r 
    join tour t on r.tour_id = t.id
    where t.start_date BETWEEN '2022-01-01' AND '2022-12-31'
);

In this we first find out the client_ids that did make a reservation in the said time frame and filter them out from the client data.

Have attached a fiddle in which you can play around it a bit

CodePudding user response:

This will return the tour id's in 2022 that do not have a corresponding tour id in reservation:

select id as tour_id
from tour
where start_date between '2022-01-01' and '2022-12-31'
except 
select tour_id 
from reservation;

But since TOUR does not have a client_id, then how would you expect to get the client_id or client_name?

  • Related