Thanks to the help on my last question, I now have one final step to fully grasp this method of querying for my problem.
Given these two tables:
airlines_detail
:
airplane_id | airline_id | total_seats |
---|---|---|
2187 | 425 | 260 |
4361 | 747 | 290 |
3478 | 747 | 270 |
7292 | 425 | 250 |
5833 | 425 | 310 |
3472 | 425 | 300 |
4472 | 747 | 290 |
2624 | 425 | 320 |
Bookings
:
airplane_id | booked |
---|---|
2187 | 40 |
4361 | 30 |
4361 | 10 |
5833 | 30 |
4361 | 30 |
2624 | 30 |
4472 | 40 |
4472 | 40 |
2624 | 10 |
7292 | 20 |
2187 | 20 |
4472 | 30 |
3478 | 20 |
3472 | 40 |
4472 | 30 |
4472 | 10 |
4361 | 20 |
3478 | 30 |
2187 | 30 |
2187 | 10 |
I need to get this result:
airline_id | airplanes |
---|---|
425 | 7292 |
747 | 4361 |
The result is the airplane_id
whose SUM
of empty seats is closest to the AVG
number of empty seats per each airline_id
(in this case, there's only 425 and 747 to worry about).
So, we would need to find the AVG
of SUM(empty_seats) GROUP BY airline_id
, then compare the SUM(empty_seats) GROUP BY airplane_id
for each airline, and return a table that shows the airplanes that are closest to their respective airline's average of empty seats.
For airline with id 425 the average number of empty seats are 242. Therefore the airplane which is closest to this average is 7292.
Equipped with the answer from the previous question in getting the sum of empty seats for each plane, I'm hoping this next step is a simple one for a guru.
CodePudding user response:
You would start with the query you have, then use that as an intermediate result to get the averages per airline, and then rank the airplanes by how much their empty seat count deviates from their airline's average. For that ranking you can use ROW_NUMBER()
and a windowing clause (OVER
).
Finally, select the rows in that result that are ranked first.
For this chain of building upon previous results, it is handy to use the WITH
syntax:
WITH base as (
SELECT
a.airplane_id,
a.airline_id,
a.total_seats - SUM(b.booked) AS empty_seats
FROM
airlines_detail a
LEFT JOIN bookings b
ON
a.airplane_id = b.airplane_id
GROUP BY
a.airplane_id,
a.airline_id
), average as (
SELECT
airline_id,
AVG(empty_seats) avg_empty_seats
FROM
base
GROUP BY
airline_id
), ranked as (
SELECT
b.airplane_id,
b.airline_id,
ROW_NUMBER() OVER (
PARTITION BY b.airline_id
ORDER BY ABS(b.empty_seats - a.avg_empty_seats)
) rn
FROM
base b
INNER JOIN average a
ON
a.airline_id = b.airline_id
)
SELECT
airline_id,
airplane_id
FROM
ranked
WHERE
rn = 1;