Home > Enterprise >  Guidance on getting desired SQL results
Guidance on getting desired SQL results

Time:09-01

Thanks to the help on my last question, I now have one final step to fully grasp this method of querying for my problem.

Given these two tables:

airlines_detail:

airplane_id airline_id total_seats
2187 425 260
4361 747 290
3478 747 270
7292 425 250
5833 425 310
3472 425 300
4472 747 290
2624 425 320

Bookings:

airplane_id booked
2187 40
4361 30
4361 10
5833 30
4361 30
2624 30
4472 40
4472 40
2624 10
7292 20
2187 20
4472 30
3478 20
3472 40
4472 30
4472 10
4361 20
3478 30
2187 30
2187 10

I need to get this result:

airline_id airplanes
425 7292
747 4361

The result is the airplane_id whose SUM of empty seats is closest to the AVG number of empty seats per each airline_id (in this case, there's only 425 and 747 to worry about).

So, we would need to find the AVG of SUM(empty_seats) GROUP BY airline_id, then compare the SUM(empty_seats) GROUP BY airplane_id for each airline, and return a table that shows the airplanes that are closest to their respective airline's average of empty seats.

For airline with id 425 the average number of empty seats are 242. Therefore the airplane which is closest to this average is 7292.

Equipped with the answer from the previous question in getting the sum of empty seats for each plane, I'm hoping this next step is a simple one for a guru.

CodePudding user response:

You would start with the query you have, then use that as an intermediate result to get the averages per airline, and then rank the airplanes by how much their empty seat count deviates from their airline's average. For that ranking you can use ROW_NUMBER() and a windowing clause (OVER). Finally, select the rows in that result that are ranked first.

For this chain of building upon previous results, it is handy to use the WITH syntax:

WITH base as (
    SELECT
        a.airplane_id,
        a.airline_id,
        a.total_seats - SUM(b.booked) AS empty_seats
    FROM
        airlines_detail a
    LEFT JOIN bookings b
        ON
            a.airplane_id = b.airplane_id
    GROUP BY  
        a.airplane_id,
        a.airline_id
), average as (
    SELECT
        airline_id,
        AVG(empty_seats) avg_empty_seats
    FROM
        base
    GROUP BY
        airline_id
), ranked as (
    SELECT
        b.airplane_id,
        b.airline_id,
        ROW_NUMBER() OVER (
            PARTITION BY b.airline_id 
            ORDER BY ABS(b.empty_seats - a.avg_empty_seats)
        ) rn
    FROM
        base b
    INNER JOIN average a
        ON
            a.airline_id = b.airline_id
)
SELECT
    airline_id,
    airplane_id
FROM
    ranked
WHERE
    rn = 1;
  •  Tags:  
  • sql
  • Related