Home > Back-end >  How can I aggregate columns, then find out the biggest changes over time and display the most outsta
How can I aggregate columns, then find out the biggest changes over time and display the most outsta

Time:04-20

I'm a beginner in SQL and want to track the biggest changes in sales of tickets over time in absolute manner (so positive and negative) for a set of states (with a number of venders who can sell in every state they want) where the tickets are sold.

So I have a state_table like this:

state_key state_name
1 Alabama
2 Connecticut
3 Maine

I have a ticket_sales table like this:

vender_key state_key sell_date no_of_sales
1 1 2021-01-01 27
1 2 2021-01-01 48
1 3 2021-01-01 69
2 1 2021-01-01 31
2 2 2021-01-01 41
3 2 2021-01-01 10
3 3 2021-01-01 15
1 1 2021-02-01 11
1 2 2021-02-01 21
1 3 2021-02-01 20
2 1 2021-02-01 67
2 2 2021-02-01 39
3 2 2021-02-01 13
3 3 2021-02-01 19
1 1 2021-03-01 77
1 2 2021-03-01 68
1 3 2021-03-01 59
2 1 2021-03-01 10
2 2 2021-03-01 11
3 2 2021-03-01 49
3 3 2021-03-01 54

Of course, the tables are much longer in real.

Now I want to know which states have the largest absolute difference in total ticket sales within the whole time frame. I want to display

  • the name of the state
  • the maximum date
  • the number of sales at the maximum date in the respective state
  • the minimum date
  • the number of sales at the minimum date in the respective state and
  • the difference between the maximum sales and the minimum sales for the top 3 states;

sorted descending by the difference between max and min sales per state.

My goal is to see which are the states with large changes in ticket sales over time, regardless if there is a positive or a negative development of sales.

So therefore, I need to sum up the venders' sales per state for each date first, then to pick the highest number of sales for each state and the corresponding date for each state. Then I have to do the same for the lowest number of sales and the corresponding date for each state. And at last, I need to calculate the absolute difference between these two numbers.

For me, this is a bit too much aggregation to handle in SQL, unfortunately. I have tried some code with SUM and GROUP BY states, but I don't know how to tell SQL the aggregation and then calculation correctly by choosing highest numbers per state (so the combination of MAX respectively MIN and GROUP BY), I'm confused by that.

My expected result should be:

state_name date_max sales_max date_min sales_min difference
Maine 2021-03-01 113 2021-02-01 39 74
Connecticut 2021-03-01 128 2021-02-01 73 55
Alabama 2021-03-01 87 2021-01-01 58 29

Thank you in advance!

CodePudding user response:

You can do it with two aggregations. First find dates with max/min sum of no_of_sales for every state_key next aggregate it by state_key conditionally.

select state_key 
  , max(case nmin when 1 then sell_date end) date_min
  , max(case nmin when 1 then n end) sales_min
  , max(case nmax when 1 then sell_date end) date_max
  , max(case nmax when 1 then n end) sales_max
  , max(case nmax when 1 then n end) - max(case nmin when 1 then n end ) delta
from ( 
    select state_key, sell_date, sum(no_of_sales) n
       , row_number() over(partition by state_key order by sum(no_of_sales)) nmin
       , row_number() over(partition by state_key order by sum(no_of_sales) desc) nmax
    from ticket_sales
    group by state_key, sell_date
) t
where nmin = 1 or nmax = 1
group by state_key
  • Related