Home > database >  Running calculations on different percent amounts of data - SQL
Running calculations on different percent amounts of data - SQL

Time:04-14

I'm brainstorming on ways to find trends over a dataset containing transaction amounts that spans a year.

I'd like to run an average of top 25% observations of data and bottom 75% observations of data and viceversa.

If the entire dataset contains 1000 observations, I'd like to run:

  1. An average of the top 25% and then separately, an average of the bottom 75% and find the resulting average of this.
  2. Inversely, top 75% average, then bottom 25%, then the average of the 2.

For the overall average I have: avg(transaction_amount)

I am aware that in order for the sectioning averages to be useful, I will have to order the data according to the date which I already have accounted for in my SQL code:

select avg(transaction_amount)
from example.table
order by transaction_date

I am now struggling to find a way to split the data between 25% and 75% based on the number of observations.

Thanks.

CodePudding user response:

If you're using MSSQL, it's pretty trivial depending on exactly the output you're looking for.

SELECT TOP 25 PERCENT
  *
FROM (
   SELECT 
     AVG(transaction_amount) as avg_amt
   FROM example.table
) AS sub
ORDER BY sub.avg_amt DESC

CodePudding user response:

Use PERCENT_RANK in order to see which percentage block a row belongs to. Then use this to group your data:

with data as
(
  select t.*, percent_rank() over (order by transaction_amount) as pr
  from example.table t
)
select
  case when pr <= 0.75 then '0-75%' else '75-100%' end as percent,
  avg(transaction_amount) as avg,
  avg(avg(transaction_amount)) over () as avg_of_avg
from data
group by case when pr <= 0.75 then '0-75%' else '75-100%' end
union all
select
  case when pr <= 0.25 then '0-25%' else '25-100%' end as percent,
  avg(transaction_amount) as avg,
  avg(avg(transaction_amount)) over () as avg_of_avg
from data
case when pr <= 0.25 then '0-25%' else '25-100%' end;
  • Related