How to calculate P50, P90 values without window functions?

Time:08-19

I have a table like:

id    category    value1    value2    value3
1     1           100       324       940
1     1           222       404       1000
1     1           333       304       293
1     2           490       490       400
1     2           140       400       499
1     3           400       400       103
1     3           300       123       124

I need to calculate the P50 and P90 values for each (id, category) combination. My original approach:

SELECT DISTINCT
    id
  , category
  , PERCENTILE_CONT(value1, 0.5) OVER (PARTITION BY id, category) AS p50_value1
  , PERCENTILE_CONT(value2, 0.5) OVER (PARTITION BY id, category) AS p50_value2
  , PERCENTILE_CONT(value3, 0.5) OVER (PARTITION BY id, category) AS p50_value3
  , PERCENTILE_CONT(value1, 0.9) OVER (PARTITION BY id, category) AS p90_value1
  , PERCENTILE_CONT(value2, 0.9) OVER (PARTITION BY id, category) AS p90_value2
  , PERCENTILE_CONT(value3, 0.9) OVER (PARTITION BY id, category) AS p90_value3
FROM my_table;

I'm using the above code in a longer query, but it fails with the error below, so I'd like to optimize it:

Resources exceeded during query execution. The query could not be executed in the allotted memory. Peak usage: 133% of limit. Top memory consumer: OVER() clauses: 100%.

Is there any way to GROUP BY id, category here instead?

Alternatively, would the following method be faster or is it just a cleaner way of writing the same query?

SELECT DISTINCT
    id
  , category
  , PERCENTILE_CONT(value1, 0.5) OVER w AS p50_value1
  , PERCENTILE_CONT(value2, 0.5) OVER w AS p50_value2
  , PERCENTILE_CONT(value3, 0.5) OVER w AS p50_value3
  , PERCENTILE_CONT(value1, 0.9) OVER w AS p90_value1
  , PERCENTILE_CONT(value2, 0.9) OVER w AS p90_value2
  , PERCENTILE_CONT(value3, 0.9) OVER w AS p90_value3
FROM my_table
WINDOW w AS (PARTITION BY id, category);

Answer:

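You can use the APPROX_QUANTILES aggregate function, which works with GROUP BY instead of OVER(). Note that it returns approximate quantile boundaries rather than the interpolated values PERCENTILE_CONT produces. With 10 buckets the returned array has 11 elements, so offset 5 is the median and offset 9 is the 90th percentile.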

SELECT
    id,
    category,
    approx_quantiles(value1, 10)[offset(5)] as p50,
    approx_quantiles(value1, 10)[offset(9)] as p90
FROM my_table
GROUP BY 1,2
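
The same idea can probably be extended to all three value columns from the question. The following is a minimal sketch, assuming the table and column names from the question; the CTE name quantiles and the aliases q1, q2, q3 are made up for illustration. Each quantile array is built once per column and then indexed twice, and the results are approximate, so they may differ slightly from what PERCENTILE_CONT would return.

```sql
-- Sketch only: "quantiles", q1, q2, q3 are hypothetical names.
-- APPROX_QUANTILES(x, 10) returns 11 approximate boundaries:
-- offset 0 = min, offset 5 = median, offset 9 = P90, offset 10 = max.
WITH quantiles AS (
  SELECT
      id
    , category
    , APPROX_QUANTILES(value1, 10) AS q1
    , APPROX_QUANTILES(value2, 10) AS q2
    , APPROX_QUANTILES(value3, 10) AS q3
  FROM my_table
  GROUP BY id, category
)
SELECT
    id
  , category
  , q1[OFFSET(5)] AS p50_value1
  , q2[OFFSET(5)] AS p50_value2
  , q3[OFFSET(5)] AS p50_value3
  , q1[OFFSET(9)] AS p90_value1
  , q2[OFFSET(9)] AS p90_value2
  , q3[OFFSET(9)] AS p90_value3
FROM quantiles;
```

Because everything is computed with GROUP BY, no OVER() clause or SELECT DISTINCT is needed, which is the part the error message identified as the top memory consumer.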