I'm trying to summarize some statics properties from several numeral columns such as continuous quantiles, mean , standard deviation, etc., then wrap them into rows with original column name attached as an additional column. I know about using AVG
, STDDEV_PO
, PERCENTILE_CONT
... to get them from single column, but didn't find article/recipe about doing them on several columns at once. Any ideas?
Example input:
ID | Col1 | Col2 | Col3 |
---|---|---|---|
1 | 1.0 | 2.0 | 4.0 |
2 | 2.0 | 4.0 | 8.0 |
3 | 3.0 | 6.0 | 12.0 |
4 | 4.0 | 8.0 | 16.0 |
Expected output:
Col Name | Q1 | Q2 | Q3 | Mean | Std |
---|---|---|---|---|---|
Col1 | 1.75 | 2.5 | 3.25 | 2.5 | 1.12 |
Col2 | 3.5 | 5.0 | 6.5 | 5.0 | 2.24 |
Col3 | 7.0 | 10.0 | 13.0 | 10.0 | 4.47 |
Or 'transposed' version:
Stats | Col1 | Col2 | Col3 |
---|---|---|---|
Q1 | 1.75 | 3.5 | 7.0 |
Q2 | 2.5 | 5.0 | 10.0 |
Q3 | 3.25 | 6.5 | 13.0 |
Mean | 2.5 | 5.0 | 10.0 |
Std | 1.12 | 2.24 | 4.47 |
CodePudding user response:
Consider below
select distinct col,
percentile_cont(value, 0.25) over win as q1,
percentile_cont(value, 0.50) over win as q2,
percentile_cont(value, 0.75) over win as q3,
avg(value) over win as avg,
stddev_pop(value) over win as std,
from your_table
unpivot (value for col in (col1, col2, col3))
window win as (partition by col)
if applied to sample data in your question - output is
To get the 'transposed' version - use below
select * from (
select * from (
select distinct col,
percentile_cont(value, 0.25) over win as q1,
percentile_cont(value, 0.50) over win as q2,
percentile_cont(value, 0.75) over win as q3,
avg(value) over win as avg,
stddev_pop(value) over win as std,
from data
unpivot (value for col in (col1, col2, col3))
window win as (partition by col)
) unpivot (value for stats in (q1, q2, q3, avg, std))
) pivot (any_value(value) for col in ('col1', 'col2', 'col3'))
in this case - output is