I'm trying to figure out how to run a median/percentile aggregation over some jsonb, grouped by a truncated date (which is currently an epoch) e.g. by day/month/year.
CREATE TABLE datatbl (id serial primary key, json_data jsonb, timestamp_epoch bigint);
INSERT INTO datatbl (json_data, timestamp_epoch) VALUES
('{"value": 1.2}'::jsonb,1638985100),
('{"value": 5.3}'::jsonb,1636393100),
('{"value": 2.2}'::jsonb,1638985100),
('{"value": 8.4}'::jsonb,1633714700),
('{"value": 4.2}'::jsonb,1636393100),
('{"value": 5.4}'::jsonb,1636393100),
('{"value": 1.4}'::jsonb,1638985100),
('{"value": 9.8}'::jsonb,1638985100),
('{"value": 3.3}'::jsonb,1633714700),
('{"value": 1.9}'::jsonb,1636393100);
I've figured out how to truncate the epoch and group the results by month
SELECT
DATE_TRUNC('month', to_timestamp (timestamp_epoch)) AS month, json_data->>'value' AS val
FROM datatbl
GROUP BY month, val
ORDER BY month ASC;
But can't figure out how to run percentile_disc(0.5) to get the median of each group.
Any help appreciated!!
CodePudding user response:
select
to_char(to_timestamp (timestamp_epoch), 'yyyy-mm') as month_period,
avg((json_data->>'value')::numeric) as average,
percentile_cont(0.5) within group (order by (json_data->>'value')::numeric) as median
from datatbl
group by month_period;
month_period | average | median |
---|---|---|
2021-10 | 5.85 | 5.85 |
2021-11 | 4.20 | 4.75 |
2021-12 | 3.65 | 1.80 |