Home > Software design >  Running a median aggregate over jsonb and grouping by day/month/year etc
Running a median aggregate over jsonb and grouping by day/month/year etc

Time:12-09

I'm trying to figure out how to run a median/percentile aggregation over some jsonb, grouped by a truncated date (which is currently an epoch) e.g. by day/month/year.

CREATE TABLE datatbl (id serial primary key, json_data jsonb, timestamp_epoch bigint);

INSERT INTO datatbl (json_data, timestamp_epoch) VALUES 
    ('{"value": 1.2}'::jsonb,1638985100),
    ('{"value": 5.3}'::jsonb,1636393100),
    ('{"value": 2.2}'::jsonb,1638985100),
    ('{"value": 8.4}'::jsonb,1633714700),
    ('{"value": 4.2}'::jsonb,1636393100),
    ('{"value": 5.4}'::jsonb,1636393100),
    ('{"value": 1.4}'::jsonb,1638985100),
    ('{"value": 9.8}'::jsonb,1638985100),
    ('{"value": 3.3}'::jsonb,1633714700),
    ('{"value": 1.9}'::jsonb,1636393100);

I've figured out how to truncate the epoch and group the results by month

SELECT
    DATE_TRUNC('month', to_timestamp (timestamp_epoch)) AS month, json_data->>'value' AS val
FROM datatbl 
GROUP BY month, val
ORDER BY month ASC;

But can't figure out how to run percentile_disc(0.5) to get the median of each group.

Any help appreciated!!

CodePudding user response:

select
    to_char(to_timestamp (timestamp_epoch), 'yyyy-mm') as month_period,
    avg((json_data->>'value')::numeric) as average,
    percentile_cont(0.5) within group (order by (json_data->>'value')::numeric) as median
 from datatbl 
 group by month_period;
month_period average median
2021-10 5.85 5.85
2021-11 4.20 4.75
2021-12 3.65 1.80
  • Related