I have a table with multiple equal date entries and a value. I need a table that calculates the historical value and the count of entries per date. I want to use the data to create some charts in gnuplot/etc later.
Raw data:
date | value
------------ ------
2017-11-26 | 5
2017-11-26 | 5
2017-11-26 | 5
2017-11-28 | 20
2017-11-28 | 5
2018-01-07 | 200
2018-01-07 | 5
2018-01-07 | 20
2018-01-15 | 5
2018-01-16 | 50
Output should be:
date | avg | count manual calc explanation
------------ -------- ------- ---------------------------------------
2017-11-26 | 5 | 3 (5 5 5) / 3 = 5
2017-11-28 | 8 | 2 (5 5 5 20 5) / 5 = 8
2018-01-07 | 33.125 | 3 (5 5 5 20 5 200 5 20) / 8 = 33.125
2018-01-15 | 30 | 1 (5 5 5 20 5 200 5 20 5) / 9 = 30
2018-01-16 | 32 | 1 (5 5 5 20 5 200 5 20 5 50) / 10 = 32
If it is not possible to calculate two different columns, I would be fine for the avg column. For counting only the dates I have the solution "SELECT DISTINCT date, COUNT(date) FROM table_name GROUP BY date ORDER BY date"
I played around with DISTINCTs, GROUP BYs, JOINs, etc, but I did not find any solution. I found some other articles on the web, but no one covers a case where a date is more than once listed in the table.
CodePudding user response:
You want a running average (total value divided by total count up to the row). This is done with window functions.
select
date,
sum(sum_value) over (order by date) as running_sum,
sum(cnt) over (order by date) as running_count,
sum(sum_value) over (order by date) /
sum(cnt) over (order by date) as running_average
from
(
select date, sum(value) as sum_value, count(*) as cnt
from mytable
group by date
) aggregated
order by date;
Demo: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=fb13b63970cb096913a53075b8b5c8d7