There is a table:
id | direction_id | created_at |
---|---|---|
1 | 2 | 22 November 2021, 16:00:00 |
2 | 2 | 22 November 2021, 16:20:00 |
43 | 2 | 22 November 2021, 16:25:00 |
455 | 1 | 22 November 2021, 16:27:00 |
6567 | 2 | 22 November 2021, 17:36:00 |
674556 | 2 | 22 November 2021, 20:01:00 |
5243554 | 1 | 22 November 2021, 20:50:00 |
5243554 | 1 | 22 November 2021, 21:46:00 |
I need to get the following result:
1 | 2 | created_at_by_hour |
---|---|---|
1 | 3 | 22.11.21 17 |
1 | 4 | 22.11.21 18 |
1 | 4 | 22.11.21 19 |
1 | 4 | 22.11.21 20 |
2 | 5 | 22.11.21 21 |
3 | 5 | 22.11.21 22 |
The `1` and `2` in the header are all the values of `direction_id` that occur in the table. `created_at` is truncated to the hour, and for each hour I need to count how many records satisfy `created_at <= created_at_by_hour`. If no records were created during some hour, that hour should still appear in the result and simply repeat the counts of the previous hour.

The table has three columns: `id` (int), `direction_id` (int) and `created_at` (timestamptz), where `created_at` is the time the record was created. I need an hourly export, based on `created_at`, with the number of records created up to each hour boundary. I need the count not as a single number but separately for each `direction_id` (there are only two of them, `1` and `2`). If no records were created for a `direction_id` in a given hour, the previous count is repeated, and the result should end at the last `created_at`.
CodePudding user response:
In my opinion, it is better to generate a series of hourly timestamps between the minimum and maximum `created_at`, then count the records of each direction created up to each of those timestamps.
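with time_range as (
  select
    -- hour boundaries just after the earliest and the latest record;
    -- date_trunc keeps the series aligned to whole hours
    date_trunc('hour', min(created_at)) + interval '1 hour' as min,
    date_trunc('hour', max(created_at)) + interval '1 hour' as max
  from test
)
select
  count(*) filter (where direction_id = 1) as "1",
  count(*) filter (where direction_id = 2) as "2",
  to_char(gs.hour, 'dd.mm.yy HH24') as created_at_by_hour
from
  test t
  cross join time_range tr
  -- pair every record with each hour boundary at or after its creation time
  inner join generate_series(tr.min, tr.max, interval '1 hour') gs(hour)
    on t.created_at <= gs.hour
group by gs.hour
order by gs.hour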
CodePudding user response:
Truncate the timestamp down to the hour, group by it and count. Then use `SUM OVER` to get a running total of the counts. To show hours in which no records were created, you must generate a series of hours and outer join your data to it.
with hourly as
(
  -- one row per record, reduced to its hour
  select date_trunc('hour', created_at) as hour, direction_id
  from mytable
)
, hours(hour) as
(
  -- every hour between the first and the last record, gaps included
  select *
  from generate_series
  (
    (select min(hour) from hourly),
    (select max(hour) from hourly),
    interval '1 hour'
  )
)
select
  hours.hour,
  -- per-hour count, then a running total over the hours
  sum(count(*) filter (where hourly.direction_id = 1)) over (order by hour) as "1",
  sum(count(*) filter (where hourly.direction_id = 2)) over (order by hour) as "2"
from hours
left join hourly using (hour)
group by hour
order by hour;
Demo: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=21d0c838452a09feac4ebc57906829f4
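The same idea (count per truncated hour, then take a running total) can be sketched in Python to check the numbers by hand. This is only an illustration under a few assumptions: `running_totals` is a made-up name, the rows are the sample data from the question as naive datetimes, and the label is shifted by one hour because the query above labels each row with the start of the hour (16 to 21), while the desired output labels it with the end (17 to 22).

```python
from collections import Counter
from datetime import datetime, timedelta
from itertools import accumulate

# Sample data from the question as (direction_id, created_at).
ROWS = [
    (2, datetime(2021, 11, 22, 16, 0)),
    (2, datetime(2021, 11, 22, 16, 20)),
    (2, datetime(2021, 11, 22, 16, 25)),
    (1, datetime(2021, 11, 22, 16, 27)),
    (2, datetime(2021, 11, 22, 17, 36)),
    (2, datetime(2021, 11, 22, 20, 1)),
    (1, datetime(2021, 11, 22, 20, 50)),
    (1, datetime(2021, 11, 22, 21, 46)),
]


def running_totals(rows, directions=(1, 2)):
    """Return (label, total for each direction...) for every hour in range."""
    one_hour = timedelta(hours=1)
    # Step 1: truncate to the hour and count per (hour, direction).
    per_hour = Counter(
        (ts.replace(minute=0, second=0, microsecond=0), d) for d, ts in rows
    )
    # Step 2: build the full series of hours, including empty ones.
    seen = [hour for hour, _ in per_hour]
    first, last = min(seen), max(seen)
    n_hours = int((last - first) / one_hour) + 1
    hours = [first + i * one_hour for i in range(n_hours)]
    # Step 3: running total per direction; missing keys count as 0.
    columns = [
        list(accumulate(per_hour[(hour, d)] for hour in hours))
        for d in directions
    ]
    # Label each row with the end of its hour to match the desired output.
    return [
        ((hour + one_hour).strftime("%d.%m.%y %H"), *totals)
        for hour, totals in zip(hours, zip(*columns))
    ]


for row in running_totals(ROWS):
    print(row)
```

In the SQL version the equivalent of the label shift would be selecting `hours.hour + interval '1 hour'`, if the end-of-hour labels from the question are required.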