Home > OS >  How to get 24 hour data as individual columns in Postgresql?
How to get 24 hour data as individual columns in Postgresql?

Time:09-15

I have a table net_score that looks like below (only a sample provided)

            date            |       ip        | up_score 
---------------------------- ----------------- ----------
 2022-09-09 07:30:04.485979 | 12.22.19.0      |       51
 2022-09-09 07:30:04.485979 | 10.22.39.1      |       95
 2022-09-09 07:30:04.485979 | 14.260.13.1     |      100
 2022-09-09 07:30:04.485979 | 252.229.219.43  |       97
 2022-09-09 07:30:04.485979 | 10.551.343.10   |       97
 2022-09-09 08:30:04.485979 | 12.22.19.0      |       11
 2022-09-09 08:30:04.485979 | 10.22.39.1      |       54
 2022-09-09 08:30:04.485979 | 14.260.13.1     |       89
 2022-09-09 08:30:04.485979 | 252.229.219.43  |       37
 2022-09-09 08:30:04.485979 | 10.551.343.10   |       11
 2022-09-09 09:30:04.485979 | 12.22.19.0      |       54
 2022-09-09 09:30:04.485979 | 10.22.39.1      |       15
 2022-09-09 09:30:04.485979 | 14.260.13.1     |       90
 2022-09-09 09:30:04.485979 | 252.229.219.43  |       17
 2022-09-09 09:30:04.485979 | 10.551.343.10   |       50

As you can see I have a date, ip and up_score column and the data is calculated at every hour.

What I want is a way to get hourly data for 24 hours as 24 separate columns per ip and incase the data doesn't exists, simply put -1 for that column.

I can get hourly data by doing something like below

select date_trunc('hour', date) as hourly, ip, up_score 
from net_score 
where date between '2022-09-09 05:30:00' and '2022-09-10 05:30:00' 
group by ip, hourly, up_score 
order by ip, hourly

But my expected query should produce a table output like below (and fill missing hours value with -1)

       ip       | hour_0  |  hour_1  |  hour_2  |  ..
-------------------------- ---------  ---------- ------
12.22.19.0      |   51    |    11    |     54   |  ..
10.22.39.1      |   95    |    54    |     15   |  ..
14.260.13.1     |   100   |    89    |     90   |  ..
252.229.219.43  |   97    |    37    |     17   |  ..
10.551.343.10   |   97    |    11    |     50   |  ..

How can I do this?

Note:

The reasoning behind doing it this way is that getting the output by using the query I know will give back lot of rows and for every new ip that gets added, the output rows by increase by another 24. Taking this result and processing it (I am using python in my code) will take longer as I deal with more and more ips.

But if I try it the way I mentioned (24 columns way), I will only deal with one additional row per ip and the number of columns continues to be 24 only. Hence I am thinking of doing it this way. Please correct me if my approach is wrong or can be improved.

CodePudding user response:

You can use PIVOT technique in next way:

select 
    date_trunc('day', date) as daily, ip, 
    coalesce(sum(up_score) filter (where extract('hour' from date) = 6 ), 0) up_score_6,
    sum(up_score) filter (where extract('hour' from date) = 7 ) up_score_7,
    sum(up_score) filter (where extract('hour' from date) = 8 ) up_score_8,
    sum(up_score) filter (where extract('hour' from date) = 9 ) up_score_9
from net_score 
where date between '2022-09-09 05:30:00' and '2022-09-10 05:30:00' 
group by ip, daily
order by ip, daily

Of course you need expand my example query for each of 24 hours

online SQL editor

 ===================== ================ ============ ============ ============ ============ 
| daily               | ip             | up_score_6 | up_score_7 | up_score_8 | up_score_9 |
 ===================== ================ ============ ============ ============ ============ 
| 2022-09-09 00:00:00 | 10.22.39.1     | 0          | 95         | 54         | 15         |
 --------------------- ---------------- ------------ ------------ ------------ ------------ 
| 2022-09-09 00:00:00 | 10.551.343.10  | 0          | 97         | 11         | 50         |
 --------------------- ---------------- ------------ ------------ ------------ ------------ 
| 2022-09-09 00:00:00 | 12.22.19.0     | 0          | 51         | 11         | 54         |
 --------------------- ---------------- ------------ ------------ ------------ ------------ 
| 2022-09-09 00:00:00 | 14.260.13.1    | 0          | 100        | 89         | 90         |
 --------------------- ---------------- ------------ ------------ ------------ ------------ 
| 2022-09-09 00:00:00 | 252.229.219.43 | 0          | 97         | 37         | 17         |
 --------------------- ---------------- ------------ ------------ ------------ ------------ 

When used old PostgreSQL version then CASE statemnt can be used:

select 
    date_trunc('day', date) as daily, ip, 
    sum(case when extract('hour' from date) = 6 then up_score else 0 end) up_score_6,
    sum(case when extract('hour' from date) = 7 then up_score else 0 end) up_score_7,
    sum(case when extract('hour' from date) = 8 then up_score else 0 end) up_score_8,
    sum(case when extract('hour' from date) = 9 then up_score else 0 end) up_score_9
from net_score 
where date between '2022-09-09 05:30:00' and '2022-09-10 05:30:00' 
group by ip, daily
order by ip, daily
  • Related