Home > database >  How to use `ROW_NUMBER` but grouping by days in PostgreSQL?
How to use `ROW_NUMBER` but grouping by days in PostgreSQL?

Time:02-16

I have a table (t1) like this:

    date    | period |        price         | consumptionKWh 
------------ -------- ---------------------- ----------------
 2021-12-10 | valle  |             0.202981 |          0.108
 2021-12-10 | valle  |  0.19792099999999999 |          0.107
 2021-12-10 | valle  |  0.15062099999999998 |          0.244
 2021-12-10 | valle  |  0.12082100000000001 |          0.274
 2021-12-10 | valle  |             0.119051 |          0.211
 2021-12-10 | valle  |             0.147921 |          0.132
 2021-12-10 | valle  |  0.20152099999999998 |          0.107
 2021-12-10 | llano  |              0.25685 |          0.108
 2021-12-10 | llano  |  0.28739000000000003 |          0.142
 2021-12-10 | punta  |             0.299161 |           0.45
 2021-12-10 | punta  |             0.298261 |          0.236
 2021-12-10 | punta  |             0.285001 |          0.233
 2021-12-10 | punta  |             0.267851 |           0.21
 2021-12-10 | llano  |              0.25655 |          0.151
 2021-12-10 | llano  |              0.25645 |          0.111
 2021-12-10 | llano  |              0.25685 |          0.167
 2021-12-10 | llano  |               0.2775 |          0.338
 2021-12-10 | punta  |  0.29957100000000003 |          0.376
 2021-12-10 | punta  |  0.30451100000000003 |          0.159
 2021-12-10 | punta  |  0.31151100000000004 |          0.642
 2021-12-10 | punta  |  0.31367100000000003 |          0.226
 2021-12-10 | llano  |  0.28506000000000004 |          0.354
 2021-12-10 | llano  |              0.27788 |          0.446
 2021-12-10 | valle  |             0.237921 |          0.313
 2021-12-11 | valle  |             0.242031 |          0.378
 2021-12-11 | valle  |             0.234451 |          0.216
 2021-12-11 | valle  |             0.228951 |          0.261
 2021-12-11 | valle  |             0.222921 |          0.186
 2021-12-11 | valle  |             0.222521 |          0.213
 ...

(and so on)

What I want to do is to get a new autoincrement column based on day index, starting by 1.

I guess I could achieve that by using something like ROW_NUMBER:

SELECT
    ROW_NUMBER() OVER (PARTITION BY EXTRACT(day FROM date) ORDER BY date) AS day,
    date,
    period,
    SUM("consumptionKWh" * price) AS total_eur,
    SUM("consumptionKWh") AS total_kwh
FROM t1
GROUP BY 2, 3
ORDER BY date ASC

But with this query I get the following:

 day |    date    | period |      total_eur      |     total_kwh      
----- ------------ -------- --------------------- --------------------
   1 | 2021-12-10 | punta  |         0.759601642 |              2.532
   2 | 2021-12-10 | llano  | 0.49728885000000006 |              1.817
   3 | 2021-12-10 | valle  |         0.253633326 |              1.496
   1 | 2021-12-11 | valle  |         1.225456225 |              4.675
   1 | 2021-12-12 | valle  |         3.738124907 | 12.536999999999999
   1 | 2021-12-13 | valle  |         2.722091121 | 10.550999999999998
   2 | 2021-12-13 | llano  |  1.4012282400000002 |              4.377
   3 | 2021-12-13 | punta  |  2.2320555440000005 |              6.234
   1 | 2021-12-14 | valle  |         0.880286341 |              2.981
   2 | 2021-12-14 | llano  |  0.9380788800000001 |              2.819
   3 | 2021-12-14 | punta  |         1.070563579 |              2.969
   1 | 2021-12-15 | punta  |  1.1036628030000002 | 3.0730000000000004
   2 | 2021-12-15 | llano  |  0.7038874300000001 |              2.102
   3 | 2021-12-15 | valle  |  0.9538837029999999 |              3.143
   1 | 2021-12-16 | punta  |         1.335933163 |              3.473
   2 | 2021-12-16 | valle  |  0.5342087160000001 | 1.7959999999999998
   3 | 2021-12-16 | llano  |  1.0529427400000002 |              3.031
   1 | 2021-12-17 | punta  |  0.6925185020000001 | 1.7719999999999998
   2 | 2021-12-17 | llano  |          0.76633146 |              2.122
   3 | 2021-12-17 | valle  |  0.8851822280000001 |              2.978
   1 | 2021-12-18 | valle  |         3.077495737 |  9.177000000000001
   1 | 2021-12-19 | valle  |         2.542623128 |  7.438000000000001
   1 | 2021-12-20 | valle  |  2.0395871780000006 | 6.4079999999999995
   2 | 2021-12-20 | punta  |         2.950098139 |  7.119000000000001
   3 | 2021-12-20 | llano  |  2.9513837600000006 |              7.362
   1 | 2021-12-21 | punta  | 0.46440593500000005 |              1.205
   2 | 2021-12-21 | llano  |  0.6545229200000001 |               1.77
   3 | 2021-12-21 | valle  |         0.614052971 | 1.8709999999999998
   1 | 2021-12-22 | punta  |         1.398044112 |              3.262
   2 | 2021-12-22 | valle  |         0.513314852 | 1.4420000000000002
   3 | 2021-12-22 | llano  |          1.84430131 |               4.55
   1 | 2021-12-23 | punta  |  1.0141563310000001 |              2.221
   2 | 2021-12-23 | llano  |  1.5145753900000003 | 3.4400000000000004
   3 | 2021-12-23 | valle  |  2.3418660350000007 | 6.0649999999999995
   1 | 2021-12-24 | llano  |  2.1900586200000003 |              6.282
   2 | 2021-12-24 | valle  |         0.724173297 | 2.2670000000000003
   3 | 2021-12-24 | punta  |         2.217604709 |              5.779
   1 | 2021-12-25 | valle  |         2.654226093 | 10.322999999999999
   1 | 2021-12-26 | valle  |         1.151362708 |              5.228
   1 | 2021-12-27 | llano  | 0.34642995000000004 |              2.307
   2 | 2021-12-27 | punta  | 0.37230016499999996 |              1.715
   3 | 2021-12-27 | valle  | 0.18087813500000005 |              4.235
   1 | 2021-12-28 | llano  |  0.8461612799999999 |              5.217
   2 | 2021-12-28 | punta  |  0.5383758769999999 |              2.567
   3 | 2021-12-28 | valle  |         0.251490035 |              4.445
   1 | 2021-12-29 | punta  |         0.678126537 |              2.497
   2 | 2021-12-29 | llano  |          0.42601369 | 1.6949999999999998
   3 | 2021-12-29 | valle  |         0.735734432 |              5.572
   1 | 2021-12-30 | valle  |  0.6012562189999998 |              2.809
   2 | 2021-12-30 | punta  | 0.42152694300000004 | 1.5630000000000002
   3 | 2021-12-30 | llano  |          0.30368189 |              1.257
   1 | 2021-12-31 | llano  |          1.15330524 |  6.236000000000001
   2 | 2021-12-31 | punta  |  2.1666425439999997 |  9.484000000000002
   3 | 2021-12-31 | valle  |         0.706290842 |  4.922000000000001

Which is in a sense something similar to what I'm looking for, but just in the opposite direction. That is, instead of increasing the index in each partition (for each day), what I want is that it remains constant, and increases only as the day increases. What I would like to obtain as a result would be the following.

  day |    date    | period |      total_eur      |     total_kwh      
----- ------------ -------- --------------------- --------------------
   1 | 2021-12-10 | punta  |         0.759601642 |              2.532
   1 | 2021-12-10 | llano  | 0.49728885000000006 |              1.817
   1 | 2021-12-10 | valle  |         0.253633326 |              1.496
   2 | 2021-12-11 | valle  |         1.225456225 |              4.675
   3 | 2021-12-12 | valle  |         3.738124907 | 12.536999999999999
   4 | 2021-12-13 | valle  |         2.722091121 | 10.550999999999998
   4 | 2021-12-13 | llano  |  1.4012282400000002 |              4.377
   4 | 2021-12-13 | punta  |  2.2320555440000005 |              6.234
   5 | 2021-12-14 | valle  |         0.880286341 |              2.981
   5 | 2021-12-14 | llano  |  0.9380788800000001 |              2.819
   5 | 2021-12-14 | punta  |         1.070563579 |              2.969
   6 | 2021-12-15 | punta  |  1.1036628030000002 | 3.0730000000000004
   6 | 2021-12-15 | llano  |  0.7038874300000001 |              2.102
   6 | 2021-12-15 | valle  |  0.9538837029999999 |              3.143
   7 | 2021-12-16 | punta  |         1.335933163 |              3.473
   7 | 2021-12-16 | valle  |  0.5342087160000001 | 1.7959999999999998
   7 | 2021-12-16 | llano  |  1.0529427400000002 |              3.031
   8 | 2021-12-17 | punta  |  0.6925185020000001 | 1.7719999999999998
   8 | 2021-12-17 | llano  |          0.76633146 |              2.122
   8 | 2021-12-17 | valle  |  0.8851822280000001 |              2.978
   9 | 2021-12-18 | valle  |         3.077495737 |  9.177000000000001
  10 | 2021-12-19 | valle  |         2.542623128 |  7.438000000000001
  11 | 2021-12-20 | valle  |  2.0395871780000006 | 6.4079999999999995
  11 | 2021-12-20 | punta  |         2.950098139 |  7.119000000000001
  11 | 2021-12-20 | llano  |  2.9513837600000006 |              7.362
  ...

CodePudding user response:

You can do it by using CTE. You set the row number by date then join to t1 table.

with x as (select date,ROW_NUMBER() OVER (order by date) AS day from t1 group by date)
SELECT
    x.day,
    t1.date,
    period,
    SUM(consumptionKWh * price) AS total_eur,
    SUM(consumptionKWh) AS total_kwh
FROM t1,x
where x.date = t1.date
GROUP BY 1,2, 3
ORDER BY t1.date ASC;

Result here

CodePudding user response:

You need dense_rank().
https://www.postgresql.org/docs/current/functions-window.html

dense_rank () → bigint

Returns the rank of the current row, without gaps; this function effectively counts peer groups.

  • Related