Home > Enterprise >  Add variable to count consecutive months
Add variable to count consecutive months

Time:11-16

I have a query in a Postgres DB that combines clients subscriptions.

I want to add a variable that's called "consecutive months" but I'm not sure how to do it in Postgres.

My raw table is something like this:

client product Date
1 Sub 2020-10-01
1 Sub 2020-11-01
2 Sub 2020-11-01
2 Sub 2020-12-01
1 Sub 2021-01-01
1 Sub 2021-02-01
2 Sub 2021-02-01

and I inteed to have something that counts the consecutive months originating something like this:

client product Date Consecutive_months
1 Sub 2020-10-01 1
1 Sub 2020-11-01 2
2 Sub 2020-11-01 1
2 Sub 2020-12-01 2
1 Sub 2021-01-01 1
1 Sub 2021-02-01 2
2 Sub 2021-02-01 1

Thank you for the help in advanced!

CodePudding user response:

Looks like you got yourself a Gaps-And-Islands type of problem.

The trick is to calculate some ranking based on connected dates per client.

Then a sequencial number can be calculated based on client and the rank.

select client, product, "Date"
, row_number() over (partition by client, daterank order by "Date") as Consecutive_months
from
(
  select "Date", client, product
  , dense_rank() over (partition by client order by "Date") 
      (DATE_PART('year', AGE(current_date, "Date"))*12   
       DATE_PART('month', AGE(current_date, "Date"))) daterank
from raw t
) q
order by "Date", client
client | product | Date       | consecutive_months
-----: | :------ | :--------- | -----------------:
     1 | Sub     | 2020-10-01 |                  1
     1 | Sub     | 2020-11-01 |                  2
     2 | Sub     | 2020-11-01 |                  1
     2 | Sub     | 2020-12-01 |                  2
     1 | Sub     | 2021-01-01 |                  1
     1 | Sub     | 2021-02-01 |                  2
     2 | Sub     | 2021-02-01 |                  1

db<>fiddle here

CodePudding user response:

Based on the tag OP apparently realizes this is a gaps and island problem. This query extracts month and year information to generate a sequence that increments on a monthly basis. After that it's only necessary to use the standard difference logic to find the rows that move out of step and create mark off the islands.

with A as (
    select *,
        date_part('year', dt) * 12   date_part('month', dt)
          - row_number() over (partition by client, product order by dt) as grp
    from T
)
select *,
    row_number()
        over (partition by client, product, grp order by dt) as consecutive_months
from A;

If it's acceptable to have multiple rows falling in the same month for a given client-product then switch row_number() to dense_rank() in both places.

https://dbfiddle.uk/?rdbms=postgres_9.5&fiddle=397a2f3282cab3b70bd7a47d1dc5ea0a

  • Related