Home > Software engineering >  postgres simple churn/retention
postgres simple churn/retention

Time:07-09

I'm trying to create a simple user retention, I tried a lot of ways searching, but none of them seems efficient and most of them have a lot of calculations that I don't really need.

Input my table example: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=ec68bebf63280023e828ebb6cabd2d89 OR:

DROP TABLE IF EXISTS users_churn;
create table 
users_churn(
id SERIAL PRIMARY KEY, 
users varchar(10) not null,
dates timestamp
);
insert into users_churn(users, dates)
values
('1ab7', '2022-01-04 10:22'),
('1ab7', '2022-04-26 18:30'),
('1ab7', '2022-05-12 20:10'),
('1ab7', '2022-07-02 20:55'),
('3ac5', '2022-02-05 05:12'),
('3ac5', '2022-04-09 07:17'),
('3ac5', '2022-07-03 04:19');
select * from users_churn

Output that I expect:

 ------------------------------------------------- 
|                --DESIRED OUTPUT                 |
 ------------------------------------------------- 
| ('1ab7', '2022-01-04 10:22', 'first_purchase'), |
| ('1ab7', '2022-02-01 00:00', 'churn'),          |
| ('1ab7', '2022-03-01 00:00', 'churn'),          |
| ('1ab7', '2022-04-26 18:30', 'retained'),       |
| ('1ab7', '2022-05-12 20:10', 'retained'),       |
| ('1ab7', '2022-06-01 00:00', 'churn'),          |
| ('1ab7', '2022-07-02 20:55', 'retained'),       |
| ('3ac5', '2022-02-05 05:12', 'first_purchase'), |
| ('3ac5', '2022-03-01 00:00', 'churn'),          |
| ('3ac5', '2022-04-09 07:17', 'retained'),       |
| ('3ac5', '2022-05-01 00:00', 'churn'),          |
| ('3ac5', '2022-06-01 00:00', 'churn'),          |
| ('3ac5', '2022-07-03 04:19', 'retained');       |
 ------------------------------------------------- 

CodePudding user response:

It's not clear what you want done if there is more that one purchase by a user in a month. This version keeps only the first one. If you want to keep all purchases, then get rid of the rn_month stuff:

with churn as (
  select c.users, 
         c.dates, 
         date_trunc('month', c.dates) as mdate,
         row_number() over (partition by c.users 
                                order by c.dates) as rn_all,
         row_number() over (partition by c.users, date_trunc('month', c.dates)
                                order by c.dates) as rn_month
    from users_churn c
), months as (
  select generate_series(
           min(c.mdate),
           current_date,
           interval '1 month'
         )::timestamp as mdate
    from churn c
)
select c.users,
       coalesce(c1.dates, m.mdate) as dates,
       case 
         when c1.rn_all = 1 then 'first_purchase'
         when c1.users is null then 'churn'
         else 'retained'
       end as status
  from churn c
       join months m 
         on m.mdate >= c.mdate 
        and c.rn_all = 1
       left join churn c1 
         on c1.users = c.users
        and c1.mdate = m.mdate
        and c.rn_month = 1
 order by c.users, m.mdate;

db<>fiddle here

  • Related