I have a dataset of customer profiles where I am trying to capture how much revenue they generated until they cancelled their subscription. The issue I am having is that after the customer cancels their subscription, the customer profile still exists in the database and registers as being charged 0. I am trying to create a visualization that shows each customers lifespan in a table up until the month that they cancel.
Here is the data I have:
customer name | customer id | cancelled | charge date | charged amount |
---|---|---|---|---|
gary | 012 | no | 1/1/2022 | 199 |
gary | 012 | no | 2/1/2022 | 199 |
gary | 012 | no | 3/1/2022 | 199 |
gary | 012 | yes | 4/1/2022 | 199 |
gary | 012 | no | 5/1/2022 | 199 |
gary | 012 | no | 6/1/2022 | 199 |
I my desired output would select the first 4 lines above, and get rid of the last two.
I can pull up the data, but not sure where to go from there. So far I have:
select
t.customer_name,
t.customer_id,
t.cancel_flag,
t.revenue_date,
a.revenue,
a.customer_id
from metrics t
inner join drp.daasity_mrr a
on t.customer_id= a.customer_id
Any ideas are much appreciated!!
CodePudding user response:
You can use a window function to distinguish the rows before the cancellation and after it. For example:
select *
from (
select
t.customer_name,
t.customer_id,
t.cancel_flag,
t.revenue_date,
a.revenue,
a.customer_id,
max(t.cancel_flag) over(
partition by t.customer_id
order by t.revenue_date
rows between unbounded preceding and 1 preceding
) as mc
from metrics t
inner join drp.daasity_mrr a on t.customer_id= a.customer_id
) x
where mc = 'no' or mc is null
CodePudding user response:
We can add a running total to see how much each customer paid so far.
select *
,sum(charged_amount) over(partition by customer_id order by charge_date) as running_total
from
(
select customer_name
,customer_id
,cancelled
,charge_date
,case when count(case when cancelled = 'yes' then 1 end) over(partition by customer_id order by charge_date) = 0 then charged_amount end as charged_amount
from t
) t
customer_name | customer_id | cancelled | charge_date | charged_amount | running_total |
---|---|---|---|---|---|
gary | 12 | no | 2022-01-01 | 199 | 199 |
gary | 12 | no | 2022-02-01 | 199 | 398 |
gary | 12 | no | 2022-03-01 | 199 | 597 |
gary | 12 | yes | 2022-04-01 | null | 597 |
gary | 12 | no | 2022-05-01 | null | 597 |
gary | 12 | no | 2022-06-01 | null | 597 |