I have the following table with some data for illustration purposes:
user_id, creation_date, activity_date, net_revenue, retained_days
1, 2019/01/01, 2019/01/01, 0, 0
2, 2019/01/01, 2019/01/01, 0, 0
1, 2019/01/01, 2019/01/02, 0, 1
2, 2019/01/01, 2019/01/02, 0, 1
1, 2019/01/01, 2019/01/03, 0, 2
2, 2019/01/01, 2019/01/03, 0, 2
1, 2019/01/01, 2019/01/04, 5.5, 3
2, 2019/01/01, 2019/01/04, 0, 3
1, 2019/01/01, 2019/01/05, 5.5, 4
2, 2019/01/01, 2019/01/05, 3, 4
and I need to return the number of retained days when the revenue exceeded 0 for the first time for each user.
In this sample data, the desired output would be
1, 3
2, 4
I was trying stuff along the following lines:
SELECT DISTINCT uid,
CASE
WHEN SUM(net_revenue) >0
THEN SUM(retained_days)
END AS ret_day_fst_purch
FROM table1
GROUP BY 1
and (although I dont need the activity_date to be returned
SELECT DISTINCT uid,
activity_date,
CASE
WHEN SUM(net_revenue) >0
THEN SUM(retained_days)
END AS ret_day_fst_purch
FROM table1
GROUP BY 1,2
Yet it did not work, and I do see why, I just have a hard time to pivot to what I need. Thank you in advance for the advice!
CodePudding user response:
It might be simpler than what you initially thought, as you don't really need to sum and just filter the one case you want per user:
select user_id, min(retained_days) days_to_first_rev
from table1
where net_revenue > 0
group by 1
So you basically are getting the minimum value of retained days in which you have revenue.
CodePudding user response:
Consider below
select *,
date_diff(
parse_date('%Y/%m/%d',
activity_date), parse_date('%Y/%m/%d', creation_date),
day
) as retained_days
from your_table
qualify 1 = row_number() over revenue_days
window revenue_days as (partition by user_id order by sign(net_revenue) desc, creation_date)
if applied to sample data in your question - output is