Hi my company wants to better tracks how many users are active on our platform. We are using Microsoft SQL Server 2019 as the Database, connected to the Azure Data Studio.
Below are two tables DDLs from our DB:
- CALENDAR TABLE
COLUMN | DATA TYPE | DETAILS |
---|---|---|
CALENDAR_DATE | DATE NOT NULL | Base date (YYYY-MM-DD) |
CALENDAR_YEAR | INTEGER NOT NULL | 2010, 2011 etc |
CALENDAR_MONTH_NUMBER | INTEGER NOT NULL | 1-12 |
CALENDAR_MONTH_NAME | VARCHAR(100) | January, February etc |
CALENDAR_DAY_OF_MONTH | INTEGER NOT NULL | 1-31 |
CALENDAR_DAY_OF_WEEK | INTEGER NOT NULL | 1-7 |
CALENDAR_DAY_NAME | INTEGER NOT NULL | Monday, Tuesday etc |
CALENDAR_YEAR_MONTH | INTEGER NOT NULL, | 201011, 201012, 201101 etc |
- REVENUE ANALYSIS
Column | Data Type | Details |
---|---|---|
ACTIVITY_DATE | DATE NOT NULL | Date Wager was made |
MEMBER_ID | INTEGER NOT NULL | Unique Player identifier |
GAME_ID | SMALLINT NOT NULL | Unique Game identifier |
WAGER_AMOUNT | REAL NOT NULL | Total amount wagered on the game |
NUMBER_OF_WAGERS | INTEGER NOT NULL | Number of wagers on the game |
WIN_AMOUNT | REAL NOT NULL | Total amount won on the game |
ACTIVITY_YEAR_MONTH | INTEGER NOT NULL | YYYYMM |
BANK_TYPE_ID | SMALL INT DEFAULT 0 NOT NULL, | 0=Real money, 1=Bonus money |
Screenshot for both tables below:
CALENDAR TABLE
REVENUE ANALYSIS TABLE
Long story short "active" means that the member has made a minimum of one real money wager in the month.
Every month a member has a certain lifecycle type. This status will change on a monthly basis on their previous and current months activity. The statuses are the following:
NEW | First time they placed a real money wager |
---|---|
RETAINED | Active in the prior calendar month and the current calendar month |
UNRETAINED | Active in the prior calendar month but not active in the current calendar month |
REACTIVATED | Not active in the prior calendar month, but active in the current calendar month |
LAPSED | Not active in the prior calendar month or the current calendar month |
We would like initially to get to a view with the columns below:
MEMBER_ID | CALENDAR_YEAR_MONT | MEMBER_LIFECYCLE_STATUS | LAPSED_MONTHS
Also the view should display one row per member per month, starting from the month in which they first placed a real money wager. This view should give their lifecycle status for that month, and if the member has lapsed, it should show a rolling count of the number of months since they were last active.
So far I have come up with the following CTE to give me a basis for the view. However I am not sure about the UNRETAINED and REACTIVATED columns. Any ideas anyone?
with all_activities as (
select a.member_id, activity_date, calendar_month_number as month_activity, calendar_year as year_activity,
datepart(month,CURRENT_TIMESTAMP) as current_month, datepart(year,CURRENT_TIMESTAMP) as current_year,
datepart(month,CONVERT(DATE, DATEADD(DAY,-DAY(GETDATE()),GETDATE()))) as previous_month, datepart(year,CONVERT(DATE, DATEADD(DAY,-DAY(GETDATE()),GETDATE()))) as year_last_month,
a.NUMBER_OF_WAGERS, (case when datepart(month,CURRENT_TIMESTAMP) = calendar_month_number and datepart(year,CURRENT_TIMESTAMP) = calendar_year then 'active' else 'inactive' end) as status,
case when (case when datepart(month,CURRENT_TIMESTAMP) = calendar_month_number and datepart(year,CURRENT_TIMESTAMP) = calendar_year then 'active' else 'inactive' end) = 'active' and number_of_wagers = 1 then 'New'
when (LAG((case when datepart(month,CURRENT_TIMESTAMP) = calendar_month_number and datepart(year,CURRENT_TIMESTAMP) = calendar_year then 'active' else 'inactive' end) ,1,0) OVER(PARTITION BY member_id ORDER BY calendar_month_number desc) = 'active' and calendar_month_number = datepart(month,CONVERT(DATE, DATEADD(DAY,-DAY(GETDATE()),GETDATE())))) then 'Retained'
when (calendar_month_number = datepart(month,CURRENT_TIMESTAMP) and year_activity = datepart(year,CURRENT_TIMESTAMP) and calendar_month_number = datepart(month,CONVERT(DATE, DATEADD(DAY,-DAY(GETDATE()),GETDATE())))) then 'Unretained'
from [dbo].[REVENUE_ANALYSIS] a
join CALENDAR b on a.ACTIVITY_DATE= b.CALENDAR_DATE
)
select * from all_activities
CodePudding user response:
This is about customer lifecycle status analysis, which requires a couple of things:
- customer acquisition date (it'll be nice to have this stored because some customers may go back to years or tens of years). For this question, we assume
revenue_analysis
has everthing we need and to calculateuser acquisition month
lapsed
vschurned
: achurned
customer is usually defined no activity for a period of time. For this question, we don't have the definition, thus, a user will be reported aslapsed
forever.- For life cycle status calculation, we're going to gather the following (member_id, calendar_month, acquisition_month, activity_month, prior_activity_month), so that we can calculate the final result.
with cte_new_user_monthly as (
select member_id,
min(activity_year_month) as acquisition_month
from revenue_analysis
group by 1),
cte_user_monthly as (
select u.member_id,
u.acquisition_month,
m.yyyymm as calendar_month
from cte_new_user_monthly u,
calendar_month m
where u.acquisition_month <= m.yyyymm),
cte_user_activity_monthly as (
select f.member_id,
f.activity_year_month as activity_month
from revenue_analysis f
group by 1,2),
cte_user_lifecycle as (
select u.member_id,
u.calendar_month,
u.acquisition_month,
m.activity_month
from cte_user_monthly u
left
join cte_user_activity_monthly m
on u.member_id = m.member_id
and u.calendar_month = m.activity_month),
cte_user_status as (
select member_id,
calendar_month,
acquisition_month,
activity_month,
lag(activity_month,1) over (partition by member_id order by calendar_month) as prior_activity_month
from cte_user_lifecycle),
user_status_monthly as (
select member_id,
calendar_month,
activity_month,
case
when calendar_month = acquisition_month then 'NEW'
when prior_activity_month is not null and activity_month is not null then 'RETAINED'
when prior_activity_month is not null and activity_month is null then 'UNRETAINED'
when prior_activity_month is null and activity_month is not null then 'REACTIVATED'
when prior_activity_month is null and activity_month is null then 'LAPSED'
else null
end as user_status
from cte_user_status)
select member_id,
calendar_month,
activity_month,
user_status,
row_number() over (partition by member_id, user_status order by calendar_month) as months
from user_status_monthly
order by 1,2;
Result (include activity_month
for easy understanding):
member_id|calendar_month|activity_month|user_status|months|
--------- -------------- -------------- ----------- ------
1001| 201701| 201701|NEW | 1|
1001| 201702| |UNRETAINED | 1|
1001| 201703| |LAPSED | 1|
1001| 201704| |LAPSED | 2|
1001| 201705| 201705|REACTIVATED| 1|
1001| 201706| 201706|RETAINED | 1|
1001| 201707| |UNRETAINED | 2|
1001| 201708| |LAPSED | 3|
1001| 201709| 201709|REACTIVATED| 2|
1001| 201710| |UNRETAINED | 3|
1001| 201711| |LAPSED | 4|
1001| 201712| 201712|REACTIVATED| 3|
1002| 201703| 201703|NEW | 1|
1002| 201704| |UNRETAINED | 1|
1002| 201705| |LAPSED | 1|
1002| 201706| |LAPSED | 2|
1002| 201707| |LAPSED | 3|
1002| 201708| |LAPSED | 4|
1002| 201709| |LAPSED | 5|
1002| 201710| |LAPSED | 6|
1002| 201711| |LAPSED | 7|
1002| 201712| |LAPSED | 8|
EDIT:
- Codes tested in MySQL because I didn't notice 'mysql' tag was removed.
calendar_month
in the code can be derived from thecalendar
dimension.