I got this question during an interview and I wanted to verify the approach.
Write a SQL statement that calculates the day 1 return rate across all users for each date. The "Day 1 return rate" is the % of distinct users seen on day n, that are also seen on day (n 1). We want to track user return rate day over day.
e.g. For the sample data above: -
- D1RR for 2021-09-01 is 50%.
- D1RR for 2021-09-02 is 0%
*/
My idea is get count of distinct users who logged on day 1. user_id,min(date), then keep calculating over other dates.
CREATE TABLE bi_staging.sessions (
session_ts TIMESTAMP,
user_id string,
app_name string
)
INSERT INTO bi_staging.sessions(session_ts, user_id, app_name)
VALUES
('2021-09-01 06:00:00', '123', 'ios'),
('2021-09-01 12:30:00', '123', 'ios'),
('2021-09-02 06:00:00', '123', 'ios'),
('2021-09-01 06:00:00', 'abc', 'ios'),
('2021-09-03 06:00:00', 'abc', 'ios');
CodePudding user response:
A simple solution can be something like the following one:
NOTE: I've used datetime
instead of timestamp
to simplify my tests. In this solution will be needed a method to convert a timestamp
to a date
DECLARE @temp TABLE (session_ts_date DATE, user_id Varchar(10))
INSERT INTO @temp(session_ts_date, user_id)
SELECT DISTINCT CAST(session_ts AS DATE),
user_id
FROM @sessions
SELECT s.session_ts_date,
CAST(SUM( CASE WHEN t.user_id IS NULL THEN 0 ELSE 1 END) as DECIMAL)/ CAST(count(s.user_id) AS DECIMAL)
FROM @temp s
LEFT JOIN @temp t on s.session_ts_date = DATEADD(day,-1,t.session_ts_date) and s.user_id = t.user_id
GROUP BY s.session_ts_date