I've got a table that shows me a user_id and the dates they were active (this is derived from a massive events table). The table looks like this:
user_id | active_date |
---|---|
1 | 2022-06-16 |
2 | 2022-06-02 |
1 | 2022-06-14 |
1 | 2022-05-01 |
I need to create a query to find if a user has been active in the last 7 days, 8-14 ago, 15-21 days ago, and 22-28 days ago, as well as their first and last active date. I'd like to see the active in the last 7 days column (and the others of that type) as a Boolean if possible.
The first and last active dates are pretty easy, but I need some help with the other dates. This is what I've got so far:
SELECT
user_id,
MIN(active_date) as first_action_date,
MAX(active_date) as last_action_date,
FROM activity_dates
GROUP BY 1
CodePudding user response:
An approach using DATE_DIFF
:
WITH activity_dates AS (
SELECT "1" as user_id, DATE("2022-06-16") as active_date
UNION ALL
SELECT "2" as user_id, "2022-06-02" as active_date
UNION ALL
SELECT "1" as user_id, "2022-06-14" as active_date
UNION ALL
SELECT "1" as user_id, "2022-05-01" as active_date
)
SELECT
user_id,
MIN(active_date) as first_action_date,
MAX(active_date) as last_action_date,
MAX(IF(DATE_DIFF(CURRENT_DATE(), active_date, DAY) <= 7, active_date, NULL)) IS NOT NULL as active_last_7_days,
MAX(IF(DATE_DIFF(CURRENT_DATE(), active_date, DAY) BETWEEN 8 AND 14, active_date, NULL)) IS NOT NULL as active_8_14_ago,
MAX(IF(DATE_DIFF(CURRENT_DATE(), active_date, DAY) BETWEEN 15 AND 21, active_date, NULL)) IS NOT NULL as active_15_21_ago,
FROM activity_dates
GROUP BY user_id
Output:
user_id first_action_date last_action_date active_last_7_days active_8_14_ago active_15_21_ago
1 2022-05-01 2022-06-16 true false false
2 2022-06-02 2022-06-02 false true false
CodePudding user response:
In order to achieve the date ranges, you could use a case statement such as the following. Identify the most recent login, then use it to compare to todays date minus 7 days, minus 14 days, etc, and return true or false depending on whether it meets the criteria.
SELECT user_id
, CASE
WHEN lastActive > DATE_ADD(NOW(), INTERVAL -7 DAY) THEN 'TRUE'
ELSE 'FALSE'
END AS "oneWeek"
, CASE
WHEN lastActive > DATE_ADD(NOW(), INTERVAL -14 DAY) THEN 'TRUE'
ELSE 'FALSE'
END AS "twoWeeks"
FROM (SELECT user_id
, MAX(active_date) AS "lastActive"
FROM activity_dates
GROUP BY id) AS table1;
Here's a fiddle of it in action:
You can also consider another option that gives you a number of time user was active within range
select * from (
select user_id,
min(active_date) over(partition by user_id) first_action_date,
max(active_date) over(partition by user_id) last_action_date,
bucket
from activity_dates, (select [7, 14, 21] days),
unnest([struct(days[safe_offset(range_bucket(date_diff(current_date, active_date, day) - 1, days))] as bucket)])
)
pivot (count(*) active_last_days for bucket in (7, 14, 21))
with output