Home > Mobile >  How to create a column that shows if a date appears in the last 7 days in BQ?
How to create a column that shows if a date appears in the last 7 days in BQ?

Time:06-17

I've got a table that shows me a user_id and the dates they were active (this is derived from a massive events table). The table looks like this:

user_id active_date
1 2022-06-16
2 2022-06-02
1 2022-06-14
1 2022-05-01

I need to create a query to find if a user has been active in the last 7 days, 8-14 ago, 15-21 days ago, and 22-28 days ago, as well as their first and last active date. I'd like to see the active in the last 7 days column (and the others of that type) as a Boolean if possible.

The first and last active dates are pretty easy, but I need some help with the other dates. This is what I've got so far:

SELECT
  user_id,
  MIN(active_date) as first_action_date,
  MAX(active_date) as last_action_date,


FROM activity_dates
GROUP BY 1

CodePudding user response:

An approach using DATE_DIFF:

WITH activity_dates AS (
  SELECT "1" as user_id, DATE("2022-06-16") as active_date
  UNION ALL
  SELECT "2" as user_id, "2022-06-02" as active_date
  UNION ALL
  SELECT "1" as user_id, "2022-06-14" as active_date
  UNION ALL
  SELECT "1" as user_id, "2022-05-01" as active_date
)
SELECT
  user_id,
  MIN(active_date) as first_action_date,
  MAX(active_date) as last_action_date,
  MAX(IF(DATE_DIFF(CURRENT_DATE(), active_date, DAY) <= 7, active_date, NULL)) IS NOT NULL as active_last_7_days,
  MAX(IF(DATE_DIFF(CURRENT_DATE(), active_date, DAY) BETWEEN 8 AND 14, active_date, NULL)) IS NOT NULL as active_8_14_ago,
  MAX(IF(DATE_DIFF(CURRENT_DATE(), active_date, DAY) BETWEEN 15 AND 21, active_date, NULL)) IS NOT NULL as active_15_21_ago,
FROM activity_dates
GROUP BY user_id

Output:

user_id first_action_date   last_action_date    active_last_7_days  active_8_14_ago active_15_21_ago
1       2022-05-01          2022-06-16          true                false           false
2       2022-06-02          2022-06-02          false               true            false

CodePudding user response:

In order to achieve the date ranges, you could use a case statement such as the following. Identify the most recent login, then use it to compare to todays date minus 7 days, minus 14 days, etc, and return true or false depending on whether it meets the criteria.

  SELECT user_id
   , CASE 
       WHEN lastActive > DATE_ADD(NOW(), INTERVAL -7 DAY) THEN 'TRUE'
       ELSE 'FALSE'
     END AS "oneWeek"
   , CASE 
       WHEN lastActive > DATE_ADD(NOW(), INTERVAL -14 DAY) THEN 'TRUE'
       ELSE 'FALSE'
     END AS "twoWeeks"
  FROM (SELECT user_id
        , MAX(active_date) AS "lastActive" 
        FROM activity_dates 
        GROUP BY id) AS table1;

Here's a fiddle of it in action: enter image description here

You can also consider another option that gives you a number of time user was active within range

select * from (
  select user_id, 
    min(active_date) over(partition by user_id) first_action_date,
    max(active_date) over(partition by user_id) last_action_date,
    bucket
  from activity_dates, (select [7, 14, 21] days),
  unnest([struct(days[safe_offset(range_bucket(date_diff(current_date, active_date, day) - 1, days))] as bucket)])
)
pivot (count(*) active_last_days for bucket in (7, 14, 21))         

with output

enter image description here

  • Related