Users that played in X different dates - SQL Standard BigQuery

Time:11-13

I have the following schema of a data model on BigQuery with Standard SQL (I only have the schema, not the tables).

I have created this query to select the top 10 users that generated the most revenue in the last three months on the Love game:

SELECT
  users.user_id,
  SUM(pay.amount) AS total_rev
FROM
  `my-database.User` AS users
INNER JOIN
  `my-database.IAP_events` AS pay
ON
  users.User_id = pay.User_id
INNER JOIN
  `my-database.Games` AS games
ON
  users.Game_id = games.Game_id
WHERE
  games.game_name = "Love"
GROUP BY
  users.user_id
ORDER BY
  total_rev DESC
LIMIT
  10

But then, the exercise says to only consider users that played on 10 different days in the last 3 months. I understand I would use a subquery with a count on the dates, but I am a little lost on how to do it...

Thanks a lot!

CodePudding user response:

EDIT: You need to count distinct dates, not transactions, so in the qualify clause you'll need to state COUNT(DISTINCT date_) OVER ... instead of COUNT(transaction_id) OVER .... Fixed the code already.

As far as I understood, you need to count the distinct dates per user inside IAP_Events over the previous 3 months, check that the count meets the 10-day threshold, and then sum the amounts of the users that satisfy that constraint.

To do so, you can use BigQuery's analytic functions, aka window functions:

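with window_counting as (
  -- events from the last 3 months, only for users seen on 10+ distinct dates
  select
    user_id,
    amount
  from
    iap_events
  where
    date_ >= date_sub(current_date(), interval 3 month)
  qualify
    -- ">= 10" reads "10 different days" as "at least 10"
    count(distinct date_) over (partition by user_id) >= 10
),
final as (
  select
    user_id,
    sum(amount) as total_rev
  from
    window_counting
  group by
    1
)
-- order and limit in the outer query so the result order is guaranteed
select * from final
order by total_rev desc
limit 10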

You will just need to add the needed joins inside the first CTE in order to filter by game_name :)
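
For reference, here is a rough sketch of what the query could look like with those joins added. The table names and join keys are copied from the question. `date_` is assumed to be a DATE column on IAP_events (wrap it in DATE() if it is actually a timestamp), and `>= 10` assumes "10 different days" means at least 10. It is untested, since only the schema is available:

```sql
-- Sketch only: names follow the question; date_ on IAP_events is an assumption.
with window_counting as (
  select
    pay.user_id,
    pay.amount
  from
    `my-database.IAP_events` as pay
  inner join
    `my-database.User` as users
  on
    users.user_id = pay.user_id
  inner join
    `my-database.Games` as games
  on
    users.game_id = games.game_id
  where
    games.game_name = "Love"
    and pay.date_ >= date_sub(current_date(), interval 3 month)
  -- qualify filters rows on the window result: distinct event dates per user
  qualify
    count(distinct pay.date_) over (partition by pay.user_id) >= 10
),
final as (
  select
    user_id,
    sum(amount) as total_rev
  from
    window_counting
  group by
    user_id
)
select * from final
order by total_rev desc
limit 10
```

If you would rather go the subquery route mentioned in the question, the same filter can be expressed by grouping IAP_events by user_id with HAVING COUNT(DISTINCT date_) >= 10 and joining that list of users back to the payments.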
