Select Top Performers From BigQuery Table

Time:05-19

I have a BigQuery table that looks like this:

User  | URL            | Sessions
user1 | example.com/1/ | 3000
user2 | example.com/2/ | 4000
user3 | example.com/2/ | 5000
user4 | example.com/1/ | 1000
...   | ...            | ...

I'd like to pull the top-performing User for each URL. Ideally, the final output is a smaller table with a single User per URL: the one that drove the most Sessions.

I've attempted a SQL Query like:

SELECT User, URL, ARRAY_AGG(Sessions ORDER BY Sessions DESC LIMIT 1) FROM 'table'

but keep getting an error. Any help is much appreciated!

CodePudding user response:

If I follow your question correctly, you want to sum the sessions per URL and break those totals out per user? If no user appears more than once for the same URL, each group holds a single row and SUM simply returns that row's Sessions value, but the aggregate still lets you group by the other columns.

Give the following a shot:

SELECT 
  User,
  URL,
  SUM(Sessions) AS Total_Sessions
FROM `table`
GROUP BY User, URL
ORDER BY Total_Sessions DESC
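
The query above returns every User/URL pair sorted by total. If the goal is a single User per URL, the per-user totals could be ranked within each URL and filtered to the first row. A minimal sketch, assuming the column names from the question; the CTE names sample_data and totals and the inline rows are made up for illustration:

```sql
-- Hypothetical inline sample mirroring the table in the question.
WITH sample_data AS (
  SELECT 'user1' AS User, 'example.com/1/' AS URL, 3000 AS Sessions
  UNION ALL SELECT 'user2', 'example.com/2/', 4000
  UNION ALL SELECT 'user3', 'example.com/2/', 5000
  UNION ALL SELECT 'user4', 'example.com/1/', 1000
),
-- Sum sessions per User/URL pair, as in the query above.
totals AS (
  SELECT User, URL, SUM(Sessions) AS Total_Sessions
  FROM sample_data
  GROUP BY User, URL
)
SELECT User, URL, Total_Sessions
FROM totals
WHERE TRUE  -- harmless filter; older BigQuery releases wanted a WHERE, GROUP BY, or HAVING next to QUALIFY
-- Keep only the highest-total row within each URL.
QUALIFY ROW_NUMBER() OVER (PARTITION BY URL ORDER BY Total_Sessions DESC) = 1
```

ROW_NUMBER() picks one row arbitrarily when two users tie on a URL; swapping in RANK() would keep all tied users instead.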

CodePudding user response:

You will need a window function such as RANK() or ROW_NUMBER():

Sample:

WITH input AS (
  SELECT 1 AS user, 'x' AS url, 100 AS session
  UNION ALL SELECT 1 AS user, 'x' AS url, 200 AS session
  UNION ALL SELECT 1 AS user, 'y' AS url, 400 AS session
  UNION ALL SELECT 2 AS user, 'x' AS url, 200 AS session
  UNION ALL SELECT 2 AS user, 'x' AS url, 300 AS session
)
SELECT user, url, session
FROM (
  SELECT user, url, session,
    -- Partition by url only, so rows are ranked across all users of a URL
    ROW_NUMBER() OVER (PARTITION BY url ORDER BY session DESC) AS top_rank
  FROM input
)
WHERE top_rank = 1  -- keep the top row for each url
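
The ARRAY_AGG attempt in the question can also be made to work. The original most likely fails for two reasons: the table name is wrapped in single quotes (a string literal) rather than backticks, and User and URL are selected alongside an aggregate without a GROUP BY. A minimal sketch of a corrected version, assuming the column names from the question; the CTE name sample_data, the alias t, and the inline rows are made up for illustration:

```sql
-- Hypothetical inline sample mirroring the table in the question.
WITH sample_data AS (
  SELECT 'user1' AS User, 'example.com/1/' AS URL, 3000 AS Sessions
  UNION ALL SELECT 'user2', 'example.com/2/', 4000
  UNION ALL SELECT 'user3', 'example.com/2/', 5000
  UNION ALL SELECT 'user4', 'example.com/1/', 1000
)
-- For each URL, aggregate whole rows ordered by Sessions, keep the first,
-- and return it as a regular row (User, URL, Sessions) via SELECT AS VALUE.
SELECT AS VALUE
  ARRAY_AGG(t ORDER BY t.Sessions DESC LIMIT 1)[OFFSET(0)]
FROM sample_data AS t
GROUP BY t.URL
```

Against a real table, sample_data would be replaced by the backtick-quoted table name. As with ROW_NUMBER(), a tie on Sessions is broken arbitrarily.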