I have a BigQuery table that looks like this:
User | URL | Sessions
user1 | example.com/1/ | 3000
user2 | example.com/2/ | 4000
user3 | example.com/2/ | 5000
user4 | example.com/1/ | 1000
... | ... | ...
I'm hoping to pull the top-performing User for each URL. Ideally, the final output is a smaller table with a single User per URL: the one that drives the most Sessions.
I've attempted a SQL Query like:
SELECT User, URL, ARRAY_AGG(Sessions ORDER BY Sessions DESC LIMIT 1) FROM 'table'
but keep getting an error. Any help is much appreciated!
CodePudding user response:
If I follow your question correctly, you want to total the sessions for each URL, broken out per user? If no user appears more than once for the same URL, SUM has nothing to combine, but it still lets you group by the other columns and return one row per User/URL pair.
Give the following a shot:
SELECT
User,
URL,
SUM(Sessions) AS Total_Sessions
FROM `table`
GROUP BY User, URL
ORDER BY Total_Sessions DESC
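If the goal is then to keep only the top user for each URL, one possible follow-on is to rank those summed totals within each URL and keep the first row. This is a rough sketch rather than a tested solution: it assumes the column names from the question and the same `table` placeholder, and the `totals` and `url_rank` names are made up for illustration.

```sql
WITH totals AS (
  -- Same aggregation as above: one row per (User, URL) pair.
  SELECT
    User,
    URL,
    SUM(Sessions) AS Total_Sessions
  FROM `table`
  GROUP BY User, URL
)
SELECT User, URL, Total_Sessions
FROM (
  SELECT
    User,
    URL,
    Total_Sessions,
    -- Number the users within each URL, highest total first.
    ROW_NUMBER() OVER (PARTITION BY URL ORDER BY Total_Sessions DESC) AS url_rank
  FROM totals
)
WHERE url_rank = 1
```

Note that ROW_NUMBER() picks one row arbitrarily when two users tie for a URL; swapping in RANK() would keep all tied users instead.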
CodePudding user response:
You can do this with a window function such as RANK() or ROW_NUMBER().
Sample:
WITH input AS (
  SELECT 1 AS user, 'x' AS url, 100 AS session
  UNION ALL SELECT 1 AS user, 'x' AS url, 200 AS session
  UNION ALL SELECT 1 AS user, 'y' AS url, 400 AS session
  UNION ALL SELECT 2 AS user, 'x' AS url, 200 AS session
  UNION ALL SELECT 2 AS user, 'x' AS url, 300 AS session
)
SELECT user, url, session
FROM (
  SELECT
    user,
    url,
    session,
    -- Partition by url alone so that each url yields a single top row.
    ROW_NUMBER() OVER (PARTITION BY url ORDER BY session DESC) AS top_rank
  FROM input
)
WHERE top_rank = 1
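The ARRAY_AGG attempt from the question can probably be made to work as well. The original most likely fails because the table name is in single quotes (which makes it a string literal rather than a table reference) and because User and URL are selected next to an aggregate without a GROUP BY. A minimal sketch of that approach, assuming the question's column names and `table` placeholder; the `Top_User` and `Top_Sessions` aliases are only illustrative:

```sql
SELECT
  URL,
  -- Sort each URL's users by Sessions, keep the first one,
  -- and unwrap it from the resulting one-element array.
  ARRAY_AGG(User ORDER BY Sessions DESC LIMIT 1)[OFFSET(0)] AS Top_User,
  MAX(Sessions) AS Top_Sessions
FROM `table`
GROUP BY URL
```

This returns one row per URL. As with ROW_NUMBER(), if two users tie for the highest Sessions value, which of them is returned is not guaranteed.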