I'm using the BigQuery to get users' annual spending to append upon a transaction ID. I'm not sure why I'd get the SELECT list expression references column price which is neither grouped nor aggregated at [4:7]
As the price column is the aggregated target, why should I group by it?
SELECT
trans_id,
user_id,
SUM(price) OVER (PARTITION BY user_id, EXTRACT(year FROM date)) as user_annual_spend
FROM sample_sales_table
GROUP BY trans_id, user_id
trans_is [STRING NULLABLE], user_id [STRING NULLABLE], date [DATE NULLABLE] yyyy-mm-dd format, price [NUMERIC NULLABLE]
trans_id | user_id | date | price
AAAAAAA | user1 |2022-01-09| 20
AAAAAAC | user1 |2022-05-01| 50
BAAAAAB | user1 |2021-01-05| 10
BAAAACB | user1 |2020-01-01| 40
BAAAAAC | user2 |2021-02-01| 30
Ideal output
trans_id | user_id |user_annual_spend
AAAAAAA | user1 | 70
AAAAAAC | user1 | 70
BAAAAAB | user1 | 10
BAAAACB | user1 | 40
BAAAAAC | user2 | 30
CodePudding user response:
Modify your query as the follwing:
SELECT trans_id, user_id,
SUM(user_annual_spend) AS user_annual_spend
FROM
(
SELECT trans_id, user_id,
SUM(price) OVER (PARTITION BY user_id, EXTRACT(year FROM date)) as user_annual_spend
FROM sample_sales_table
) T
GROUP BY trans_id, user_id
SUM(price) OVER (PARTITION BY user_id, EXTRACT(year FROM date))
is a window function not aggregate function.
Or simply remove the group by:
SELECT trans_id, user_id,
SUM(price) OVER (PARTITION BY user_id, EXTRACT(year FROM date)) as user_annual_spend
FROM sample_sales_table
ORDER BY user_id, trans_id