I want to calculate the average item count accounting for sub-partitions in each partition.
Sample Data:
id session item_count random_field_1
1 weoifn2 3 A
1 weoifn2 3 B
1 iuboiwe 2 K
2 oeino33 5 R
2 vergeeg 8 C
2 feooinn 9 P
2 feooinn 9 M
Logic:
- id = 1: (3 2) / 2 = 2.5
- id = 2: (5 8 9) / 3 = 7.33
Expected Output:
id avg
1 2.5
2 7.33
My Query:
SELECT
id
, AVG(item_count) OVER (PARTITION BY id) AS avg
FROM my_table
However, I believe this will factor in duplicates twice, which is unintended. How can I fix my query to only consider one item_count
value per session?
CodePudding user response:
Consider below approach
select id, avg(item_count) as avg
from (
select distinct id, session, item_count
from your_table
)
group by id
if applied to sample data in your question - output is
CodePudding user response:
SELECT id, AVG(item_count) OVER (PARTITION BY id) AS avg
FROM (
SELECT
id
, CASE
WHEN ROW_NUMBER OVER (PARTITION BY id) = 1
THEN item_count
ELSE NULL
END
AS item_count
FROM my_table
)