Home > Net >  How to calculate average omitting duplicates per partition in SQL?
How to calculate average omitting duplicates per partition in SQL?

Time:08-16

I want to calculate the average item count accounting for sub-partitions in each partition.

Sample Data:

id     session      item_count    random_field_1 
1      weoifn2      3             A
1      weoifn2      3             B
1      iuboiwe      2             K
2      oeino33      5             R
2      vergeeg      8             C
2      feooinn      9             P
2      feooinn      9             M

Logic:

  • id = 1: (3 2) / 2 = 2.5
  • id = 2: (5 8 9) / 3 = 7.33

Expected Output:

id      avg
1       2.5
2       7.33

My Query:

SELECT 
   id
 , AVG(item_count) OVER (PARTITION BY id) AS avg
FROM my_table

However, I believe this will factor in duplicates twice, which is unintended. How can I fix my query to only consider one item_count value per session?

CodePudding user response:

Consider below approach

select id, avg(item_count) as avg
from (
  select distinct id, session, item_count
  from your_table
)
group by id           

if applied to sample data in your question - output is

enter image description here

CodePudding user response:

SELECT id, AVG(item_count) OVER (PARTITION BY id) AS avg
FROM (
  SELECT
    id
    , CASE
        WHEN ROW_NUMBER OVER (PARTITION BY id) = 1
          THEN item_count
        ELSE NULL
      END 
        AS item_count
  FROM my_table
)
  • Related