Home > Software engineering >  How to aggregate data for logic and show non-aggregated results in the output?
How to aggregate data for logic and show non-aggregated results in the output?

Time:06-28

I have a table that looks like below:

person fruit date
A apple xxxx
B banana xxxx
C apple xxxx
A banana xxxx
C apple xxxx
B banana xxxx

I am interested in persons who have more than one banana in the data set. In this case, it would be person B. I understand how to achieve this by aggregating the data. However, if I want my result to be NOT agrregated and look something like below, what would be the best way?

person fruit date
B banana xxxx
B banana xxxx

CodePudding user response:

You can use a window function such as SUM() OVER () along with a conditional aggregation :

SELECT person, fruit, date
  FROM (SELECT SUM(CASE
                     WHEN fruit = 'banana' THEN
                          1
                     END) OVER( PARTITION BY person ) AS cnt
          FROM t) tt
 WHERE cnt > 1

CodePudding user response:

You can achieve this by using window function in subquery (in this case with count_if aggregate function) and filtering the result:

-- sample data
WITH dataset(person, fruit, date) AS (
    VALUES ('A',    'apple' ,'xxxx'),
    ('B',   'banana'    ,'xxxx'),
    ('C',   'apple' ,'xxxx'),
    ('A',   'banana'    ,'xxxx'),
    ('C',   'apple' ,'xxxx'),
    ('B',   'banana'    ,'xxxx')
)

-- query
select person, fruit, date
from (select *,
          count_if(fruit = 'banana') over(partition by person) banana_count
      from dataset)
where banana_count > 1

Output:

person fruit date
B banana xxxx
B banana xxxx
  • Related