I have a table that looks like this:
person | fruit | date |
---|---|---|
A | apple | xxxx |
B | banana | xxxx |
C | apple | xxxx |
A | banana | xxxx |
C | apple | xxxx |
B | banana | xxxx |
I am interested in persons who have more than one banana in the data set. In this case, that would be person B. I understand how to achieve this by aggregating the data. However, if I want my result NOT to be aggregated, and to look like the table below, what would be the best way?
person | fruit | date |
---|---|---|
B | banana | xxxx |
B | banana | xxxx |
CodePudding user response:
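You can use a window function such as SUM() OVER () together with conditional aggregation. The inner query counts the bananas per person without collapsing the rows, and the outer query filters on that count: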
SELECT person, fruit, date
FROM (SELECT person, fruit, date,  -- expose the columns the outer query selects
             SUM(CASE
                   WHEN fruit = 'banana' THEN
                    1
                 END) OVER (PARTITION BY person) AS cnt
      FROM t) tt
WHERE cnt > 1
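To check the window-function approach against the sample data, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. The engine is an assumption, since the question does not name one, and window functions need SQLite 3.25 or later. The table name t comes from the answer above. Note that the inner query has to select person, fruit, and date as well as cnt, otherwise the outer SELECT cannot reference them.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
# SQLite is an assumed engine; window functions need SQLite >= 3.25.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (person TEXT, fruit TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("A", "apple", "xxxx"),
        ("B", "banana", "xxxx"),
        ("C", "apple", "xxxx"),
        ("A", "banana", "xxxx"),
        ("C", "apple", "xxxx"),
        ("B", "banana", "xxxx"),
    ],
)

# The inner query attaches a per-person banana count to every row;
# the outer query keeps the rows of persons with more than one banana.
query = """
SELECT person, fruit, date
FROM (SELECT person, fruit, date,
             SUM(CASE WHEN fruit = 'banana' THEN 1 END)
                 OVER (PARTITION BY person) AS cnt
      FROM t) tt
WHERE cnt > 1
"""
rows = conn.execute(query).fetchall()
print(rows)
```

For persons with no bananas at all, the SUM is NULL, so the cnt > 1 comparison is not true and their rows are dropped.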
CodePudding user response:
You can achieve this by using a window function in a subquery (in this case with the count_if
aggregate function) and filtering the result:
-- sample data
WITH dataset(person, fruit, date) AS (
VALUES ('A', 'apple' ,'xxxx'),
('B', 'banana' ,'xxxx'),
('C', 'apple' ,'xxxx'),
('A', 'banana' ,'xxxx'),
('C', 'apple' ,'xxxx'),
('B', 'banana' ,'xxxx')
)
-- query
select person, fruit, date
from (select *,
count_if(fruit = 'banana') over(partition by person) banana_count
from dataset)
where banana_count > 1
Output:
person | fruit | date |
---|---|---|
B | banana | xxxx |
B | banana | xxxx |
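One caveat for both queries: the filter keeps every row of a qualifying person, not only the banana rows. The sample data hides this because B has nothing but bananas. The sketch below adds a hypothetical person D (not in the original data) with two bananas and one apple, and compares the plain filter with one that also requires fruit = 'banana'. It uses SQLite through Python's sqlite3 module as an assumed engine, and SUM(CASE ...) in place of count_if, which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (person TEXT, fruit TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO dataset VALUES (?, ?, ?)",
    [
        ("A", "apple", "xxxx"),
        ("B", "banana", "xxxx"),
        ("C", "apple", "xxxx"),
        ("A", "banana", "xxxx"),
        ("C", "apple", "xxxx"),
        ("B", "banana", "xxxx"),
        # Hypothetical extra person: two bananas plus one apple.
        ("D", "banana", "xxxx"),
        ("D", "apple", "xxxx"),
        ("D", "banana", "xxxx"),
    ],
)

# Shared subquery: every row plus the banana count of its person.
counted = """
SELECT person, fruit, date,
       SUM(CASE WHEN fruit = 'banana' THEN 1 ELSE 0 END)
           OVER (PARTITION BY person) AS banana_count
FROM dataset
"""

# Filter on the count only: D's apple row comes back too.
all_rows = conn.execute(
    f"SELECT person, fruit, date FROM ({counted}) WHERE banana_count > 1"
).fetchall()

# Filter on the count and the fruit: only banana rows remain.
banana_only = conn.execute(
    f"SELECT person, fruit, date FROM ({counted}) "
    "WHERE banana_count > 1 AND fruit = 'banana'"
).fetchall()

print(sorted(all_rows))
print(sorted(banana_only))
```

Which variant is right depends on whether the other rows of a qualifying person should appear in the result; the question's expected output does not settle this, since B has no other fruit.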