Home > OS >  How to exclude duplicate records in SQL
How to exclude duplicate records in SQL

Time:04-27

I have a query whose output is like the below:

ID Case_When_Output_Column
Abc123 Search Performed
Abc123 Other activities Performed
EFG456 Search Performed
EFG456 Other activities Performed
PQR999 Other activities Performed
LMN012 Other activities Performed

As you can see, we have ID's where search is performed and also "Other activities" are performed. Eg: Abc123, EFG456

However, I am looking for ID's where only "Other Activities" are performed. (Search should not be performed) Eg: PQR999, LMN012

How do I frame such a query?

CodePudding user response:

You can use an analytic function to count the number of rows for each id and only return the rows where the count is 1 and Case_when_output_column = 'Other activities Performed':

SELECT id,
       Case_when_output_column
FROM   (
  SELECT t.*,
         COUNT(*) OVER (PARTITION BY id) AS cnt
  FROM   table_name t
)
WHERE  Case_when_output_column = 'Other activities Performed'
AND    cnt = 1

If you want any number of 'Other activities Performed' rows for each id then you can count the DISTINCT cases:

SELECT id,
       Case_when_output_column
FROM   (
  SELECT t.*,
         COUNT(DISTINCT Case_when_output_column) OVER (PARTITION BY id) AS cnt
  FROM   table_name t
)
WHERE  Case_when_output_column = 'Other activities Performed'
AND    cnt = 1

Which, for the sample data:

CREATE TABLE table_name (ID, Case_When_Output_Column) AS
SELECT 'Abc123', 'Search Performed' FROM DUAL UNION ALL
SELECT 'Abc123', 'Other activities Performed' FROM DUAL UNION ALL
SELECT 'EFG456', 'Search Performed' FROM DUAL UNION ALL
SELECT 'EFG456', 'Other activities Performed' FROM DUAL UNION ALL
SELECT 'PQR999', 'Other activities Performed' FROM DUAL UNION ALL
SELECT 'LMN012', 'Other activities Performed' FROM DUAL;

Both output:

ID CASE_WHEN_OUTPUT_COLUMN
LMN012 Other activities Performed
PQR999 Other activities Performed

db<>fiddle here

CodePudding user response:

Do you mean to only bring back IDs with only one occurrence, or only bring back IDs with "Other activities Performed" in the output column?

If the former, then you can join to a select statement like so

select t1.ID, t1.Case_When_Output_Column
from table as t1
join (select ID, count(ID) as vol from table group by ID) as t2
on t1.ID = t2.ID
where t2.vol < 2

It the latter, you just need a where clause such as

select *
from table where Case_When_Output_Column = 'Other activities Performed'
  • Related