Exclude values based on duplicate values in onecolumns-CodePudding

I have the data in a format like this from SQL query

No	ID	Value	date
1	A	xx	09-Aug-22
1	A	xx	10-Aug-22
2	B	xx	09-Aug-22
2	B	yy	10-Aug-22
3	C	xx	10-Aug-22

Out of this I want to remove the data which is duplicate in column Value and Single row for same NO and ID SO My final output should be like

No	ID	Value	date
2	B	xx	09-Aug-22
2	B	yy	10-Aug-22

How this can be achieved

CodePudding user response：

This is how I understood it:

Sample data:

SQL> WITH
  2     test (no,
  3           id,
  4           VALUE,
  5           datum)
  6     AS
  7        (SELECT 1, 'A', 'xx', DATE '2022-08-09' FROM DUAL
  8         UNION ALL
  9         SELECT 1, 'A', 'xx', DATE '2022-08-10' FROM DUAL
 10         UNION ALL
 11         SELECT 2, 'B', 'xx', DATE '2022-08-09' FROM DUAL
 12         UNION ALL
 13         SELECT 2, 'B', 'yy', DATE '2022-08-10' FROM DUAL
 14         UNION ALL
 15         SELECT 3, 'C', 'xx', DATE '2022-08-10' FROM DUAL),

Count number of distinct values per no, id and total number of rows per no, id:

 16     temp
 17     AS
 18        (  SELECT no,
 19                  id,
 20                  COUNT (DISTINCT VALUE) cnt_dv,
 21                  COUNT (*) cnt
 22             FROM test
 23         GROUP BY no, id)

Finally, exclude rows for which total number of rows is 1 (that's ID = C) and rows for which number of distinct values is different from total number of rows (that's A):

 24    SELECT *
 25      FROM test
 26     WHERE (no, id, VALUE) NOT IN (SELECT no, id, VALUE
 27                                     FROM temp
 28                                    WHERE    cnt = 1
 29                                          OR cnt_dv <> cnt)
 30  ORDER BY no, id, VALUE;

        NO ID VALUE      DATUM
---------- -- ---------- ----------
         2 B  xx         09.08.2022
         2 B  yy         10.08.2022

SQL>

CodePudding user response：

You may use window finctions to calculate number of rows per group and then filter out rows you do not want.

with prep as (
  select
    t.*
    /*Identify No with more that one row*/
    , count(1) over(
        partition by no
      ) as count_per_group
    /*Identify rows with duplicates*/
    , count(1) over(
        partition by no, id, value
      ) as count_duplicates
  from t
)
select No, ID, Value, dt
from prep
where count_per_group > 1
  and count_duplicates = 1

NO	ID	VALUE	DT
2	B	xx	09-Aug-22
2	B	yy	10-Aug-22

db<>fiddle here

CodePudding user response：

You can use COUNT OVER to see whether there is more than one value for a no/id.

select no, id, value, dat
from
(
  select
    no, id, value, dat,
    count(distinct value) over (partition by no, id) as cnt
  from mytable
)
where cnt > 1
order by no, id, dat;

(I've replaced date with dat, because date is a keyword in SQL. Put in your real column name there.)