I have the data in a format like this from SQL query
No | ID | Value | date |
---|---|---|---|
1 | A | xx | 09-Aug-22 |
1 | A | xx | 10-Aug-22 |
2 | B | xx | 09-Aug-22 |
2 | B | yy | 10-Aug-22 |
3 | C | xx | 10-Aug-22 |
Out of this I want to remove the data which is duplicate in column Value and Single row for same NO and ID SO My final output should be like
No | ID | Value | date |
---|---|---|---|
2 | B | xx | 09-Aug-22 |
2 | B | yy | 10-Aug-22 |
How this can be achieved
CodePudding user response:
This is how I understood it:
Sample data:
SQL> WITH
2 test (no,
3 id,
4 VALUE,
5 datum)
6 AS
7 (SELECT 1, 'A', 'xx', DATE '2022-08-09' FROM DUAL
8 UNION ALL
9 SELECT 1, 'A', 'xx', DATE '2022-08-10' FROM DUAL
10 UNION ALL
11 SELECT 2, 'B', 'xx', DATE '2022-08-09' FROM DUAL
12 UNION ALL
13 SELECT 2, 'B', 'yy', DATE '2022-08-10' FROM DUAL
14 UNION ALL
15 SELECT 3, 'C', 'xx', DATE '2022-08-10' FROM DUAL),
Count number of distinct values per no, id
and total number of rows per no, id
:
16 temp
17 AS
18 ( SELECT no,
19 id,
20 COUNT (DISTINCT VALUE) cnt_dv,
21 COUNT (*) cnt
22 FROM test
23 GROUP BY no, id)
Finally, exclude rows for which total number of rows is 1
(that's ID = C
) and rows for which number of distinct values is different from total number of rows (that's A
):
24 SELECT *
25 FROM test
26 WHERE (no, id, VALUE) NOT IN (SELECT no, id, VALUE
27 FROM temp
28 WHERE cnt = 1
29 OR cnt_dv <> cnt)
30 ORDER BY no, id, VALUE;
NO ID VALUE DATUM
---------- -- ---------- ----------
2 B xx 09.08.2022
2 B yy 10.08.2022
SQL>
CodePudding user response:
You may use window finctions to calculate number of rows per group and then filter out rows you do not want.
with prep as ( select t.* /*Identify No with more that one row*/ , count(1) over( partition by no ) as count_per_group /*Identify rows with duplicates*/ , count(1) over( partition by no, id, value ) as count_duplicates from t ) select No, ID, Value, dt from prep where count_per_group > 1 and count_duplicates = 1
NO ID VALUE DT 2 B xx 09-Aug-22 2 B yy 10-Aug-22
db<>fiddle here
CodePudding user response:
You can use COUNT OVER
to see whether there is more than one value for a no/id.
select no, id, value, dat
from
(
select
no, id, value, dat,
count(distinct value) over (partition by no, id) as cnt
from mytable
)
where cnt > 1
order by no, id, dat;
(I've replaced date with dat, because date is a keyword in SQL. Put in your real column name there.)