So, I'm not sure how it works and I haven't found a sufficient answer by googleing (probably not using the right buzz words). So here it comes: Let's say I have a table like this, let's call it persons
ID | Name | First Name | Country |
---|---|---|---|
1 | Doe | John | USA |
2 | Doe | John | UK |
3 | Doe | John | Brazil |
4 | Meyer | Julia | Germany |
5 | Meyer | Julia | Austria |
6 | Picard | Jean-Luc | France |
7 | Picard | Jean-Luc | UK |
8 | Nakamura | Hikaro | Japan |
Ok, so now I want to select all the rows that have the same name and first name and where at least one country is the UK. So my result set should look like this.
ID | Name | First_Name | Country |
---|---|---|---|
1 | Doe | John | USA |
2 | Doe | John | UK |
3 | Doe | John | Brazil |
6 | Picard | Jean-Luc | France |
7 | Picard | Jean-Luc | UK |
I mean, I know how to find doubles in general like this
SELECT *
FROM persons p1
JOIN (SELECT NAME, FIRST_NAME, count(*) FROM PERSONS
GROUP BY FIRST_NAME, NAME having count(*) >1) p2
ON p1.NAME = p2.NAME
AND p1.FIRST_NAME = p2.FIRST_NAME;
But this also results in having Julia Meyer in there and I don't want her in there.
Any suggestions?
CodePudding user response:
Count country of interest conditionally
SELECT *
FROM persons p1
JOIN (
SELECT NAME, FIRST_NAME
FROM PERSONS
GROUP BY FIRST_NAME, NAME
having count(*) > 1 and count(case country = 'UK' then 1 end) >= 1
) p2 ON p1.NAME = p2.NAME
AND p1.FIRST_NAME = p2.FIRST_NAME;
CodePudding user response:
Use EXISTS
:
SELECT p1.*
FROM persons p1
WHERE EXISTS (
SELECT *
FROM persons p2
WHERE p2.ID <> p1.ID
AND p2.Name = p1.Name AND p2.FirstName = p1.FirstName
AND 'UK' IN (p1.Country, p2.Country)
);
See the demo.
CodePudding user response:
I want to select all the rows that have the same name and first name and where at least one country is the UK.
You can use the COUNT
analytic function with conditional aggregation (and solve the problem in a single table scan and without any self-joins):
SELECT id, name, first_name, country
FROM (
SELECT t.*,
COUNT(CASE country WHEN 'UK' THEN 1 END)
OVER (PARTITION BY name, first_name) AS cnt
FROM table_name t
)
WHERE cnt > 0;
Which, for the sample data:
CREATE TABLE table_name (ID, Name, First_Name, Country) AS
SELECT 1, 'Doe', 'John', 'USA' FROM DUAL UNION ALL
SELECT 2, 'Doe', 'John', 'UK' FROM DUAL UNION ALL
SELECT 3, 'Doe', 'John', 'Brazil' FROM DUAL UNION ALL
SELECT 4, 'Meyer', 'Julia', 'Germany' FROM DUAL UNION ALL
SELECT 5, 'Meyer', 'Julia', 'Austria' FROM DUAL UNION ALL
SELECT 6, 'Picard', 'Jean-Luc', 'France' FROM DUAL UNION ALL
SELECT 7, 'Picard', 'Jean-Luc', 'UK' FROM DUAL UNION ALL
SELECT 8, 'Nakamura', 'Hikaro', 'Japan' FROM DUAL;
Outputs:
ID NAME FIRST_NAME COUNTRY 1 Doe John USA 2 Doe John UK 3 Doe John Brazil 6 Picard Jean-Luc France 7 Picard Jean-Luc UK
If you want to find duplicate rows where at least one is UK
then also count all the rows in the partition:
SELECT id, name, first_name, country
FROM (
SELECT t.*,
COUNT(CASE country WHEN 'UK' THEN 1 END)
OVER (PARTITION BY name, first_name) AS cnt_uk,
COUNT(*)
OVER (PARTITION BY name, first_name) AS cnt_all
FROM table_name t
)
WHERE cnt_uk > 0
AND cnt_all >= 2;
Which gives the same output for the sample data.
db<>fiddle here
CodePudding user response:
There are 2 conditions, 1) containing 'UK' AND 2) having count(1) > 1. So, this query below would work.
SELECT p1.*
FROM persons p1
WHERE (p1.NAME, p1.FIRST_NAME) IN (
SELECT p2.NAME, p2.FIRST_NAME
FROM persons p2
WHERE p2.Country = 'UK') AND
AND (p1.NAME, p1.FIRST_NAME) IN (
SELECT p3.NAME, p3.FIRST_NAME
FROM persons p3
GROUP BY p3.NAME, p3.FIRST_NAME
HAVING COUNT(1) > 1)