SQL Server: Select duplicate rows

I have a table:

personId	Date	location
abc123	15-09-2022	London
abc123	15-09-2022	Nottingham
efg321	12-09-2022	Leeds
abc123	13-09-2022	Birmingham

I want to select and return the rows that share the same personId and Date but have different locations. For example, in the table above, personId 'abc123' is present at both 'London' and 'Nottingham' on the same date, so I would like to return these two rows.

I have tried this query:

SELECT personId, Date FROM sampleTable GROUP BY personId, Date HAVING COUNT(*) > 1

But it returns only one row per personId and Date, without the location. I want the matching rows with all three columns. Expected result:

personId	Date	location
abc123	15-09-2022	London
abc123	15-09-2022	Nottingham

Can anyone please help me with this? Thanks

CodePudding user response：

Try something like this:

SELECT
    sampleTable.*
FROM
    sampleTable
    INNER JOIN -- acts as a filter here
        (
        SELECT
            personId,
            Date
        FROM
            sampleTable
        GROUP BY
            personId,
            Date
        HAVING
            COUNT(*) > 1
        ) problemTable
         ON sampleTable.personId = problemTable.personId
        AND sampleTable.Date = problemTable.Date
ORDER BY
    sampleTable.personId,
    sampleTable.Date,
    sampleTable.location
;

The derived table problemTable finds the personId/Date combos that have more than one row in sampleTable. Because an INNER JOIN keeps only rows that match on both sides, joining sampleTable to problemTable returns an abridged version of sampleTable: just the rows whose personId/Date combo also appears in problemTable, and those are the ones you care about!

Using INNER JOIN as a filter mechanism is a common theme in SQL, so keep it in the back of your mind.
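The same filtering idea can also be written with EXISTS. Below is a minimal sketch, assuming the question's column names; the inline sampleTable CTE only reproduces the question's four rows so the statement runs on its own, and the aliases s and s2 are illustrative. Unlike COUNT(*) > 1, this version asks for a row with a different location, so a row that is merely entered twice would not be returned.

```sql
-- Hypothetical inline copy of the question's data (column types are assumed).
WITH sampleTable AS (
    SELECT personId, [Date], location
    FROM (VALUES
        ('abc123', CAST('20220915' AS date), 'London'),
        ('abc123', CAST('20220915' AS date), 'Nottingham'),
        ('efg321', CAST('20220912' AS date), 'Leeds'),
        ('abc123', CAST('20220913' AS date), 'Birmingham')
    ) AS v (personId, [Date], location)
)
SELECT s.personId, s.[Date], s.location
FROM sampleTable AS s
WHERE EXISTS (
    -- Keep a row only if the same person has another row
    -- on the same date at a different location.
    SELECT 1
    FROM sampleTable AS s2
    WHERE s2.personId = s.personId
      AND s2.[Date] = s.[Date]
      AND s2.location <> s.location
)
ORDER BY s.personId, s.[Date], s.location;
```

With the four sample rows, this returns the London and Nottingham rows for abc123 on 2022-09-15.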

CodePudding user response：

It's pretty easy using window functions.

The inner query returns the same table with an extra column that counts the rows sharing each personId/Date pair. The outer query then keeps only the rows where that count is greater than 1.

Inner query result:

personid     date             location          check   
abc123       13-09-2022       Birmingham        1
abc123       15-09-2022       London            2
abc123       15-09-2022       Nottingham        2
efg321       12-09-2022       Leeds             1

Final result:

personid     date             location          check   
abc123       15-09-2022       London            2
abc123       15-09-2022       Nottingham        2

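SQL:

-- CHECK is a reserved keyword in SQL Server, so the alias must be bracketed.
-- Column names follow the question's table (Date, not datecol).
WITH temp AS (
    SELECT
        personId,
        Date,
        location,
        -- number of rows sharing this personId/Date pair
        COUNT(*) OVER (PARTITION BY personId, Date) AS [check]
    FROM sampleTable
)
SELECT *
FROM temp
WHERE [check] > 1;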
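One caveat with counting rows per personId/Date: a row that is simply entered twice (same location) would also be flagged. SQL Server does not accept DISTINCT inside a windowed aggregate, so one possible workaround is to compare the MIN and MAX of location within each partition. This is only a sketch under the question's column names; the inline sampleTable CTE reproduces the question's four rows, and the names flagged, minLocation and maxLocation are illustrative.

```sql
-- Hypothetical inline copy of the question's data (column types are assumed).
WITH sampleTable AS (
    SELECT personId, [Date], location
    FROM (VALUES
        ('abc123', CAST('20220915' AS date), 'London'),
        ('abc123', CAST('20220915' AS date), 'Nottingham'),
        ('efg321', CAST('20220912' AS date), 'Leeds'),
        ('abc123', CAST('20220913' AS date), 'Birmingham')
    ) AS v (personId, [Date], location)
),
flagged AS (
    SELECT
        personId,
        [Date],
        location,
        -- If the smallest and largest location in a personId/Date group differ,
        -- the group contains at least two distinct locations.
        MIN(location) OVER (PARTITION BY personId, [Date]) AS minLocation,
        MAX(location) OVER (PARTITION BY personId, [Date]) AS maxLocation
    FROM sampleTable
)
SELECT personId, [Date], location
FROM flagged
WHERE minLocation <> maxLocation
ORDER BY personId, [Date], location;
```

With the sample rows, only the London and Nottingham rows for abc123 on 2022-09-15 pass the filter.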