SQL Server: Select duplicate rows

Time:09-16

I have a table:

personId Date location
abc123 15-09-2022 London
abc123 15-09-2022 Nottingham
efg321 12-09-2022 Leeds
abc123 13-09-2022 Birmingham

I want to select and return the rows that share the same personId and Date. For example, in the table above, personId 'abc123' appears at both 'London' and 'Nottingham' on the same date, so I would like to return those rows.

I have tried this query:

SELECT personId, Date FROM sampleTable GROUP BY personId, Date HAVING COUNT(*) > 1

But it only gives me one row per duplicated personId/Date group, without the location. I want the rows themselves, with all three columns. Expected result:

personId Date location
abc123 15-09-2022 London
abc123 15-09-2022 Nottingham

Can anyone please help me with this? Thanks

CodePudding user response:

Try something like this:

SELECT
    sampleTable.*
FROM
    sampleTable
    INNER JOIN -- acts as a filter here
        (
        SELECT
            personId,
            Date
        FROM
            sampleTable
        GROUP BY
            personId,
            Date
        HAVING
            COUNT(*) > 1
        ) problemTable
         ON sampleTable.personId = problemTable.personId
        AND sampleTable.Date = problemTable.Date
ORDER BY
    sampleTable.personId,
    sampleTable.Date,
    sampleTable.location
;

The derived table problemTable finds the personId/Date combos that have more than one row in sampleTable. Because an INNER JOIN keeps only rows that match on both sides, joining sampleTable to problemTable returns just the sampleTable rows whose combo also appears in problemTable, and those are the ones you care about!

Using INNER JOIN as a filter mechanism is a common theme in SQL, so keep it in the back of your mind.
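If you want to watch the filter work step by step without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. Assumptions: SQLite is only standing in for SQL Server, the rows are the four from the question, and the short aliases s and p are mine. It runs the derived table on its own first, then the full join.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sampleTable (personId TEXT, Date TEXT, location TEXT)")
conn.executemany(
    "INSERT INTO sampleTable VALUES (?, ?, ?)",
    [
        ("abc123", "15-09-2022", "London"),
        ("abc123", "15-09-2022", "Nottingham"),
        ("efg321", "12-09-2022", "Leeds"),
        ("abc123", "13-09-2022", "Birmingham"),
    ],
)

# Step 1: the derived table alone lists the personId/Date combos with more than one row.
combos = conn.execute("""
    SELECT personId, Date
    FROM sampleTable
    GROUP BY personId, Date
    HAVING COUNT(*) > 1
""").fetchall()

# Step 2: joining back to sampleTable keeps only the full rows for those combos.
# s and p are hypothetical aliases for sampleTable and problemTable.
join_rows = conn.execute("""
    SELECT s.personId, s.Date, s.location
    FROM sampleTable AS s
    INNER JOIN (
        SELECT personId, Date
        FROM sampleTable
        GROUP BY personId, Date
        HAVING COUNT(*) > 1
    ) AS p
        ON s.personId = p.personId
       AND s.Date = p.Date
    ORDER BY s.personId, s.Date, s.location
""").fetchall()

print(combos)     # combos that occur more than once
print(join_rows)  # full rows belonging to those combos
```

The first query yields only the abc123 / 15-09-2022 combo; the join then brings back the London and Nottingham rows with all three columns.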

CodePudding user response:

It's pretty easy using window functions.

The inner query returns the same table with an extra column that holds the number of rows sharing each personid/date combination. The outer query then keeps only the rows where that count is greater than 1.

inner sql result

personid     date             location          check   
abc123       13-09-2022       Birmingham        1
abc123       15-09-2022       London            2
abc123       15-09-2022       Nottingham        2
efg321       12-09-2022       Leeds             1

final

personid     date             location          check   
abc123       15-09-2022       London            2
abc123       15-09-2022       Nottingham        2

SQL

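WITH temp AS (
    SELECT
      personid,
      Date,  -- the question's column is named Date
      location,
      -- number of rows sharing this personid/Date combination;
      -- CHECK is a reserved word in SQL Server, so the alias is bracketed
      COUNT( personid ) OVER (PARTITION BY personid, Date) AS [check]
     FROM sampletable
)
SELECT *
FROM temp
WHERE [check] > 1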
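
To reproduce the two result tables above without SQL Server, here is a rough sketch using Python's built-in sqlite3 module. Assumptions: SQLite 3.25 or newer (needed for window functions) stands in for SQL Server, the rows are the four from the question, and the names marked and dup_count are mine, chosen so the reserved word CHECK is not needed.

```python
import sqlite3

# Assumption: in-memory SQLite (3.25+ for window functions) stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sampletable (personid TEXT, Date TEXT, location TEXT)")
conn.executemany(
    "INSERT INTO sampletable VALUES (?, ?, ?)",
    [
        ("abc123", "15-09-2022", "London"),
        ("abc123", "15-09-2022", "Nottingham"),
        ("efg321", "12-09-2022", "Leeds"),
        ("abc123", "13-09-2022", "Birmingham"),
    ],
)

# Inner query on its own: every row plus the size of its personid/Date group.
# dup_count is a hypothetical alias used in place of the reserved word CHECK.
marked = conn.execute("""
    SELECT personid, Date, location,
           COUNT(*) OVER (PARTITION BY personid, Date) AS dup_count
    FROM sampletable
    ORDER BY personid, Date, location
""").fetchall()

# Full query: wrap the inner query in a CTE and keep groups with more than one row.
duplicates = conn.execute("""
    WITH marked AS (
        SELECT personid, Date, location,
               COUNT(*) OVER (PARTITION BY personid, Date) AS dup_count
        FROM sampletable
    )
    SELECT personid, Date, location, dup_count
    FROM marked
    WHERE dup_count > 1
    ORDER BY location
""").fetchall()

for row in marked:
    print(row)
print(duplicates)
```

Unlike GROUP BY, the window COUNT does not collapse rows, which is why the location column survives and no join back to the table is needed.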