Home > Software engineering >  Selecting non duplicate columns in Group By
Selecting non duplicate columns in Group By

Time:10-29

I need to look in the Email Address list where there are duplicates

There is a possibility too that the LoginName is identical except a 1 after it. eg. Tom, Tom1, and the CreatedDate is close (Under a minute)

I'm not sure if the CreateDate is fully required, but so far I can get a list of emails with duplicate - which I then use as a where clause for the email:

SELECT * 
FROM WWW_Login nolock
WHERE EmailAddress IN (
                        SELECT EmailAddress FROM WWW_Login
                        GROUP BY EmailAddress
                        HAVING COUNT(EmailAddress) >1 )
    AND CreatedDate >= DATEADD(dd,-10,GETDATE())
ORDER BY LoginName asc

What I would like from this is to only have a list of those account where the LoginName is like itself XOR itself 1

For example:

 -------------- ------------------ ------------------ 
| LoginName    | EmailAddress     | CreatedDate      |
 -------------- ------------------ ------------------ 
| Tom          | [email protected]    | 28/10/2022 13:00 |
 -------------- ------------------ ------------------ 
| Tom1         | [email protected]    | 28/10/2022 13:01 |
 -------------- ------------------ ------------------ 
| Chris        | [email protected]  | 25/10/2022 13:00 |
 -------------- ------------------ ------------------ 
| Chris1       | [email protected]  | 25/10/2022 13:01 |
 -------------- ------------------ ------------------ 
| Post.Malone  | [email protected] | 27/10/2022 12:14 |
 -------------- ------------------ ------------------ 
| Post.Malone1 | [email protected] | 27/10/2022 12:15 |
 -------------- ------------------ ------------------ 

Unfortunately at the moment, I can't get the LoginName to order "at the same time" as email address.

If I order by LoginName, I can get a large group of similar names, but different emails

If I order by EmailAddres, I just get a list of users with two accounts under one email, with the odd smattering of "true" duplicates I'm looking for

Neither are like the above table

CodePudding user response:

Is that what you whant ?

WITH TMP (LoginName, EmailAddress, CreatedDate) AS
(
    SELECT *
    FROM
    (
        VALUES
            ('Tom', '[email protected]', CONVERT(DATETIME,'2022-10-28 13:00')),
            ('Tom1', '[email protected]', CONVERT(DATETIME,'2022-10-28 13:00')),
            ('Tomas', '[email protected]', CONVERT(DATETIME,'2022-10-28 13:00')),
            ('Tomas1', '[email protected]', CONVERT(DATETIME,'2022-10-28 13:00')),
            ('Tomy', '[email protected]', CONVERT(DATETIME,'2022-10-28 13:00')),
            ('Tomson', '[email protected]', CONVERT(DATETIME,'2022-10-28 13:00')),
            ('Tomson2', '[email protected]', CONVERT(DATETIME,'2022-10-28 13:00'))
    )t(LoginName, EmailAddress, CreatedDate)
)

SELECT TMP1.LoginName, TMP1.EmailAddress, TMP1.CreatedDate
FROM TMP AS TMP1
INNER JOIN TMP AS TMP2 ON TMP1.EmailAddress = TMP2.EmailAddress AND (TMP1.LoginName   '1' = TMP2.LoginName OR TMP1.LoginName = TMP2.LoginName   '1')
ORDER BY TMP1.EmailAddress, TMP1.LoginName ASC

from :

| LoginName     | EmailAddress          | CreatedDate               |
|-----------    |-------------------    |-------------------------  |
| Tom           | [email protected]         | 2022-10-28 13:00:00.000   |
| Tom1          | [email protected]         | 2022-10-28 13:00:00.000   |
| Tomas1        | [email protected]       | 2022-10-28 13:00:00.000   |
| Tomas         | [email protected]       | 2022-10-28 13:00:00.000   |
| Tomy          | [email protected]        | 2022-10-28 13:00:00.000   |
| Tomson        | [email protected]      | 2022-10-28 13:00:00.000   |
| Tomson2       | [email protected]     | 2022-10-28 13:00:00.000   |

result :

| LoginName     | EmailAddress      | CreatedDate               |
|-----------    |-----------------  |-------------------------  |
| Tom           | [email protected]     | 2022-10-28 13:00:00.000   |
| Tom1          | [email protected]     | 2022-10-28 13:00:00.000   |
| Tomas         | [email protected]   | 2022-10-28 13:00:00.000   |
| Tomas1        | [email protected]   | 2022-10-28 13:00:00.000   |
  • Related