Home > Back-end >  Combine SQL query results where any are empty
Combine SQL query results where any are empty

Time:10-29

I'm trying to get the bulk of my data cleaning taken care of using SQL queries. Sometimes we will just need a person to make the call, but in cases where, say, the only differences in data are ones where data is missing, I'd like to be able to just return a single row. Say we have the following data:

| FirstName | LastName | Email              | Phone     |
|:--------- |:-------- |:------------------ |:--------- |
| Joe       | Bloggs   | joe.bl@domain.com  | 9412 3456 |
| Bob       | Jones    | bob.jones@abc.com  | 9498 7654 |
| Bob       | Jones    | bob.jones@abc.com  |           |
| Mary      | Flubbs   |                    |           |
| Mary      | Flubbs   |                    | 9411 1111 |
| Mary      | Flubbs   | marymary@corp.com  |           |
| Taylor    | Minns    | taylor@minnsco.com | 9499 8877 |
| Taylor    | Minns    |                    | 9477 8899 |
| Taylor    | Minns    |                    |           |

Joe only has one record. Bob's records are identical except for a missing phone number. Mary's also have no discrepancies, just empties. Taylor has two different phone numbers, and will therefore still need two records, like so:

| FirstName | LastName | Email              | Phone     |
|:--------- |:-------- |:------------------ |:--------- |
| Joe       | Bloggs   | joe.bl@domain.com  | 9412 3456 |
| Bob       | Jones    | bob.jones@abc.com  | 9498 7654 |
| Mary      | Flubbs   | marymary@corp.com  | 9411 1111 |
| Taylor    | Minns    | taylor@minnsco.com | 9499 8877 |
| Taylor    | Minns    | taylor@minnsco.com | 9477 8899 |

If I simply go SELECT DISTINCT... it will show empties. If I use MAX() or MIN() I'll lose data. Is there a way to collapse records down like this without losing distinct data?

CodePudding user response:

You can either use Cross Apply or Inner Join. Assuming the same names are the same people

Cross Apply

declare @mytable table (Firstname varchar(50), LastName varchar(50), Email varchar(50), Phone varchar(50))

insert into @mytable
values 
('Joe'       , 'Bloggs'   , '[email protected]'  , '9412 3456') ,
('Bob'       , 'Jones'    , '[email protected]'  , '9498 7654') ,
('Bob'       , 'Jones'    , '[email protected]'  , '') ,
('Mary'      , 'Flubbs'   , ''                   , '') ,
('Mary'      , 'Flubbs'  ,  ''                  , '9411 1111') ,
('Mary'      , 'Flubbs'   , '[email protected]'  , '') ,
('Taylor'    , 'Minns'    , '[email protected]' , '9499 8877') ,
('Taylor'    , 'Minns'    ,  ''                  , '9477 8899') ,
('Taylor'    , 'Minns'    ,  ''                  ,  '')

select distinct firstname, lastname, x.Email, y.Phone
from @mytable t
cross apply (  -- this statement will only filter emails
    select email
    from @mytable  e
    where nullif(email,'') is not null
    and e.Firstname  = t.Firstname
    and e.LastName = t.LastName
) x
cross apply ( -- this statement will only filter phone
    select phone
    from @mytable p
    where nullif(phone,'') is not null
    and p.Firstname  = t.Firstname
    and p.LastName = t.LastName
) y

Inner Join

select distinct t.firstname, t.lastname, x.Email, y.Phone
from @mytable t
inner join (
    select Firstname, LastName, email
    from @mytable  e
    where nullif(email,'') is not null
) x on x.Firstname  = t.Firstname
    and x.LastName = t.LastName
inner join (
    select Firstname, LastName, phone
    from @mytable  p
    where nullif(phone,'') is not null
) y on y.Firstname  = t.Firstname
    and y.LastName = t.LastName

CodePudding user response:

You can use CASE expression on Email and Phone to select between the current value (if there is one) or find a value for the current row if it's empty or null.

SELECT *
FROM (
    SELECT 
        FirstName
    ,   LastName
    ,   CASE 
            WHEN ISNULL(p1.Email, '') = '' 
            THEN (
                    SELECT MAX(p2.Email) 
                    FROM Person p2
                    WHERE  
                        p1.FirstName = p2.FirstName 
                    AND p1.LastName  = p2.LastName 
                    AND ISNULL(p2.Email, '') <> ''
                )
            ELSE p1.Email 
        END Email 
    ,   CASE 
            WHEN ISNULL(p1.Phone, '') = '' 
            THEN (
                    SELECT MAX(p2.Phone) 
                    FROM Person p2
                    WHERE  
                        p1.FirstName = p2.FirstName 
                    AND p1.LastName  = p2.LastName 
                    AND ISNULL(p2.Phone, '') <> ''
                )
            ELSE p1.Phone 
        END Phone 
    FROM 
        Person p1
) e 
GROUP BY 
    FirstName
,   LastName
,   Email
,   Phone

SQL Fiddle

  • Related