Union based on filter condition-CodePudding

I got two tables. I wanted to get rid of the duplicates only if NAME, ID AND LOC matches. If Name,ID and Loc match exists in both the tables then i wanted to pick the record from the latest month (tb1_march in this case). My input table also has dupes!

TB1_FEB

|NAME | ID | DETAIL | LOC | AGE |
|MARK | 1  | EMAIL  | USA | 34 |
|JEN  | 2  | PHONE  | USA | 32 |
|ROGER| 6  | EMAIL  | UK  | 34 |

TB2_MAR
|NAME | ID | DETAIL | LOC | AGE |
|MARK | 1  | PHONE  | USA | 35  |
|JEN  | 2  | PHONE  | USA | 32 |
|STEVE| 8  | EMAIL  | UK  | 29  |

DESIRED OUTPUT:

|NAME | ID | DETAIL | LOC | AGE |
|MARK | 1  | PHONE  | USA | 35  |
|JEN  | 2  | PHONE  | USA | 32 |
|ROGER| 6  | EMAIL  | UK  | 34 |
|STEVE| 8  | EMAIL  | UK  | 29  |

What i tried:

WITH SAMPLE_1 AS (
SELECT T2.* FROM TB2_MAR T2 LEFT JOIN TB1_FEB T1 ON
T1.NAME = T2.NAME
AND
T1.ID = T2.ID
AND
T1.LOC = T2.LOC
WHERE T1.ID IS NULL
),
SAMPLE_2 AS (
SELECT * FROM  SAMPLE_1
UNION
SELECT * FROM TB1_FEB 
)
SELECT * FROM SAMPLE_2

|NAME | ID | DETAIL | LOC | AGE |
|MARK | 1  | EMAIL  | USA | 34 |
|MARK | 1  | PHONE  | USA | 35  |
|JEN  | 2  | PHONE  | USA | 32 |
|ROGER| 6  | EMAIL  | UK  | 34 |
|STEVE| 8  | EMAIL  | UK  | 29  |

CodePudding user response：

The tables do not seem to contain any column that would help in final sorting so you need to add it; and use window functions to find latest row:

WITH cte1 AS (
    SELECT *, 1 AS prioritee
    FROM tb2_mar
    UNION ALL
    SELECT *, 2 AS prioritee
    FROM tb1_feb 
), cte2 AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY name, id, loc ORDER BY prioritee) AS rn
    FROM cte1
)
SELECT *
FROM cte2
WHERE rn = 1

CodePudding user response：

As Salman already pointed out, you are missing the month information in your table, so I will add that. Also, the duplicates are a problem, which will require a distinct. Here's my advice:

with u as (
    select
          2 as [month]
        , name
        , id
        , detail
        , loc
        , age
    from tb_feb
    union all
    select
          3 as [month]
        , name
        , id
        , detail
        , loc
        , age
    from tb_mar
)
select distinct
    name
  , id
  , loc
  , last_value(age) over (partition by name,id,loc order by [month] rows between current row and unbounded following) as age
  , last_value(detail) over (partition by name,id,loc order by [month] rows between current row and unbounded following) as detail
from u

Of course you need to adapt the select in case you have more columns.