Home > Blockchain >  Best way to combine two tables, remove duplicates, but keep all other non-duplicate values in SQL
Best way to combine two tables, remove duplicates, but keep all other non-duplicate values in SQL

Time:12-02

I am looking for the best way to combine two tables in a way that will remove duplicate records based on email with a priority of replacing any duplicates with the values in "Table 2", I have considered full outer join and UNION ALL but Union all will be too large as each table has several 1000 columns. I want to create this combination table as my full reference table and save as a view so I can reference it without always adding a union or something to that effect in my already complex statements. From my understanding, a full outer join will not necessarily remove duplicates. I want to:

a. Create table with ALL columns from both tables (fields that don't apply to records in one table will just have null values)

b. Remove duplicate records from this master table based on email field but only remove the table 1 records and keep the table 2 duplicates as they have the information that I want

c. A left-join will not work as both tables have unique records that I want to retain and I would like all 1000 columns to be retained from each table

I don't know how feasible this even is but thank you so much for any answers!

CodePudding user response:

try using a FULL OUTER JOIN between the two tables and then a COALESCE function on each resultset column to determine from which table/column the resultset column is populated

CodePudding user response:

If I understand your question correctly you want to join two large tables with thousands of columns that (hopefully) are the same between the two tables using the email column as the join condition and replacing duplicate records between the two tables with the records from Table 2.
I had to do something similar a few days ago so maybe you can modify my query for your purposes:

WITH only_in_table_1 AS(
SELECT *
FROM table_1 A
WHERE NOT EXISTS
(SELECT * FROM table_2 B WHERE B.email_field = A.email_field))
SELECT * FROM table_2
UNION ALL
SELECT * FROM only_in_table_1
  • Related