I have two tables that have 3 fields with the same value and one relation field in the table1. Let me explain.
table1
------
id
column1
column2
column3
table2_id
table2
-------
id
column1
column2
column3
In table2; column1, column2 and column3 can have same value for different ids. For example:
table2
-------
1, 2, 3, 4
2, 2, 3, 4
3, 4, 5, 6
Ass you can see, only ids are different for the first two records. The others are the same. Also in table1:
table1
-------
1, 2, 3, 4, null
2, 4, 5, 6, null
I want to update table1's table2_id field(which are null in example) for only records in table1 that have one specific record for column1, column2, column3.
So I expect that:
table1
------
1, 2, 3, 4, null
2, 4, 5, 6, 3
first record will be still null because 2 different ids can be linked there. But for the second record, table2_id can only be '3'
How can I write the query for this update process?
CodePudding user response:
To account for possible null values you could use OUTER APPLY
update t1
set table2_id=t2_single.id
from table1 t1
outer apply (select max(t2.id)
from table2 t2
where t2.column1=t1.column1
and t2.column2=t1.column2
and t2.column3=t1.column3
group by t2.column1, t2.column2, t2.column3
having count(*)=1) t2_single(id);
CodePudding user response:
You can solve it with an additional table or CTE to identify duplicate or genuine records. A GROUP BY
or a join can be used to check the duplicates.
Please note that you also need to check for nulls in equations if the columns are nullable.
with GenuineData as (
select
min(t2.Id) id
from
Table2 t2
group by
t2.Column1,
t2.Column2,
t2.Column3
having
count(*) = 1
)
update t1 set
t1.Table2Id = t2.Id
from
Table1 t1
join Table2 t2 on t2.Column1 = t1.Column1 and t2.Column2 = t1.Column2 and t2.Column3 = t1.Column3
join GenuineData gd on gd.id = t2.Id
If the columns are nullable, you can use ISNULL() in the equations, or for better performance, an extended criteria for joining Table2 as below:
join Table2 t2 on
(t2.Column1 = t1.Column1 or (t1.Column1 is null and t2.Column1 is null))
and (t2.Column2 = t1.Column2 or (t1.Column2 is null and t2.Column2 is null))
and (t2.Column2 = t1.Column2 or (t1.Column2 is null and t2.Column2 is null))