I have a table (Table 1) which is a list of employees and data points about them. The data that populated Table 1 is regularly changed and added to in its source system and then is loaded into another table (Table 2) via a CSV.
If data changes for a certain employee, I want to update one column in that employees record in Table 1 to mark that it has since changed and then add a new row to the table with that employees changed data along with any new employees from Table 2 that didn't already exist in Table 1.
Example:
Table1:
Status | ID | Name | Department | Title
Active 767 John Tech Analyst
Active 789 Alex Tech Courier
Table2:
Status | ID | Name | Department | Title
Active 767 John Tech Director -- the title changed for this employee
Active 789 Alex Tech Courier
Desired Output of Table1:
Status | ID | Name | Department | Title
Active 767 John Tech Director -- Updated Employee data from Table2
Changed 767 John Tech Analyst -- Old Employee Data from Table1
Active 789 Alex Tech Courier
With the query below, I am attempting to update Table1 via an outer join which in theory should append any rows that do not meet the where criteria to the table as a separate row and join on all rows that do meet it. However, the employees with changed data are not being added to the table, only the status is being updated. I want to add employees who have changed their any of their departments, titles, etc. and any new employees that don't already exist in Table1
How can I ensure employees whose data have changed are added as new rows while joining employees that have not.
Query:
UPDATE [Database].[dbo].[CBC_legacy]
SET
[Position Status] =[CBC].[Position Status]
,[Employee ID] = [CBC].[Employee ID]
,[Full Legal Name] = [CBC].[Full Legal Name]
,[Hire/Rehire Date] = [CBC].[Hire/Rehire Date]
,[Annual Salary] = [CBC].[Annual Salary]
,[Job Title] = [CBC].[Job Title]
,[Kamsa Job Code] = [CBC].[Kamsa Job Code]
,[Home Department] = [CBC].[Home Department]
,[Worked In Country] = [CBC].[Worked In Country]
,[Budget ID] = [CBC].[Budget ID]
,[Work Location] = [CBC].[Work Location]
,[Regular Pay Currency] = [CBC].[Regular Pay Currency]
FROM [Database].[dbo].[CBC_legacy] AS CBCL
FULL OUTER JOIN [Database].[dbo].[CBC_data] CBC
ON CBCL.[Employee ID] = CBC.[Employee ID]
WHERE [CBC].[Employee ID] = [CBCL].[Employee ID]
AND [CBC].[Job Title] = [CBCL].[Job Title]
AND [CBC].[Home Department] = [CBCL].[Home Department]
AND [CBC].[Annual Salary] = [CBCL].[Annual Salary]
SELECT * FROM [Database].[dbo].[CBC_legacy] ORDER BY [Full Legal Name]
CodePudding user response:
You need 2 queries, one for the update and one for the insert.
I assume that only Department and Title can change in this example, if more columns can change than just add them to the where clause
update t1
set t1.Status = 'Changed'
from table1 t1
inner join table2 t2 on t1.id = t2.id
where (t2.Department <> t1.Department
or
t2.Title <> t1.Title
)
insert into table1
select t2.Status, t2.Id, t2.Name, t2.Department, t2.Title
from table2 t2
left join table1 t1 on t2.id = t1.id
and ( t2.Department <> t1.Department
or
t2.Title <> t1.Title
)
where not exists ( select 1
from table1 t1
where t1.id = t2.id
and t1.status = t2.status
and t1.name = t2.name
and t1.department = t2.department
)
select * from table1 order by id, status
Click on this DBFiddle to see it working
The result is
Status | id | name | department | title |
---|---|---|---|---|
Active | 767 | John | Tech | Director |
Changed | 767 | John | Tech | Analist |
Active | 787 | Alex | Tech | Courier |
Active | 800 | Picard | Tech | Captain |