I am trying to build a dimension table which will identify current and historical data, by using the fields 'IsActive' and 'EffectiveDate'.
IsActive of 1 = Active.
EffectiveDate = Date the record was ingested.
Scenario: I have an existing record with the current view of the employee, however if any information about the employee changes, instead of updating the existing record, I would like to create a new line item with the updated record becoming active and the previous record becoming inactive.
ID | Name | Surname | Age | IsActive | EffectiveDate |
---|---|---|---|---|---|
1 | John | Doe | 54 | 1 | 2021-01-01 |
When a change to the employee data is done, I would like to update the table as follows:
ID | Name | Surname | Age | IsActive | EffectiveDate |
---|---|---|---|---|---|
1 | John | Doe | 54 | 0 | 2021-01-01 |
2 | John | Doe | 64 | 1 | 2021-06-25 |
I am using the combination of 'Name', 'Surname', and 'Age' to identify unique records. The reason I am using these three fields is because there is no other information provided to uniquely identify an employee.
I would really appreciate any assistance.
CodePudding user response:
You could do this yourself, but I would give
- Source data:
Code:
Create procedure sp_employee AS SET NOCOUNT ON; BEGIN --update Active flag of existing data MERGE dim_employee AS Target USING employee_data AS Source ON Source.Name = Target.Name and Source.Surname = Target.Surname WHEN MATCHED THEN UPDATE SET Target.IsActive = 0; --Insert new data (assuming the table has only changed data set) Insert into dim_employee select Name, Surname, Age, IsActive, EffectiveDate from employee_data; End GO
If you have old records in the source table along with the new change set, you can use the below code to insert only the change dataset.
Insert into dim_employee select Name, Surname, Age, IsActive, EffectiveDate from employee_data except select distinct a.Name, a.Surname, a.Age, a.IsActive, a.EffectiveDate from employee_data a inner join dim_employee b on a.Name = b.Name and a.Surname = b.Surname and a.Age = b.Age;
Execute stored procedure
Exec sp_employee;
After executing stored procedure: Here IsActive column data for ‘John Doe’ got updated to 0 and a new record for ‘John Doe’ got inserted from employee_data table.