Let's say I have these two tables. Where I insert employees to employee
table coming from the staging
table.
staging table:
id | employee_id | name |
---|---|---|
1 | 12 | Paul |
2 | 13 | Kyle |
employee table
id | employee_id | name |
---|---|---|
5 | 4 | Will |
6 | 13 | Kyle |
Now, on the employee table let's say I'd like to copy what's on my staging table currently, using the INSERT SELECT INTO
statement, Paul will be inserted but I don't want Kyle to be inserted since he's on the employee table already(employee.employee_id is the defining column).
I know this could be done by just setting a unique or primary key, on employee_id
and just using the statement ON DUPLICATE KEY UPDATE
then do nothing by just setting them back to their original values.
I'm new to SQL, and I'm stuck with the solution setting a UNIQUE
key and ON DUPLICATE KEY UPDATE
statement, but I'd like to know how to do this without that solution I mentioned?
CodePudding user response:
First of all, you should keep in mind that the decision whether to create unique or primary keys or not does not depend on how to create insert statements or such. It's a matter of what your table should do and what not. In order to achieve your goal, you can add a where to your insert statement which excludes the already existing entries, as example:
INSERT INTO employees (id, employee_id, name)
SELECT id, employee_id, name
FROM staging
WHERE employee_id NOT IN (SELECT employee_id FROM employees)
CodePudding user response:
Break the problem down into its constituent parts
- get a list of employees who are in the staging table but not in the target table
- insert those records only
Part 1 can be achieved with a Left Join and Where condition that the target table is null. Wrap that up as a CTE and then use it in 2)