I am exporting client data from a database into Salesforce. I have two tables (Fiddle shown below)
Table: SALESFORCE_CONTACTS
user_id | start_at | end_at | company_id |
---|---|---|---|
37483 | 2021-12-12 | NULL | 88 |
23982 | 2020-03-02 | 2021-04-02 | 169 |
Table: DB_CONCATS
user_id | start_at | end_at | company_id |
---|---|---|---|
37483 | 2021-12-12 | NULL | 88 |
23982 | 2020-03-02 | 2021-04-02 | 169 |
Every so often a column in DB_CONCATS
will change and on the export (every 2 hours) the Salesforce data will change. Right now I am exporting every single contact every 2 hours.
But I only want to export the records that do not match the Salesforce table. The only way I can think of is to write something like this:
SELECT
DB_CONTACTS.*
FROM
DB_CONTACTS
LEFT JOIN
SALESFORCE_CONTACTS ON DB_CONTACTS.user_id = SALESFORCE_CONTACTS.DB_USER_ID
WHERE
DB_CONTACTS.user_id <> SALESFORCE_CONTACTS.db_user_id
OR DB_CONTACTS.created_at <> SALESFORCE_CONTACTS.db_created_at
OR DB_CONTACTS.disabled_at <> SALESFORCE_CONTACTS.db_disabled_at
OR DB_CONTACTS.company_id <> SALESFORCE_CONTACTS.db_company_id
Great! That produces no records because the two tables are exactly alike at the moment. Once the end_at
in DB_CONCATS
is filled out a record will appear in the query and then, on the next extract, it will transfer it over to SALESFORCE_CONTACTS
The trouble is that the sales team is constantly adding new fields to transfer data to. So every time that is done I would have to add a line to the query above and it would be trouble to maintain. Is there an easier way to compare the two tables without so many lines of code?
http://sqlfiddle.com/#!9/3ca536/9
Thank you
CodePudding user response:
compare the schema and build a string to build the sproc / alter table programatically if they change it constantly. If the schema has changed you might throw all away and re-build the table and re-populate it from scratch, and if the schema has NOT changed:
SQL discards rows if you have any comparison against a NULL value. so <> won't cut it for such scenarios, I suggest the following as long as USER_ID is never null and is UNIQUE in both tables:
declare @a table (user_id int, start_at datetime, end_at datetime, company_id int);
declare @b table (user_id int, start_at datetime, end_at datetime, company_id int)
insert into @a values(37483,'2021-12-12', NULL, 88)
insert into @a values(23982,'2020-03-02', '2021-04-02', 169)
insert into @b values(37483,'2021-12-12', NULL, 88)
insert into @b values(23982,'2020-03-02', '2021-04-02', 169)
update @b set start_at = null where user_id = 37483
SELECT
*
FROM
@a a
full join @b b on a.user_id = b.user_id
WHERE NOT EXISTS(
select a.start_at, a.end_at, a.company_id
intersect
select b.start_at, b.end_at, b.company_id
)
CodePudding user response:
In the subquery, you will find the records that matches both table, from there you will exclude the records from db_contacts (making sure that the record that does not matches comes from db_contacts):
-- create a salesforce_contacts table
CREATE TABLE SALESFORCE_CONTACTS (
user_id integer,
created_at date,
disabled_at date,
company_id integer
);
-- insert some values into salesforce_contacts
INSERT INTO SALESFORCE_CONTACTS
VALUES
(37483,'2021-12-12',NULL,88),
(23982,'2020-03-02',NULL,169),
(12345,'2022-01-01',NULL,178)
;
-- create a db_contacts table
CREATE TABLE DB_CONTACTS (
user_id integer,
created_at date,
disabled_at date,
company_id integer
);
-- insert some values into db_contacts
INSERT INTO DB_CONTACTS
VALUES
(37483,'2021-12-12',NULL,88),
(23982,'2020-03-02','2020-04-01',169)
;
-- your sql goes like this , i tested it on https://www.mycompiler.io/new/sql
SELECT *
FROM DB_CONTACTS AS d1
EXCEPT
SELECT *
FROM (
SELECT *
FROM DB_CONTACTS AS d2
INTERSECT
SELECT *
FROM SALESFORCE_CONTACTS AS s1
) as ref