I have a customer master table called Customer
. Everything in this table comes from a key/value style table called Cust_Property
.
The Cust_Property
table has 3 columns:
CustomerID, Property, Value
The Property
column may contain First_Name
with a value of John
. Sort of like a pre-pivoted table. I need to update the columns in Customer
table with the values of the associated Property
columns in Cust_Property
table.
Rules
- If there is a new
CustomerID
in theCust_Property
table, it will need to be added as a new row to theCustomer
table, as well as all appropriate properties. - All data in the
Customer
table will also be in theCust_Property
table. That means that not every record needs to be updated. Only those that have changed or are new. - Records are only added/updated in
Customer
table, not removed. - There are properties in the
Property
table where a corresponding column doesn't exist in theCustomer
table, so those are just ignored.
DDL
CREATE TABLE #Customer
(
Customerid int,
FirstName varchar(50),
LastName varchar(50),
Address1 varchar(100),
Address2 varchar(100),
Address3 varchar(100)
)
CREATE TABLE #Cust_Property
(
CustomerID int,
Property varchar(50),
Value varchar(50)
)
INSERT INTO #Customer (Customerid, FirstName, LastName, Address1, Address2, Address3)
VALUES(1, N'John', N'Smith', N'123 happy lane', NULL, NULL);
INSERT INTO #Customer (Customerid, FirstName, LastName, Address1, Address2, Address3)
VALUES(2, N'Dwight', N'Schrute', N'33 1st Ave', N'Apt 5', NULL);
INSERT INTO #Customer (Customerid, FirstName, LastName, Address1, Address2, Address3)
VALUES(3, NULL, NULL, NULL, NULL, NULL);
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(3, N'First_Name', N'Michael');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(3, N'Last_Name', N'Scott');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'First_Name', N'Jim');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'Last_Name', N'Halpert');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'Address1', N'644 Scranton Rd');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(8, N'Nickname', N'Jimmy');
INSERT INTO #Cust_Property (CustomerID, Property, Value)
VALUES(1, N'First_Name', N'John');
Table screenshots:
Customer
: https://i.imgur.com/yePz1Di.pngCust_Property
: https://i.imgur.com/YwvZExx.png
Desired Customer
table end result:
Customer
3'sFirst_Name
andLast_Name
columns are updated- Add customer 8 to
Customer
table b/c it doesn't already exist - Update all of customer 8's properties except property
Nickname
because that doesn't exist in theCustomer
table - Ignore the
First_Name
property forCustomerID = 1
because it is the same in theCustomer
table, so no update needed.
My current method: first find and insert new CustomerID
s
INSERT INTO #Customer (CustomerID)
SELECT DISTINCT CustomerID
FROM #Cust_Property a
WHERE NOT EXISTS (SELECT * FROM #Customer x
WHERE a.CustomerID = x.CustomerID)
Then update properties
UPDATE #Customer
SET #Customer.FirstName = a.Value
FROM #Cust_Property a
WHERE #Customer.CustomerID = a.CustomerID
AND a.Property = 'First_Name'
UPDATE #Customer
SET #Customer.LastName = a.Value
FROM #Cust_Property a
WHERE #Customer.CustomerID = a.CustomerID
AND a.Property = 'Last_Name'
UPDATE #Customer
SET #Customer.Address1 = a.Value
FROM #Cust_Property a
WHERE #Customer.CustomerID = a.CustomerID
AND a.Property = 'Address1'
UPDATE #Customer
SET #Customer.Address2 = a.Value
FROM #Cust_Property a
WHERE #Customer.CustomerID = a.CustomerID
AND a.Property = 'Address2'
UPDATE #Customer
SET #Customer.Address3 = a.Value
FROM #Cust_Property a
WHERE #Customer.CustomerID = a.CustomerID
AND a.Property = 'Address3'
In my actual tables, there are hundreds of different properties in Cust_Properties
table, 40 columns in the Customer
table that need updating and ~2M customer records.
Is there a better way that separate update statements for each? At present, I can't use an ETL tool, though I could technically use Python if that would be more efficient/faster.
CodePudding user response:
maybe one update like -
UPDATE c
set c.[FirstName] = isnull(cp.[First_Name],c.[FirstName])
,c.[LastName] = isnull(cp.[Last_Name],c.[FirstName])
,c.[Address1] = isnull(cp.[Address1],c.[Address1])
FROM #Customer c
INNER JOIN (
SELECT *
FROM #Cust_Property tb
pivot(
max(value)
for Property in ( [First_Name],[Last_Name],[Address1],[Nickname])
)pv
)cp
on c.Customerid = cp.CustomerID