I'm trying to anonymize a single column in a database through data shuffle.
I created this query but when I run it it update the column FirstName
always with the same name:
BEGIN TRAN;
;WITH TCE_Shuffled
AS (
SELECT ROW_NUMBER() OVER (
ORDER BY [BusinessEntityID]
) AS ORIG_ROWNUM
,ROW_NUMBER() OVER (
ORDER BY NEWID()
) AS NEW_ROWNUM
,*
FROM [AdventureWorks2014].[Person].[Person]
)
UPDATE [AdventureWorks2014].[Person].[Person]
SET FirstName = t2.FirstName
FROM TCE_Shuffled t1
JOIN TCE_Shuffled t2 ON t1.ORIG_ROWNUM = t2.NEW_ROWNUM
SELECT *
FROM [AdventureWorks2014].[Person].[Person]
ROLLBACK TRAN
SELECT *
FROM [AdventureWorks2014].[Person].[Person]
Here you can see that the column FirstName
as the same value:
BusinessEntityID | PersonType | NameStyle | Title | FirstName | MiddleName | LastName |
---|---|---|---|---|---|---|
13353 | IN | 0 | NULL | Erik | M | Griffin |
13354 | IN | 0 | NULL | Erik | NULL | Diaz |
13355 | IN | 0 | NULL | Erik | J | Xu |
13356 | IN | 0 | NULL | Erik | NULL | Sun |
13357 | IN | 0 | NULL | Erik | NULL | Zhu |
13358 | IN | 0 | NULL | Erik | R | Gao |
13359 | IN | 0 | NULL | Erik | NULL | Hayes |
13360 | IN | 0 | NULL | Erik | NULL | Ma |
13361 | IN | 0 | NULL | Erik | G | Price |
13362 | IN | 0 | NULL | Erik | M | Liang |
What am I doing wrong?
CodePudding user response:
This is the correct syntax for the UPDATE
statement:
UPDATE t1
SET t1.FirstName = t2.FirstName
FROM TCE_Shuffled t1 JOIN TCE_Shuffled t2
ON t1.ORIG_ROWNUM = t2.NEW_ROWNUM;