I need to delete duplicates from a table. I've tried to do this following exactly the code from this website as well as this
WITH cte (Id,
Proname,
Cityname,
Companyname,
ItemsNo,
row_num)
AS (SELECT
Id,
Proname,
Cityname,
Companyname,
ItemsNo,
ROW_NUMBER() OVER
(PARTITION BY
Id,
Proname,
Cityname,
Companyname
ORDER BY
Id,
Proname,
Cityname,
Companyname) AS row_num
FROM dba.tabdupes)
DELETE FROM cte
WHERE row_num > 1;
but every time I'm getting a syntax error in Teradata: [3707]syntax error expected something like a 'SELECT' keyword or '(' or a 'TRANSACTION TIME' keyword or a 'VALIDRIME' keyword between ')' and the 'DELETE' keyword.
I've tried multiple solutions but can't get what is wrong here.
CodePudding user response:
There is no way to DELETE
"all but one" row from a set of duplicate rows in a table. You can eliminate duplicates within the result set of a SELECT
, and INSERT
the result into a second table.
Teradata's special "SET" tables will also quietly remove entirely duplicate rows when you do an INSERT ... SELECT
though whether that performs well depends on how unique the primary index values are in the target table.
CodePudding user response:
Teradata is not SQL server.
You could try following approach
But before you run the cede
make a Backup of your database, or make da test database to run tessif the query does make that what you want
DELETE dba.tabdupes
FROM (
SELECT
Id, Proname, Cityname, Companyname, ItemsNo
, ROW_NUMBER() OVER
(PARTITION BY Id, Proname, Cityname, Companyname ORDER BY Id, Proname, Cityname, Companyname) AS row_num
FROM dba.tabdupes) AS d, edba.tabdupes t
WHERE t.id = d.id
AND row_num = 1