I need to delete duplicates from a table. I've tried to do this by following exactly the code from this website, as well as this:

    WITH cte (Id, 
    Proname, 
    Cityname, 
    Companyname,
    ItemsNo,
    row_num) 
    AS (SELECT 
    Id, 
    Proname, 
    Cityname, 
    Companyname,
    ItemsNo,
    ROW_NUMBER() OVER 
    (PARTITION BY 
    Id, 
    Proname, 
    Cityname, 
    Companyname
    ORDER BY 
    Id, 
    Proname, 
    Cityname, 
    Companyname) AS row_num
    FROM dba.tabdupes)
    DELETE FROM cte
    WHERE row_num > 1;

but every time I get a syntax error in Teradata: [3707] syntax error, expected something like a 'SELECT' keyword or '(' or a 'TRANSACTION TIME' keyword or a 'VALIDTIME' keyword between ')' and the 'DELETE' keyword.

I've tried multiple solutions but can't figure out what is wrong here.

CodePudding user response：

There is no way to DELETE "all but one" row from a set of entirely duplicate rows in a table, because no WHERE condition can tell identical rows apart. You can eliminate duplicates within the result set of a SELECT, and INSERT the result into a second table. Teradata's special "SET" tables will also quietly remove entirely duplicate rows when you do an INSERT ... SELECT, though whether that performs well depends on how unique the primary index values are in the target table.
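
As a rough sketch of the SELECT-then-INSERT route, something like the following might work. The staging table name `dba.tabdupes_dedup` is made up for illustration, and ordering by `ItemsNo` is only an assumption about which row of each group you want to keep; neither comes from the question.

```sql
-- Hypothetical staging table: keeps one row per duplicate group.
CREATE TABLE dba.tabdupes_dedup AS (
    SELECT Id, Proname, Cityname, Companyname, ItemsNo
    FROM dba.tabdupes
    -- QUALIFY filters on the window function result;
    -- ORDER BY ItemsNo is an assumed tie-breaker.
    QUALIFY ROW_NUMBER() OVER (
        PARTITION BY Id, Proname, Cityname, Companyname
        ORDER BY ItemsNo
    ) = 1
) WITH DATA;

-- After checking the staging table, swap the rows back.
DELETE FROM dba.tabdupes;

INSERT INTO dba.tabdupes (Id, Proname, Cityname, Companyname, ItemsNo)
SELECT Id, Proname, Cityname, Companyname, ItemsNo
FROM dba.tabdupes_dedup;

DROP TABLE dba.tabdupes_dedup;
```

Inspect the row counts in the staging table before emptying the original, since the DELETE in the second step removes everything from `dba.tabdupes`.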

CodePudding user response：

Teradata is not SQL Server. As the error message says, it expects a SELECT after the CTE, not a DELETE.

You could try the following approach.

But before you run the code, make a backup of your database, or create a test database to check that the query does what you want.

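    -- Assumes ItemsNo differs between the rows of a duplicate group;
    -- rows identical in every column cannot be told apart this way.
    DELETE FROM dba.tabdupes
    WHERE (Id, Proname, Cityname, Companyname, ItemsNo) IN (
        SELECT Id, Proname, Cityname, Companyname, ItemsNo
        FROM dba.tabdupes
        -- row_num > 1 marks every row except the first of each group
        QUALIFY ROW_NUMBER() OVER (
            PARTITION BY Id, Proname, Cityname, Companyname
            ORDER BY ItemsNo) > 1
    );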