Home > Back-end >  Removing duplicates from a table: syntax MariaDb error
Removing duplicates from a table: syntax MariaDb error

Time:11-30

I'm using this query to remove duplicates from phoenix_tasks table:

CREATE TEMPORARY TABLE temp
SELECT   `piva`
FROM     phoenix_tasks
GROUP BY `piva`
HAVING   COUNT(*) > 1;

DELETE FROM phoenix_tasks WHERE `piva` IN (SELECT `piva` FROM temp) AND taskStateId=1 ;


DROP TEMPORARY TABLE temp;

It works correctly in phpMyAdmin

But when the query is executed by web I receive this error message:

CREATE  TEMPORARY TABLE temp 
SELECT  piva
    FROM  phoenix_tasks
    GROUP BY  piva
    HAVING  COUNT (*) > 1;
DELETE
     FROM  phoenix_tasks
    WHERE  piva IN (
        SELECT  piva
            FROM  temp
                   )
      AND  taskStateId=1: You have an error in your SQL syntax;
check the manual that corresponds to your MariaDB server version for the right syntax to use near '
DELETE
     FROM  phoenix_tasks
    WHERE  piva IN (
        SELECT  piva
            FROM  temp
                   )
      AND  taskStat' at line 7

I found at https://mariadb.com/kb/en/useful-mariadb-queries/ this:

Removing Duplicates

CREATE TABLE t (id INT, f1 VARCHAR(2));

INSERT INTO t VALUES (1,'a'), (2,'a'), (3,'b'), (4,'a');

SELECT * FROM t t1, t t2 WHERE t1.f1=t2.f1 AND t1.id<>t2.id AND t1.id=(
  SELECT MAX(id) FROM t tab WHERE tab.f1=t1.f1
);
DELETE FROM t WHERE id IN (
  SELECT t2.id FROM t t1, t t2 WHERE t1.f1=t2.f1 AND t1.id<>t2.id AND t1.id=(
    SELECT MAX(id) FROM t tab WHERE tab.f1=t1.f1
  )
);

It seems to be exactly the query I'm trying to run (except for the presence in my query of temporary and the different construction of the temp table)

I'm not able why it has errors.

CodePudding user response:

It seems that your "web execution tool" does not like multiple SQL statements to be run in a single call.

I would try running all three, but one at a time.

CodePudding user response:

(The stated Question is already Answered -- run the statements separately. I'm addressing the task that led to the Question.)

Another way to de-dup a table:

CREATE TABLE pt_new LIKE phoenix_tasks;
ALTER TABLE pt_new
    ADD UNIQUE(piva);  -- specify the combination of columns that determine uniqueness
INSERT IGNORE INTO pt_new
    SELECT * FROM piva;
RENAME TABLE phoenix_tasks TO pt_old,
             pt_new TO phoenix_tasks;
DROP TABLE pt_old;

But, as already mentioned, run the queries one at a time.

Note: Before the DROP, you have a chance to verify that the process worked correctly.

Notes:

  • If two rows have different values in the other columns, this method blindly picks one row first, then tosses all dups based on UNIQUE(piva). You may be able to control which dup wins by adding an ORDER BY onto the SELECT.
  • Do you currently have a PRIMARY KEY? Would it make sense to replace it with PRIMARY KEY (piva)?
  • By doing the steps in this order, other connections never see the table missing or partially filled. The INSERT is slow; the RENAME is fast and atomic.
  • Related