Home > Enterprise >  How to delete duplicate rows if they are odd else keep one
How to delete duplicate rows if they are odd else keep one

Time:12-16

Its a card like game. my table is

Number Symbol Player
1 C F
1 S F
1 D F
1 H F
2 S F
2 C F
2 D F
3 H F
2 H S
3 S S

I am trying to remove all the 1 cards from Player F because he have an even number of those cards and he have collect all the cards. And i want to remove only 2 rows of the 2 cards from Player F because he have an odd number of those and the last 2 its on the S player

I am trying to create a procedure for this i only manage to keep one row of the player with this

PROCEDURE `deleteDupl`()
BEGIN
    DELETE c1 FROM cards c1, cards c2 WHERE c1.Symbol > c2.Symbol AND c1.Number = c2.Number AND c1.Player = c2.Player;
 
END

--edit the point of the game is to pick cards from your opponent and once you have 2 same cards(the number of card not the symbol) you drop them (no matter what symbol just random drop 2 of the same numbers)

but in the start of the game you might get more than 2 of the same cards like the F player have all the aces so he have to drop them all

or like the F player have three times the 2 card he must drop two cards (no matter what symbol) until he pick the card with number 2 from his opponent

CodePudding user response:

You have not said which MySQL version you are running. This stored procedure example works on MySQL 5.6. It runs a simple GROUP BY query to get all Number, Player groups with more than 1 card. It then loops over the cursor and runs a delete for each row returned.

CREATE PROCEDURE `sp_DeletePairs`()
BEGIN
  DECLARE done      BOOLEAN DEFAULT FALSE;

  DECLARE _number   TINYINT UNSIGNED;
  DECLARE _player   CHAR(20);
  DECLARE _count    TINYINT UNSIGNED;

  DECLARE `cur` CURSOR FOR
    SELECT `Number`, `Player`, COUNT(*) AS `num`
    FROM `cards`
    GROUP BY `Number`, `Player`
    HAVING `num` > 1;

  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

  OPEN cur;

  read_loop: LOOP

    FETCH cur INTO _number, _player, _count;
    IF done THEN
      LEAVE read_loop;
    END IF;

    CASE
      WHEN _count IN (2, 3) THEN
        DELETE FROM `cards` WHERE `Number` = _number AND  `Player` = _player LIMIT 2;
      WHEN _count = 4 THEN
        DELETE FROM `cards` WHERE `Number` = _number AND  `Player` = _player LIMIT 4;
    END CASE;
  END LOOP;

  CLOSE cur;
END

Obviously, you can wrap the following DELETE query examples in stored procedures if you so desire.

If you are on MySQL 8.0 or later you can do it using window functions -

WITH `stats` AS (
    SELECT `Number`, `Symbol`, `Player`,
        ROW_NUMBER() OVER (PARTITION BY `Number`, `Player` ORDER BY `Player`, `Number`, `Symbol`) AS `seq`,
        COUNT(*) OVER (PARTITION BY `Number`, `Player`) AS `count_numbers`
    FROM cards
)
DELETE `c`
FROM `cards` `c`
INNER JOIN `stats` `s`
    ON `c`.`Number` = `s`.`Number`
    AND `c`.`Symbol` = `s`.`Symbol`
    AND `c`.`Player` = `s`.`Player`
WHERE `s`.`count_numbers` = 4
OR (`s`.`count_numbers` IN (2, 3) AND `s`.`seq` IN (1, 2));

Within the CTE, the ROW_NUMBER() is giving us a cumulative count within the Number, Player PARTITION. The COUNT(*) is giving us the total within the Number, Player PARTITION. We can then join between stats (the CTE) and cards on all three of the original columns. Finally, we use the WHERE clause to decide which cards to remove.

A similar approach can be taken in MySQL < 8.0, using variables for the sequence and a join to another derived table to get the count per group -

DELETE `c`
FROM `cards` `c`
INNER JOIN (
    SELECT
        `c`.`Number`,
        `c`.`Symbol`,
        `c`.`Player`,
        IF(@prev_number = `c`.`Number` AND @prev_player = `c`.`Player`, @row := @row   1, @row := 1) AS `seq`,
        `counts`.`count_numbers`,
        @prev_number := `c`.`Number`,
        @prev_player := `c`.`Player`
    FROM `cards` `c`
    JOIN (SELECT @row := 0, @prev_number := 0, @prev_player:=0) t
    INNER JOIN ( SELECT `Player`, `Number`, COUNT(*) AS `count_numbers` FROM `cards` GROUP BY `Player`, `Number`) AS `counts`
        ON `c`.`Player` = `counts`.`Player`
        AND `c`.`Number` = `counts`.`Number`
    ORDER BY `c`.`Player`, `c`.`Number`
) `s`
    ON `c`.`Number` = `s`.`Number`
    AND `c`.`Symbol` = `s`.`Symbol`
    AND `c`.`Player` = `s`.`Player`
WHERE `s`.`count_numbers` = 4
OR (`s`.`count_numbers` IN (2, 3) AND `s`.`seq` IN (1, 2));

I am definitely not suggesting using this last example, at least not in a production environment. I just included as it might be interesting to someone.

  • Related