Is it possible to query MySQL to get only fields that contain duplicate/repeating strings?-CodePudding

What I mean is, I have table with a "list" column. The data that goes into the "list" is related to addresses, so I sometimes get repeated zip codes for one record in that field.

For example, "12345,12345,12345,12456".

I want to know if it's possible to construct a query that would find the records that have an unknown string that duplicates within the field, such that I would get the records like "12345,12345,12345,12456", but not ones like "12345,45678,09876".

I hope that makes sense.

CodePudding user response：

Yes, it is possible. You need to use a numbers table to convert your delimited string into rows, then use group by to find duplicates, e.g.

CREATE TABLE T (ID INT, List VARCHAR(100));
INSERT INTO T (ID, List)
VALUES (1, '12345,12345,12345,12456'), (2, '12345,45678,09876');

SELECT
  T.ID,
  SUBSTRING_INDEX(SUBSTRING_INDEX(T.list, ',', n.Number), ',', -1) AS ListItem
FROM T
    INNER JOIN
    (   SELECT 1 AS Number UNION ALL
        SELECT 2 UNION ALL 
        SELECT 3 UNION ALL
        SELECT 4 UNION ALL 
        SELECT 5
    ) AS n 
        ON CHAR_LENGTH(T.list)-CHAR_LENGTH(REPLACE(T.list, ',', ''))>=n.Number-1
GROUP BY T.ID, ListItem
HAVING COUNT(*) > 1;

If you don't have a numbers table you can create one in a derived query as I have above with UNION ALL

Example on DB Fiddle

With that being said, this is almost certainly not the right way to store your data, you should instead use a child table, e.g.

CREATE TABLE ListItems
(
      MainTableId INT NOT NULL, --Foreign Key to your current table
      ItemName VARCHAR(10) NOT NULL -- Or whatever data type you need
);

Then your query is much more simple:

SELECT T.ID, li.ItemName
FROM    T
        INNER JOIN ListItems AS li
            ON li.MainTableId = T.ID
GROUP BY T.ID, li.ItemName
HAVING COUNT(*) > 1;

If you need to recreate your original format, this is easily done with GROUP_CONCAT():

SELECT T.ID, 
        GROUP_CONCAT(li.ItemName) AS List
FROM    T
        INNER JOIN ListItems AS li
            ON li.MainTableId = T.ID
GROUP BY T.ID;

Example on DB Fiddle

CodePudding user response：

This query list out duplicate record in table.

SELECT  PK,cs.Value,count(*) 
FROM your_table
CROSS APPLY STRING_SPLIT (your_column , ',') cs
GROUP BY PK,cs.Value 
HAVING COUNT(*) >1;

CROSS APPLY STRING_SPLIT used for split the column value with mentioned value and return as row

CodePudding user response：

I am still unclear what your desired result is based on your question however if it is simply to get all rows where there is a duplicate entry in column list you could do the following:

SELECT * FROM TABLE
WHERE COLUMN IN
   (SELECT COLUMN FROM TABLE 
   having count(*) >1)