What I mean is, I have table with a "list" column. The data that goes into the "list" is related to addresses, so I sometimes get repeated zip codes for one record in that field.
For example, "12345,12345,12345,12456".
I want to know if it's possible to construct a query that would find the records that have an unknown string that duplicates within the field, such that I would get the records like "12345,12345,12345,12456", but not ones like "12345,45678,09876".
I hope that makes sense.
CodePudding user response:
Yes, it is possible. You need to use a numbers table to convert your delimited string into rows, then use group by to find duplicates, e.g.
CREATE TABLE T (ID INT, List VARCHAR(100));
INSERT INTO T (ID, List)
VALUES (1, '12345,12345,12345,12456'), (2, '12345,45678,09876');
SELECT
T.ID,
SUBSTRING_INDEX(SUBSTRING_INDEX(T.list, ',', n.Number), ',', -1) AS ListItem
FROM T
INNER JOIN
( SELECT 1 AS Number UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
) AS n
ON CHAR_LENGTH(T.list)-CHAR_LENGTH(REPLACE(T.list, ',', ''))>=n.Number-1
GROUP BY T.ID, ListItem
HAVING COUNT(*) > 1;
If you don't have a numbers table you can create one in a derived query as I have above with UNION ALL
With that being said, this is almost certainly not the right way to store your data, you should instead use a child table, e.g.
CREATE TABLE ListItems
(
MainTableId INT NOT NULL, --Foreign Key to your current table
ItemName VARCHAR(10) NOT NULL -- Or whatever data type you need
);
Then your query is much more simple:
SELECT T.ID, li.ItemName
FROM T
INNER JOIN ListItems AS li
ON li.MainTableId = T.ID
GROUP BY T.ID, li.ItemName
HAVING COUNT(*) > 1;
If you need to recreate your original format, this is easily done with GROUP_CONCAT()
:
SELECT T.ID,
GROUP_CONCAT(li.ItemName) AS List
FROM T
INNER JOIN ListItems AS li
ON li.MainTableId = T.ID
GROUP BY T.ID;
CodePudding user response:
This query list out duplicate record in table.
SELECT PK,cs.Value,count(*)
FROM your_table
CROSS APPLY STRING_SPLIT (your_column , ',') cs
GROUP BY PK,cs.Value
HAVING COUNT(*) >1;
CROSS APPLY STRING_SPLIT used for split the column value with mentioned value and return as row
CodePudding user response:
I am still unclear what your desired result is based on your question however if it is simply to get all rows where there is a duplicate entry in column list you could do the following:
SELECT * FROM TABLE
WHERE COLUMN IN
(SELECT COLUMN FROM TABLE
having count(*) >1)