Home > Back-end >  Select based on unique pair of values
Select based on unique pair of values

Time:07-21

i have a problem with a query that i must do in order to filter the data from a table. Suppose to have the following table:

id col1 col2
1 12345 6789
2 6789 12345
3 3333 4213
4 12345 6789
5 6789 12345
6 1111 2222
7 12345 6789

if i execute the following query:

select * from  myTable
where (col1,col2) 
in(select col1, col2 from myTable group by col1,col2
having count(*) >= 4 and count(*) <= 100000 order by count(*) desc)

i would miss the unique pair "12345-6789" since it appear like that

count col1 col2
3 12345 6789
2 6789 12345

i've tried to search for a solution and i've found the following query:

SELECT CONCAT(LEAST(col1, col2), ',', GREATEST(col1, col2)) pair, 
       COUNT(*) count
FROM myTable
having count(*) >= 4 and count(*) <= 100000
GROUP BY pair

the problem with this query is that the outer select won't work cause the output is like:

count pair
5 12345,6789

So my question is: is there a way to write the following query

select * from myTable where (col1,col2) in ( unique pair of values from col1 e col2 that are counted independetly from the order in which they appear)

so that i can have the following output:

id col1 col2
1 12345 6789
2 6789 12345
4 12345 6789
5 6789 12345
7 12345 6789

EDIT: the only thing i can think of is to split the process and get the first part of the result using the LEAST/GREATEST query, then in my code split the values and build 2 query (or 1 with a union)

CodePudding user response:

You can use the functions LEAST() and GREATEST() in the WHERE clause to create sorted pairs to compare with the pairs returned by the subquery:

SELECT *
FROM myTable
WHERE (LEAST(col1, col2), GREATEST(col1, col2)) IN (
  SELECT LEAST(col1, col2) c1, GREATEST(col1, col2) c2
  FROM myTable
  GROUP BY c1, c2
  HAVING COUNT(*) >= 4 and COUNT(*) <= 100000
);

This works in MariaDB and in MySql 5.5 and 5.6, but surprisingly does not work in MySql 5.7 (should be an unnoticed bug?).

If you change to a join instead of the operator IN then it works fine both in MySql and MariaDB:

SELECT t1.*
FROM myTable t1 
INNER JOIN (
  SELECT LEAST(col1, col2) c1, GREATEST(col1, col2) c2
  FROM myTable
  GROUP BY c1, c2
  HAVING COUNT(*) >= 4 and COUNT(*) <= 100000
) t2 ON (t2.c1, t2.c2) = (LEAST(col1, col2), GREATEST(col1, col2));

See the demo.

  • Related