I want to randomly match elements of a table by some condition (1:1 and symmetric pairing, so if user x matches with user y, then user y matches with user x)
WITH tbl AS (
SELECT 1 AS user, 'A' AS condition
UNION ALL
SELECT 2 AS user, 'A' AS condition
UNION ALL
SELECT 3 AS user, 'A' AS condition
UNION ALL
SELECT 4 AS user, 'B' AS condition
UNION ALL
SELECT 5 AS user, 'B' AS condition
UNION ALL
SELECT 6 AS user, 'B' AS condition
UNION ALL
SELECT 7 AS user, 'B' AS condition
UNION ALL
SELECT 8 AS user, 'B' AS condition
UNION ALL
SELECT 9 AS user, 'B' AS condition
)
SELECT
user,
condition
FROM tbl
How can I generate a query to:
- randomly match all users in condition B with another user in condition B
- do the same with condition A, but leave one condition A user unmatched because there are an odd number of users with condition A
- identifies the unmatched user (1, 2, or 3) with a NULL match
Hypothetical result:
user | condition | match |
---|---|---|
1 | A | 3 |
2 | A | NULL |
3 | A | 1 |
4 | B | 7 |
5 | B | 9 |
6 | B | 8 |
7 | B | 4 |
8 | B | 6 |
9 | B | 5 |
CodePudding user response:
Consider below approach
create temp table temp as
select _0 as user, condition, _1 as match from (
select user, condition, div(offset, 2) grp, mod(offset, 2) pos
from (
select condition, array_agg(user order by rand()) users
from your_table
group by condition
), unnest(users) user with offset
)
pivot (any_value(user) for pos in (0,1));
select * from (
select * from temp union all
select match, condition, user from temp
)
where not user is null;
if applied to sample data in your question - output is
As you can see above solution requires scripting to be involved - so below is slightly refactored version that allows you to achive same result with just one "simple" query
select * from (
select if(grp < 0, _0, _1) as user, condition, if(grp < 0, _1, _0) as match from (
select user, condition, grp, mod(offset, 2) pos
from (
select condition, array_agg(user order by rand()) users
from your_table
group by condition
), unnest(users) user with offset, unnest([div(offset 2, 2), -1 * div(offset 2, 2)]) grp
)
pivot (any_value(user) for pos in (0,1))
)
where not user is null