I'm currently generating random data and I'm not sure how to do this part. I have two tables. I have one entity table and a group table. Both tables have a unique identifiers (guid). My goal is to create a table mappings that matches entities X amount of times to a single group.
For example, let's say I have 10k entities and 100 groups and I would like to create 1k mappings. I would need to create 10 mappings per group (1000/100). I could go ahead and create a while loop and just get 10 random entities, but it is not efficient and I would like to avoid that. I also thought about cross join and then reduce the result to 1k, but I would like each group to have the exact same number of mappings.
Is there any way to do this with a single query ?
Here is an example, say I have this Group table
and Entity Table
---------
| GroupId |
---------
| G1 |
| G2 |
| G3 |
---------
----------
| EntityId |
----------
| E1 |
| E2 |
| E3 |
| E4 |
| E5 |
| E6 |
| E7 |
| E8 |
| E9 |
| E10 |
----------
I would like to get 6 mappings (so each group needs 2 mappings). Here is an example of what I could get:
--------- ----------
| GroupId | EntityId |
--------- ----------
| G1 | E2 |
| G1 | E4 |
| G2 | E9 |
| G2 | E10 |
| G3 | E2 |
| G3 | E5 |
--------- ----------
CodePudding user response:
This sounds like a good use for NTILE.
Start by using ROW_NUMBER() on the Groups
table to generate sequential "group numbers" (sort by NewID() for a relatively random order):
GroupId | GroupNum :------ | -------: G3 | 1 G2 | 2 G1 | 3
Then use NTILE() to distribute the Entity
rows into the desired number of groups:
E1 | GroupNum :-- | -------: E2 | 1 E10 | 1 E3 | 1 E4 | 1 E1 | 2 E6 | 2 E5 | 2 E7 | 3 E8 | 3 E9 | 3
Finally JOIN the results together on "group number":
SQL:
DECLARE @NumOfGroups INT; SET @NumOfGroups = ( SELECT COUNT(*) FROM Groups ); SELECT e.[E1], g.GroupId FROM ( SELECT *, NTILE(@NumOfGroups) OVER(ORDER BY NEWID()) AS GroupNum FROM Entities ) AS e INNER JOIN ( SELECT *, ROW_NUMBER() OVER(ORDER BY NEWID()) AS GroupNum FROM Groups ) AS g ON g.GroupNum = e.GroupNum ORDER BY e.GroupNum, e.E1 GO
Results:
E1 | GroupId |
---|---|
E1 | G1 |
E3 | G1 |
E5 | G1 |
E8 | G1 |
E2 | G3 |
E7 | G3 |
E9 | G3 |
E10 | G2 |
E4 | G2 |
E6 | G2 |
db<>fiddle here
NB: Obviously this approach won't work if there are less Entity
rows than Groups
, but it doesn't sound like that's an issue here