Home > other >  Create n groups from a csv file
Create n groups from a csv file

Time:10-21

I have a csv file with the following structure

Name Position
Albert Producer
John Director
Anna Volunteer
Gaston Tech
other 51 names other 51 positions

I need to create random groups where each group has one member from each position. There are only four positions across the dataset. The final result would be something like

Name Position Group
Albert Producer Group 1
John Director Group 1
Anna Volunteer Group 1
Gaston Tech Group 1
Martin Producer Group 2
Vanessa Director Group 2
Charles Volunteer Group 2
Milan Tech Group 2

What is the most efficient way to create these groups?

Thank you in advance!

CodePudding user response:

You can use two groupby. One to shuffle the Positions using sample(frac=1), and the other to assign the groups:

(df.groupby('Position').pipe(lambda d: d.sample(frac=1))
   .assign(Group=lambda d: d.groupby('Position').cumcount().add(1))
   .sort_values(by=['Group', 'Position']) # optional
)

example output:

      Name   Position  Group
1     John   Director      1
4   Martin   Producer      1
3   Gaston       Tech      1
6  Charles  Volunteer      1
5  Vanessa   Director      2
0   Albert   Producer      2
7    Milan       Tech      2
2     Anna  Volunteer      2

  • Related