Build groups of two within a dataframe

Time:05-07

I would like to combine rows into groups of two within a dataframe, based on the ID, joining the text in the second column with a space. Using groupby() on ID alone joins all of the text for that ID. I would like to set the group size per ID myself. If the number of rows for an ID is not a multiple of the group size, the leftover rows should form a smaller group rather than being filled with rows from another ID.

import pandas as pd

d = {'ID': [0,0,0,1,1,1,1], 'col2': ['Car','Tree','House','Cat','Dog','Cloud','Bottle']}
df = pd.DataFrame(data=d)
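
For reference, here is a minimal sketch of the behaviour described above. The question does not show the attempted call, so the exact form is an assumption; grouping on ID alone and joining would look roughly like this:

```python
import pandas as pd

d = {'ID': [0, 0, 0, 1, 1, 1, 1],
     'col2': ['Car', 'Tree', 'House', 'Cat', 'Dog', 'Cloud', 'Bottle']}
df = pd.DataFrame(data=d)

# Assumed attempt: group on ID only, so every row of an ID ends up in one string.
joined_all = df.groupby('ID', as_index=False)['col2'].agg(' '.join)
print(joined_all)
```

This yields one row per ID, which is why an additional key that splits each ID into chunks is needed.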

#Expected Output

     ID  col2
0     0  'Car Tree'
1     0  'House'
2     1  'Cat Dog'
3     1  'Cloud Bottle'

CodePudding user response:

Create a sequential counter within each ID using cumcount, then floor-divide it by 2 (the desired group size) to get a partition number. Then group the dataframe by ID together with the partition number and aggregate col2 with join:

i = df.groupby('ID').cumcount() // 2
df.groupby(['ID', i], as_index=False)['col2'].agg(' '.join)

   ID          col2
0   0      Car Tree
1   0         House
2   1       Cat Dog
3   1  Cloud Bottle
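
Since the question asks for a configurable group size, the same idea can be wrapped in a small function. This is only a sketch: the name join_in_chunks, its parameters, and the option of passing a dict of per-ID sizes are illustrative assumptions, not part of the answer above.

```python
import pandas as pd

def join_in_chunks(df, size, key='ID', text='col2', sep=' '):
    """Hypothetical helper: join `text` in chunks of `size` rows per `key`.

    `size` is either one int for all IDs or a dict mapping ID -> size
    (assumption: every ID in the frame appears in the dict).
    """
    # One size per row, so each ID can use its own chunk length.
    per_row = df[key].map(size) if isinstance(size, dict) else size
    # Position of each row within its ID, floor-divided into a chunk number.
    chunk = (df.groupby(key).cumcount() // per_row).rename('chunk')
    joined = df.groupby([df[key], chunk])[text].agg(sep.join)
    # Drop the helper chunk level and turn the ID index back into a column.
    return joined.reset_index(level='chunk', drop=True).reset_index()

d = {'ID': [0, 0, 0, 1, 1, 1, 1],
     'col2': ['Car', 'Tree', 'House', 'Cat', 'Dog', 'Cloud', 'Bottle']}
df = pd.DataFrame(data=d)

print(join_in_chunks(df, 2))             # same grouping as the answer above
print(join_in_chunks(df, {0: 1, 1: 3}))  # a different chunk size for each ID
```

Because the chunk number is computed from cumcount within each ID, a leftover chunk is never topped up with rows from another ID.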