How to Group By on a list of strings in pandas?

Time:12-23

My Data

Column 1       Column 2                              Column 3
"Task 1"       ["emailofowner1","emailofowner2"]      John Doe
"Task 37"      ["emailofowner1","emailofowner2"]      John Doe

I have many such rows. I want my output to be:

Column 1                 Column 2                            Column 3
["Task 1","Task 37"]     ["emailofowner1","emailofowner2"]     John Doe

CodePudding user response:

groupby requires hashable keys, and lists aren't hashable.

You can convert each list to a tuple and use that as the grouper:

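out = (df
  # group on a tuple copy of 'Column 2', since the lists themselves are unhashable
  .groupby(df['Column 2'].apply(tuple), as_index=False)
  # collect the tasks into a list; keep the first owner list and name of each group
  .agg({'Column 1': list, 'Column 2': 'first', 'Column 3': 'first'})
)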

Output:

            Column 1                        Column 2  Column 3
0  [Task 1, Task 37]  [emailofowner1, emailofowner2]  John Doe
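
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt from the question, and the third row ("Task 5", "emailofowner3", "Jane Roe") is a made-up addition so that there is more than one group. It also assumes you are happy to drop the tuple keys afterwards with reset_index(drop=True) rather than using as_index=False.

```python
import pandas as pd

# Sample frame modelled on the question; the third row is a hypothetical
# extra group, added only so the result has more than one row.
df = pd.DataFrame({
    "Column 1": ["Task 1", "Task 37", "Task 5"],
    "Column 2": [
        ["emailofowner1", "emailofowner2"],
        ["emailofowner1", "emailofowner2"],
        ["emailofowner3"],
    ],
    "Column 3": ["John Doe", "John Doe", "Jane Roe"],
})

# A list cannot be hashed, which is why it cannot serve as a group key.
try:
    hash(df.loc[0, "Column 2"])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# Hashable grouping key: one tuple per row, aligned with df's index.
key = df["Column 2"].apply(tuple)

out = (
    df.groupby(key, sort=False)   # sort=False keeps groups in first-seen order
      .agg({"Column 1": list, "Column 2": "first", "Column 3": "first"})
      .reset_index(drop=True)     # discard the tuple keys used for grouping
)
print(out)
```

Because the tuples are only used as the key, 'Column 2' in the result still holds the original lists. If the order of the emails inside a list should not matter, one option would be to build the key from sorted tuples instead, for example df["Column 2"].apply(lambda x: tuple(sorted(x))).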