I have the following df:
ID Foods
0 100 Apple
1 101 Apple | Orange | Grape
2 102 Apples & Peanut Butter | Peanuts | Coffee
3 103 Grapes and Apples | Melon | Butter
4 104 Milk, Cookies | Cake | Cupcake
I want to be organize every cell by alphabetical, separated by |
so the result would be:
ID Foods
0 100 Apple
1 101 Apple | Grape | Orange
2 102 Apples & Peanut Butter | Coffee | Peanuts
3 103 Butter | Grapes and Apples | Melon
4 104 Cake | Cupcake | Milk, Cookies
Sample df:
df = {
'ID': {0: '100', 1: '101', 2: '102', 3: '103', 4: '104'},
'Foods': {0: 'Apple',
1: 'Apple | Orange | Grape',
2: 'Apples & Peanut Butter | Peanuts | Coffee ',
3: 'Grapes and Apples | Melon | Butter',
4: 'Milk, Cookies | Cake | Cupcake'}}
pd.DataFrame.from_dict(x)
CodePudding user response:
Just use .str.split
to split by |
( and whitespace, left or right), call sorted
(which returns a new sorted list) on each split list, then join them back together:
df['Foods'] = df['Foods'].str.split('\s*\|\s*').apply(sorted).str.join(' | ')
Output:
>>> df
ID Foods
0 100 Apple
1 101 Apple | Grape | Orange
2 102 Apples & Peanut Butter | Coffee | Peanuts
3 103 Butter | Grapes and Apples | Melon
4 104 Cake | Cupcake | Milk, Cookies
CodePudding user response:
Split the foods columns into a list
df['Foods'] = df.Foods.str.split('\s*|\s*')
Explode the list, sort the values and bring it back to the old format
df
.explode('Foods')
.sort_values(by=['Foods'], ascending=True)
.groupby('ID')
.agg(' | '.join)