Hi, I have a pandas DataFrame with multiple columns. One of them is an object-type column whose values are separated by ','. I want to sort the values inside each cell of that column in alphabetical order.
The DataFrame I have:
ID things
1 pen,car,robot
2 lamp,jug,phone
3 switch,pen,book
The dataframe I want:
ID things
1 car,pen,robot
2 jug,lamp,phone
3 book,pen,switch
Thanks in advance.
CodePudding user response:
Split, sort and then join:
df['things'] = df.things.apply(lambda x: ','.join(sorted(x.split(','))))
df
ID things
0 1 car,pen,robot
1 2 jug,lamp,phone
2 3 book,pen,switch
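For anyone who wants to reproduce this from scratch, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample data in the question, and `sort_items` is a hypothetical helper name introduced only to make the lambda easier to read.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "things": ["pen,car,robot", "lamp,jug,phone", "switch,pen,book"],
})

def sort_items(cell, sep=","):
    """Sort the sep-delimited items of one cell (hypothetical helper)."""
    # split -> sort -> join, exactly the three steps named in the answer
    return sep.join(sorted(cell.split(sep)))

df["things"] = df["things"].apply(sort_items)
print(df)
```

Note that `sorted` compares strings case-sensitively, so uppercase items sort before lowercase ones. If that matters for your data, passing `key=str.lower` to `sorted` is one way to get a case-insensitive order.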
A small benchmark against the .str method:
from timeit import timeit
import numpy as np
timeit("df.things.apply(lambda x: ','.join(sorted(x.split(','))))", number=10000, globals=globals())
4.744999999995343
timeit("df['things'].str.split(',').map(np.sort).str.join(',')", number=10000, globals=globals())
14.570999999996275
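Those timings were presumably taken on the tiny three-row frame, so they mostly measure per-call overhead. Below is a rough sketch for repeating the comparison on a larger frame. The row count and the number of runs are arbitrary choices, `with_apply` and `with_str` are hypothetical wrapper names, and the numbers you get will depend on your machine and pandas version.

```python
from timeit import timeit

import numpy as np
import pandas as pd

small = pd.DataFrame(
    {"things": ["pen,car,robot", "lamp,jug,phone", "switch,pen,book"]}
)
# Repeat the sample rows to get a larger frame (3,000 rows, arbitrary size).
big = pd.concat([small] * 1_000, ignore_index=True)

def with_apply(frame):
    # Plain Python split/sort/join applied per cell.
    return frame["things"].apply(lambda x: ",".join(sorted(x.split(","))))

def with_str(frame):
    # .str accessor for split/join, numpy.sort for the sorting step.
    return frame["things"].str.split(",").map(np.sort).str.join(",")

# Both approaches should produce the same strings before we time them.
assert with_apply(big).tolist() == with_str(big).tolist()

for name, func in [("apply", with_apply), ("str", with_str)]:
    seconds = timeit(lambda: func(big), number=5)
    print(f"{name}: {seconds:.3f} s for 5 runs")
```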
CodePudding user response:
You can split with str.split(), sort with .map() and numpy.sort(), and join the pieces back together with str.join(), as follows:
df['things'] = df['things'].str.split(',').map(np.sort).str.join(',')
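This uses only the pandas .str accessor and numpy.sort, with no explicit lambda. Keep in mind that .str methods on object columns still process the elements one by one, so this is not guaranteed to be faster than apply.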
Result:
print(df)
ID things
0 1 car,pen,robot
1 2 jug,lamp,phone
2 3 book,pen,switch
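Real data is often messier than the sample in the question. As a hedged sketch, assuming some cells may be missing and some items may carry stray spaces (neither occurs in the original data), a small helper can cover both cases. `sort_items` is a hypothetical name, and stripping whitespace is an assumption about what you would want.

```python
import pandas as pd

# Made-up variant of the question's data with a missing cell and extra spaces.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "things": ["pen, car ,robot", None, "switch,pen,book"],
})

def sort_items(cell, sep=","):
    """Sort the items of one cell; pass missing values through (hypothetical helper)."""
    if not isinstance(cell, str):
        return cell  # leave None/NaN untouched
    # Strip surrounding whitespace so " car " sorts the same as "car".
    items = [item.strip() for item in cell.split(sep)]
    return sep.join(sorted(items))

df["things"] = df["things"].map(sort_items)
print(df)
```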