Edited for clarity:
I have a dataframe in the following format:
i col1 col2 col3
0 00:00:00,1 10 1.7
1 00:00:00,2 10 1.5
2 00:00:00,3 50 4.6
3 00:00:00,4 30 3.4
4 00:00:00,5 20 5.6
5 00:00:00,6 50 1.8
6 00:00:00,9 20 1.9
...
that I'm trying to sort like this:
i col1 col2 col3
0 00:00:00,1 10 1.7
1 00:00:00,2 10 1.5
4 00:00:00,5 20 5.6
6 00:00:00,9 20 1.9
3 00:00:00,4 30 3.4
2 00:00:00,3 50 4.6
5 00:00:00,6 50 1.8
...
I've tried df = df.sort_values(by=['col1', 'col2'])
which only seems to sort by col1.
I understand that it may have something to do with the values being 'strings', but I can't seem to find a workaround for it.
CodePudding user response:
If you need to sort each column independently, use Series.sort_values in DataFrame.apply:
c = ['col1','col2']
df[c] = df[c].apply(lambda x: x.sort_values().to_numpy())
# alternative: return lists instead of NumPy arrays
df[c] = df[c].apply(lambda x: x.sort_values().tolist())
print(df)
i col1 col2
0 0 00:00:00,1 10
1 1 00:00:01,5 20
2 2 00:00:10,0 30
3 3 00:01:00,1 40
4 5 01:00:00,0 50
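One caveat worth illustrating: sorting each column independently breaks the pairing between values that were on the same row. The sketch below uses a small made-up frame (the values are only for illustration, not the asker's full data) and applies the same pattern as above.

```python
import pandas as pd

# Hypothetical three-row sample, chosen only to illustrate the effect.
df = pd.DataFrame({
    "col1": ["00:00:00,1", "00:00:00,2", "00:00:00,3"],
    "col2": [50, 10, 30],
})

c = ["col1", "col2"]
independent = df.copy()
# Each column is sorted on its own, then written back positionally.
independent[c] = independent[c].apply(lambda x: x.sort_values().to_numpy())

print(df)           # "00:00:00,1" is on the same row as 50
print(independent)  # "00:00:00,1" is now on the same row as 10
```

If the rows have to stay intact, a multi-column sort_values on the whole frame is probably the better fit.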
CodePudding user response:
df = df.sort_values(by=['col2', 'col1'])
gave the desired result.
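For reference, here is a minimal runnable sketch of that sort using the sample rows from the question. It assumes col1 stays as plain strings and col2 is numeric; the name `result` is just for illustration.

```python
import pandas as pd

# Sample rows copied from the question; col1 is kept as strings.
df = pd.DataFrame({
    "col1": ["00:00:00,1", "00:00:00,2", "00:00:00,3", "00:00:00,4",
             "00:00:00,5", "00:00:00,6", "00:00:00,9"],
    "col2": [10, 10, 50, 30, 20, 50, 20],
    "col3": [1.7, 1.5, 4.6, 3.4, 5.6, 1.8, 1.9],
})

# col2 is the primary key; col1 only breaks ties between equal col2 values.
result = df.sort_values(by=["col2", "col1"])
print(result)
```

The order of the names in `by` is what matters: with ['col1', 'col2'] every col1 value is unique, so col2 never gets a chance to break a tie, which is why the original attempt looked like it only sorted on col1.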