Is there a way to sort values without having to specify all values in the list? I just want to, for example, sort by economics and then library. The remaining rows should be the same order as the original df.
order = ["economics","library"]
df
cat
0 library
1 economics
2 science
3 np.NaN
Expected Output:
1 economics
13 economics
0 library
...
Attempt (order.index raises ValueError for any value that is not in order):
df.sort_values("cat", key=lambda column: column.map(lambda x: order.index(x)))
CodePudding user response:
Example
import pandas as pd
data = ['library', 'economics', 'science', 'aaa', 'bbb']
df = pd.DataFrame(data, columns=['cat'])
df
cat
0 library
1 economics
2 science
3 aaa
4 bbb
Code
order = ["economics","library"]
out = df.sort_values('cat', key=lambda x: pd.Categorical(x, categories=order, ordered=True))
out
cat
1 economics
0 library
2 science
3 aaa
4 bbb
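A minimal sketch of the same idea on data closer to the question, with a repeated value and a NaN. It assumes a pandas version where sort_values accepts key and kind="stable"; the sample frame is made up for illustration. Values outside order become missing in the Categorical, so they sort last, and kind="stable" is added so that equal keys (the two economics rows) keep their original relative order.

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the original post): a duplicate and a NaN.
df = pd.DataFrame({"cat": ["library", "economics", "science", np.nan, "economics"]})
order = ["economics", "library"]

# Values not listed in `order` become missing in the Categorical,
# so they are placed after the listed categories.
out = df.sort_values(
    "cat",
    key=lambda s: pd.Categorical(s, categories=order, ordered=True),
    kind="stable",  # keep the original order among rows with equal keys
)
print(out.index.tolist())
```

The original cat values are untouched; only the sort key is categorical.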
CodePudding user response:
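You can build a dictionary with enumerate that maps each value in order to its position, then use it to map the cat column. Values missing from the dictionary become NaN, which sort_values places last: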
print (df)
cat
0 library
1 economics
2 science
3 new
4 economics
order = ["economics","library"]
print ({v:k for k, v in enumerate(order)})
{'economics': 0, 'library': 1}
df = df.sort_values("cat", key=lambda x: x.map({v:k for k, v in enumerate(order)}))
print (df)
cat
1 economics
4 economics
0 library
2 science
3 new
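If this is needed in several places, the mapping approach can be wrapped in a small helper. This is only a sketch: sort_by_partial_order is a hypothetical name, not a pandas function, and kind="stable" is an addition to keep ties in their original order.

```python
import pandas as pd

def sort_by_partial_order(df, column, order):
    """Hypothetical helper: rows whose value is in `order` come first,
    in that order; all other rows follow."""
    rank = {value: position for position, value in enumerate(order)}
    # Unlisted values map to NaN, which sort_values puts last by default.
    return df.sort_values(column, key=lambda s: s.map(rank), kind="stable")

df = pd.DataFrame({"cat": ["library", "economics", "science", "new", "economics"]})
result = sort_by_partial_order(df, "cat", ["economics", "library"])
print(result)
```

The helper returns a new DataFrame and leaves df as it was.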