I have a DataFrame:
COL1 COL2
1 1
3 1
1 3
I need to sort by COL1 COL2.
key=lambda col: f(col) argument-function of sort_values(...) lets you sort by a changed column but in the described case I need to sort on the basis of 2 columns. So, it would be nice if there were an opportunity to provide a key argument-function for 2 or more columns but I don't know whether such a one exists.
So, how can I sort its rows by sum COL1 COL2?
Thank you for your time!
CodePudding user response:
This does the trick:
data = {"Column 1": [1, 3, 1], "Column 2": [1, 2, 3]}
df = pd.DataFrame(data)
sorted_indices = (df["Column 1"] df["Column 2"]).sort_values().index
df.loc[sorted_indices, :]
I just created a series that has the sum of both the columns, sorted it, got the sorted indices, and printed those indices out for the dataframe.
(I changed the data a little so you can see the sorting in action. Using the data you provided, you wouldn't have been able to see the sorted data as it would have been the same as the original one.)
CodePudding user response:
Assuming a unique index, you can also conveniently use the key
parameter of sort_values
to pass a callable to apply to the by
column. Here we can add
the other column:
df.sort_values(by='COL1', key=df['COL2'].add)
We can even generalize to any number of columns using sort_index
:
df.sort_index(key=df.sum(1).get)
Output:
COL1 COL2
0 1 1
2 1 3
1 3 2
Used input:
data = {"COL1": [1, 3, 1], "COL2": [1, 2, 3]}
df = pd.DataFrame(data)