Sort pandas DataFrame by linear combination of two columns without creating new column

Time:10-12

Is there an easy way to sort a DataFrame based on a linear combination of two columns without creating a new column for that value? Given

df = pd.DataFrame([[4,1],[2,3]], columns=list('AB'))
   A  B
0  4  1
1  2  3

I would like to sort df by a given combination of columns A and B (e.g. the product A*B). Calling sort_values with a key function does not work, because the function is applied to each column in by individually and only ever receives one Series at a time. Ideally, I would do something like:

df.sort_values(by=['A','B'], key=lambda a,b: a*b) # does not work

Right now I am creating an extra column named sort, like this, and I am wondering whether that is necessary:

df['sort'] = df['A']*df['B']
df.sort_values(['sort'])

Thanks in advance.

CodePudding user response:

Use DataFrame.sort_index and pass the .get method of the multiplied Series as the key, so each index label is mapped to its A*B value:

df1 = df.sort_index(key=(df.A*df.B).get)
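A minimal self-contained sketch of the sort_index approach. The three-row frame is made-up sample data (the frame from the question is already in A*B order), and it assumes the index labels are unique, since the key looks each label up in the product Series:

```python
import pandas as pd

# Hypothetical sample data: products are 4, 6 and 2.
df = pd.DataFrame([[4, 1], [2, 3], [1, 2]], columns=list("AB"))

# Product of the two columns, indexed like df.
product = df.A * df.B

# sort_index hands the index to `key`; product.get maps every label
# to its product, so rows are ordered by A*B and df gets no new column.
df1 = df.sort_index(key=product.get)

print(df1)  # rows come out in index order 2, 0, 1
```

With duplicate index labels the lookup would return more values than there are rows, so the positional argsort variant is likely the safer choice in that case.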

Or Series.argsort with DataFrame.iloc:

df1 = df.iloc[(df.A*df.B).argsort()]
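The same idea should carry over to an actual linear combination. The sketch below is only an illustration: the weights w_a and w_b and the three-row frame are assumptions, not part of the question. It uses argsort positions with iloc, so it does not depend on the index labels:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame([[4, 1], [2, 3], [1, 2]], columns=list("AB"))

# Hypothetical weights for the linear combination w_a*A + w_b*B.
w_a, w_b = 2, -1

# Temporary Series holding the scores 7, 1, 0; it is never added to df.
score = w_a * df["A"] + w_b * df["B"]

# argsort returns row positions; kind="stable" keeps tied rows
# in their original relative order.
order = score.argsort(kind="stable").to_numpy()

ascending = df.iloc[order]         # index order 2, 1, 0
descending = df.iloc[order[::-1]]  # index order 0, 1, 2
```

Reversing the positions is a quick way to get descending order, though it also reverses the order of any tied rows.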