How to sort ascending row-wise in Pandas Dataframe

Time:04-29

This may sound silly, but I just can't seem to figure it out. I have a Pandas dataframe like this:

      N1  N2  N3  N4  N5
0     48  20  45  21  12
1     32  16  29  41  36
2     41  42  34  13   9
3     39  37   4   7  33
4     32   3   1  39  21
...   ..  ..  ..  ..  ..
1313   1   5  27  36  42
1314  18  20  35  38  48
1315  12  34  37  38  42
1316  18  23  37  41  42
1317   2  10  18  34  35

and I want to sort each row so that its values are re-arranged in order (largest first in the example below). I don't want the column labels to change, i.e. it should look like this:

    N1  N2  N3  N4  N5
0   48  45  21  20  12
1   41  36  32  29  16
2   42  41  34  13   9

I've tried a for loop with iloc, running through the index one row at a time and applying sort_values, but it doesn't work. Any help?

CodePudding user response:

You can sort each row with numpy.sort, reverse the column order with [:, ::-1] to get descending order, and pass the result to the DataFrame constructor. This was the fastest of the options timed below, if performance is important:

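import numpy as np
import pandas as pd

# sort each row ascending, then reverse the columns for descending order
df = pd.DataFrame(np.sort(df, axis=1)[:, ::-1],
                  columns=df.columns,
                  index=df.index)
print(df)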
      N1  N2  N3  N4  N5
0     48  45  21  20  12
1     41  36  32  29  16
2     42  41  34  13   9
3     39  37  33   7   4
4     39  32  21   3   1
1313  42  36  27   5   1
1314  48  38  35  20  18
1315  42  38  37  34  12
1316  42  41  37  23  18
1317  35  34  18  10   2
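
For reference, here is a minimal self-contained sketch of the same approach on the first five rows from the question. The names sorted_values, asc and desc are illustrative and not part of the original answer. Dropping the [:, ::-1] reversal gives the ascending order mentioned in the question title.

```python
import numpy as np
import pandas as pd

# First five rows of the sample data from the question
df = pd.DataFrame(
    [[48, 20, 45, 21, 12],
     [32, 16, 29, 41, 36],
     [41, 42, 34, 13, 9],
     [39, 37, 4, 7, 33],
     [32, 3, 1, 39, 21]],
    columns=["N1", "N2", "N3", "N4", "N5"],
)

# np.sort with axis=1 sorts the values within each row, ascending
sorted_values = np.sort(df.to_numpy(), axis=1)

# Ascending: the smallest value of each row ends up in N1
asc = pd.DataFrame(sorted_values, columns=df.columns, index=df.index)

# Descending: reverse the column order of the sorted array
desc = pd.DataFrame(sorted_values[:, ::-1], columns=df.columns, index=df.index)

print(asc)
print(desc)
```

Because only the values are rearranged, the column labels N1 to N5 and the row index stay as they were.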

Assigning the result back into the existing DataFrame is slightly slower:

df[:] = np.sort(df, axis=1)[:, ::-1]

Performance:

#10k rows
df = pd.concat([df] * 1000, ignore_index=True)

#Ynjxsjmh sol
In [200]: %timeit df.apply(lambda row: list(reversed(sorted(row))), axis=1, result_type='expand')
595 ms ± 19.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

#Andrej Kesely sol1
In [201]: %timeit df[:] = np.fliplr(np.sort(df, axis=1))
559 µs ± 38.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

#Andrej Kesely sol2
In [202]: %timeit df.loc[:, ::-1] = np.sort(df, axis=1)
518 µs ± 11 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

#jezrael sol2
In [203]: %timeit df[:] = np.sort(df, axis=1)[:, ::-1]
491 µs ± 15.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

#jezrael sol1
In [204]: %timeit pd.DataFrame(np.sort(df, axis=1)[:, ::-1], columns=df.columns, index=df.index)
399 µs ± 2.31 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
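
The %timeit lines above need IPython. A rough way to repeat part of the comparison in a plain script is sketched here, assuming random integers of a similar shape stand in for the real data. The helper names via_apply, via_constructor and via_assign are hypothetical, and absolute timings will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

# Assumption: random integers stand in for the real data (10k rows, 5 columns)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(1, 50, size=(10_000, 5)),
                  columns=["N1", "N2", "N3", "N4", "N5"])

def via_apply(frame):
    # Row-by-row Python sort; 'expand' resets the column labels to 0..4
    return frame.apply(lambda row: sorted(row, reverse=True),
                       axis=1, result_type="expand")

def via_constructor(frame):
    # Vectorised sort, then build a new DataFrame with the original labels
    return pd.DataFrame(np.sort(frame, axis=1)[:, ::-1],
                        columns=frame.columns, index=frame.index)

def via_assign(frame):
    # Vectorised sort written into a copy (the copy is included in the timing)
    out = frame.copy()
    out[:] = np.sort(frame, axis=1)[:, ::-1]
    return out

for func, number in [(via_apply, 1), (via_constructor, 100), (via_assign, 100)]:
    seconds = timeit.timeit(lambda: func(df), number=number) / number
    print(f"{func.__name__}: {seconds * 1e3:.3f} ms per call")
```

The point of the comparison is the order of magnitude: the apply version calls Python's sorted once per row, while the NumPy versions sort the whole array in one call.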

CodePudding user response:

You can try apply on each row (axis=1) with result_type='expand' or result_type='broadcast':

df = df.apply(lambda row: list(reversed(sorted(row))), axis=1, result_type='expand')
print(df)

    0   1   2   3   4
0  48  45  21  20  12
1  41  36  32  29  16
2  42  41  34  13   9
3  39  37  33   7   4
4  39  32  21   3   1
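
As the printed result shows, result_type='expand' replaces the column labels with 0 to 4, which is not what the question asks for. Here is a small sketch of two ways to keep N1 to N5, assuming the first three sample rows from the question. The names sort_desc, kept and expanded are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [[48, 20, 45, 21, 12],
     [32, 16, 29, 41, 36],
     [41, 42, 34, 13, 9]],
    columns=["N1", "N2", "N3", "N4", "N5"],
)

def sort_desc(row):
    # Plain Python sort of one row, largest value first
    return sorted(row, reverse=True)

# Option 1: 'broadcast' keeps the original index and column labels
kept = df.apply(sort_desc, axis=1, result_type="broadcast")

# Option 2: 'expand' yields labels 0..4, so restore them afterwards
expanded = df.apply(sort_desc, axis=1, result_type="expand")
expanded.columns = df.columns

print(kept)
print(expanded)
```

Either way this still runs a Python-level sort per row, so it is likely to be much slower than the NumPy-based answers on large frames.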

CodePudding user response:

Try np.sort:

df[:] = np.fliplr(np.sort(df, axis=1))
print(df)

Prints:

   N1  N2  N3  N4  N5
0  48  45  21  20  12
1  41  36  32  29  16
2  42  41  34  13   9
3  39  37  33   7   4
4  39  32  21   3   1

Or:

df.loc[:, ::-1] = np.sort(df, axis=1)
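
The second variant writes the ascending result into the columns taken in reverse order, which produces the same descending layout as flipping the sorted array. A minimal NumPy-only sketch of that idea, with illustrative variable names:

```python
import numpy as np

a = np.array([[48, 20, 45, 21, 12],
              [32, 16, 29, 41, 36]])

# Sort the values within each row, ascending
ascending = np.sort(a, axis=1)

# Variant 1: flip the sorted array left-to-right
flipped = np.fliplr(ascending)

# Variant 2: assign the ascending values into a reversed view of the columns,
# so the smallest value of each row lands in the last column
out = np.empty_like(a)
out[:, ::-1] = ascending

print(flipped)
print(out)
```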