Home > other >  Sorting Dataframe Columns By Alphabetical & Numerical Order
Sorting Dataframe Columns By Alphabetical & Numerical Order

Time:10-21

I would like to re-index the columns of a dataframe by sorting them by both alphabetical order and numerical order.

In my dataframe df, my columns are named:

Round 1 Position, Round 2 Position, Round 1 Score, Round 2 Score, Round 1 Price, Round 2 Price, ..., Round 10 Position, Round 10 Score, Round 10 Price

I would like them to be ordered as follows:

Round 1 Position, Round 1 Price, Round 1 Score, Round 2 Position, Round 2 Price, Round 2 Score, Round 3 Position, Round 3 Price, Round 3 Score, ..., Round 10 Position, Round 10 Price, Round 10 Score

I have tried the following:

df.reindex(sorted(df.columns, key=lambda x: float(x[1:])), axis=1) 

but had no luck as the column name was a string.

Thank you for taking the time to read through my question and help me. Much appreciated!

CodePudding user response:

Once you sort the column names into their natural order, you can use [] indexing to re-order the columns.

import pandas as pd
import natsort

df = pd.DataFrame({'A2': [1, 2, 3],
               'B1': [4, 5, 6],
               'A1': [7, 8, 9]})

# sort the column names
col_names = df.columns
sorted_col_names = natsort.natsorted(col_names)

# re-order the columns
df = df[sorted_col_names]
print(df)

Output:

   A1  A2  B1
0   7   1   4
1   8   2   5
2   9   3   6

There are several options for sorting alpha-numeric values in the answers to How to sort alpha numeric set in python. I tested the natsort library, and it works with your input.

columns = ['Round 1 Position', 'Round 2 Position', 'Round 1 Score', 'Round 2 Score', 'Round 1 Price',
           'Round 2 Price', 'Round 10 Position', 'Round 10 Score', 'Round 10 Price']
columns_s = natsort.natsorted(columns)
print(columns_s)

Output:

['Round 1 Position', 'Round 1 Price', 'Round 1 Score', 'Round 2 Position', 'Round 2 Price', 'Round 2 Score', 'Round 10 Position', 'Round 10 Price', 'Round 10 Score']
  • Related