Pandas sort column names by first character after delimiter

Time:01-04

I want to sort the columns of a DataFrame by the first letter after the delimiter '_'.

df.columns = ['apple_B','cat_A','dog_C','car_D']

df.columns.sort_values(by=df.columns.str.split('-')[1])
TypeError: sort_values() got an unexpected keyword argument 'by'

df.sort_index(axis=1, key=lambda s: s.str.split('-')[1])
ValueError: User-provided `key` function must not change the shape of the array.

Desired columns would be:

'cat_A','apple_B','dog_C','car_D'

Many thanks!

Edit: in my case the frame has the same labels on the index and the columns, so I needed to sort the index labels and then reorder the columns to match:

sorted_index = sorted(df.index, key=lambda s: s.split('_')[1])
# reorder index
df = df.loc[sorted_index]
# reorder columns
df = df[sorted_index]
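
For reference, here is a minimal, self-contained sketch of that reordering. It assumes a square frame whose index and columns carry the same labels; the numeric values are made up for illustration, and `df.loc[order, order]` is just one way to reorder both axes in a single step.

```python
import pandas as pd

labels = ['apple_B', 'cat_A', 'dog_C', 'car_D']

# Hypothetical square frame: the cell at row r, column c holds r * 10 + c.
df = pd.DataFrame(
    [[r * 10 + c for c in range(4)] for r in range(4)],
    index=labels,
    columns=labels,
)

# Sort the labels by the part after the underscore.
order = sorted(df.index, key=lambda s: s.split('_')[1])

# Reorder rows and columns together with the same label order.
df_sorted = df.loc[order, order]

print(list(df_sorted.index))
print(list(df_sorted.columns))
```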

CodePudding user response:

Use sort_index with the extracted part of the string as key:

df.sort_index(axis=1, key=lambda s: s.str.extract(r'_(\w+)', expand=False))

Output columns:

[cat_A, apple_B, dog_C, car_D]
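
A runnable sketch of this approach, assuming a pandas version that supports the `key` argument of `sort_index` (1.1 or later, as far as I know). The one-row frame is made up for illustration. The point is that `key` receives the whole column Index and has to return something of the same length, which is why indexing with `[1]` in the question fails.

```python
import pandas as pd

# Hypothetical one-row frame; only the column labels matter here.
df = pd.DataFrame([[1, 2, 3, 4]],
                  columns=['apple_B', 'cat_A', 'dog_C', 'car_D'])

# The key is called once with the full Index and must return an
# equally long array-like. Here it is the text after the underscore.
result = df.sort_index(
    axis=1,
    key=lambda s: s.str.extract(r'_(\w+)', expand=False),
)

print(list(result.columns))
```

Note that `sort_index` returns a new DataFrame, so assign the result if you want to keep it.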

CodePudding user response:

You can do:

df.columns = ['apple_B','cat_A','dog_C','car_D']

new_cols = sorted(df.columns, key=lambda s: s.split('_')[1])
df = df[new_cols]
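
A self-contained version of the same idea, as a sketch. With the built-in `sorted`, the key function is called once per label with a plain Python string, so ordinary string methods apply and the `.str` accessor does not. The `suffix` helper is a hypothetical name, and using `partition` is my own choice so that labels without an underscore do not raise an error.

```python
import pandas as pd

# Hypothetical one-row frame; only the column labels matter here.
df = pd.DataFrame([[1, 2, 3, 4]],
                  columns=['apple_B', 'cat_A', 'dog_C', 'car_D'])

def suffix(label):
    # Hypothetical helper: text after the first '_' ('' if there is none).
    return label.partition('_')[2]

# sorted() calls suffix() on each label, which is a plain str.
new_cols = sorted(df.columns, key=suffix)
df = df[new_cols]

print(list(df.columns))
```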