I want to sort the columns in a df based on the first letter after the delimiter '-'
df.columns = ['apple_B','cat_A','dog_C','car_D']
df.columns.sort_values(by=df.columns.str.split('-')[1])
TypeError: sort_values() got an unexpected keyword argument 'by'
df.sort_index(axis=1, key=lambda s: s.str.split('-')[1])
ValueError: User-provided `key` function must not change the shape of the array.
Desired columns would be:
'cat_A','apple_B','dog_C','car_D'
Many thanks!
I needed to sort the index names and then rename the columns accordingly:
sorted_index = sorted(df.index, key=lambda s: s.split('_')[1])
# reorder index
df = df.loc[sorted_index]
# reorder columns
df = df[sorted_index]
CodePudding user response:
Use sort_index
with the extracted part of the string as key
:
df.sort_index(axis=1, key=lambda s: s.str.extract('_(\w )', expand=False))
Output columns:
[cat_A, apple_B, dog_C, car_D]
CodePudding user response:
You can do:
df.columns = ['apple_B','cat_A','dog_C','car_D']
new_cols = sorted(df.columns, key=lambda s: s.str.split('-')[1])
df = df[new_cols]