I have a messy table, shown below, whose cells contain a mix of integers and words/characters. I want to sort the table from Info 5 down to Info 1.
Fruit | Info 1 | Info 2 | Info 3 | Info 4 | Info 5 |
---|---|---|---|---|---|
Apple | q | z | 2 | | |
Grape | w | 4 | | | |
Guava | e | 7 | u | | |
Kiwi | r | n | s | m | |
I would like to get the result table below.
Fruit | Info 1 | Info 2 | Info 3 | Info 4 | Info 5 |
---|---|---|---|---|---|
Kiwi | r | n | s | m | |
Guava | e | 7 | u | | |
Apple | q | z | 2 | | |
Grape | w | 4 | | | |
I tried using str.contains, but it could not detect the integers.
i = ['Fruit', 'Info 1', 'Info 2', 'Info 3', 'Info 4', 'Info 5']
data[data[i].str.contains('', na=False)]
CodePudding user response:
My solution is to create 5 new columns that map the values of the 5 original columns to numbers, which makes sorting easier. The key function is map; there are equivalents such as replace.
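As a rough illustration of how the two differ (the Series values and the `lookup` dictionary here are made up for the demo, not taken from the table): `map` turns every value that is missing from the dictionary into NaN, while `replace` leaves unmatched values as they are.

```python
import pandas as pd

# Hypothetical demo values: two letters and one integer
s = pd.Series(['a', 'z', 4])
lookup = {'a': 1}  # deliberately incomplete mapping

mapped = s.map(lookup)        # values without a key become NaN
replaced = s.replace(lookup)  # values without a key are kept unchanged

print(mapped.tolist())
print(replaced.tolist())
```

For sorting, the NaN behaviour of `map` is convenient, because empty or unmapped cells can then be pushed to the end with `na_position`.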
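import string
# `df` is the table from the question (called `data` there).
# Step 1/3: build a dictionary that ranks every character as a number.
# Digits 0-9 get ranks 1-10 and a-z get 11-36, so that 'u' sorts ahead of 2 as in the expected table.
convert_dict = {ch: rank for rank, ch in enumerate(string.digits + string.ascii_lowercase, start=1)}
# Step 2/3: iterate over the 5 columns and create 5 `new {i}` columns that can be used for sorting.
# astype(str) lets an integer cell such as 4 match the key '4'; empty cells are not in the dictionary and become NaN.
for i in range(1, 6):
    df[f'new {i}'] = df[f'Info {i}'].astype(str).str.lower().map(convert_dict)
# Step 3/3: sort from Info 5 down to Info 1, highest rank first, empty cells last.
sort_cols = [f'new {i}' for i in range(5, 0, -1)]
df.sort_values(by=sort_cols, ascending=False, na_position='last', ignore_index=True, inplace=True)
df.drop(columns=sort_cols, inplace=True)  # remove the helper columns if you no longer need them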
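A possible alternative, sketched here on a hand-built copy of the sample table, skips the dictionary and lets `sort_values` compare the cells as strings through its `key` argument. This is only a sketch under two assumptions: empty cells are NaN, and plain string ordering (digits before letters) is the order you want. The names `info_cols` and `result` are made up for the example.

```python
import numpy as np
import pandas as pd

# Sample table from the question; empty cells are assumed to be NaN
df = pd.DataFrame({
    'Fruit':  ['Apple', 'Grape', 'Guava', 'Kiwi'],
    'Info 1': ['q', 'w', 'e', 'r'],
    'Info 2': ['z', 4, 7, 'n'],
    'Info 3': [2, np.nan, 'u', 's'],
    'Info 4': [np.nan, np.nan, np.nan, 'm'],
    'Info 5': [np.nan, np.nan, np.nan, np.nan],
})

info_cols = [f'Info {i}' for i in range(1, 6)]

# Sort by Info 5 first, then Info 4, ... down to Info 1.
# The key turns every non-empty cell into a string so that integers and
# letters can be compared; na_action='ignore' keeps empty cells as NaN.
result = df.sort_values(
    by=info_cols[::-1],
    key=lambda col: col.map(str, na_action='ignore'),
    ascending=False,
    na_position='last',
    ignore_index=True,
)

print(result)
```

Without the `key`, a column that mixes integers and strings cannot be compared directly, which is why the cells are converted to strings before sorting.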