The table that i have is:
ID Order_ID Period
1234 2305333 Monthly
1234 4148915 Annual
1234 3136815 Monthly
1234 5309581 Monthly
9876 W~100040 Annual
9876 W~132759 Annual
9876 W~30094 Annual
9876 W~339658 Annual
9876 W~58943 Annual
We want this sorted on the basis of ID and Order such that the output looks like:
ID Order_ID Period
1234 2305333 Monthly
1234 3136815 Monthly
1234 4148915 Annual
1234 5309581 Monthly
9876 W~30094 Annual
9876 W~58943 Annual
9876 W~100040 Annual
9876 W~132759 Annual
9876 W~339658 Annual
What we essentially want is to be able to sort the Order_ID based on numerics even though we have both alphabets and numerics and we want to retain the column as it is.
We have tried using the code:
df.loc[pd.to_numeric(df.Order_ID, errors='coerce').sort_values().index]
but are not getting the desired result.
Your help will be highly appreciated.
CodePudding user response:
Use regex to extract the numeric part of the Order ID:
result = (
df.assign(order_no=lambda x: x["Order_ID"].str.extract(".*?(\d )").astype("int"))
.sort_values(["ID", "order_no"])
.drop(columns="order_no")
)