I have the following data frame:
https://i.stack.imgur.com/b5CMd.png (44 rows)
I tried sorting it with CategoricalIndex(), but found that this only seems to work if there are no repeated month values. Does anyone know how to sort it chronologically?
CodePudding user response:
Hope it helps
# Add new column in format YYYYMM
df['year_month'] = df['year'].astype(str) + pd.to_datetime(df['month'], format='%B').dt.month.astype(str).str.zfill(2)
# sort values
df.sort_values(by=['year_month'], inplace=True)
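For reference, here is a minimal end-to-end sketch of the same idea. The sample rows and the value column are made up (the real frame is only available as an image), and I am assuming year is numeric and month holds full English month names.

```python
import pandas as pd

# Hypothetical sample data; column names "year" and "month" are assumed
df = pd.DataFrame({
    "year": [2021, 2020, 2021, 2020],
    "month": ["January", "December", "March", "February"],
    "value": [10, 20, 30, 40],
})

# Month name -> month number, zero-padded to two digits ("01" .. "12")
month_num = pd.to_datetime(df["month"], format="%B").dt.month.astype(str).str.zfill(2)

# Build the YYYYMM key; astype(str) covers both integer and string years
df["year_month"] = df["year"].astype(str) + month_num

# Zero-padded strings of equal length sort chronologically
df = df.sort_values(by="year_month", ignore_index=True)
print(df)
```

The zero padding matters: without it, "202010" would sort before "20202" when compared as strings.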
CodePudding user response:
You can also try:
df['date'] = df['month'] + ' ' + df['year'].astype(str)
df['date']=pd.to_datetime(df['date'])
df=df.sort_values('date')
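A small self-contained sketch of this approach, again on made-up rows. I pass an explicit format here, which assumes full English month names; drop it if your months are abbreviated or formatted differently.

```python
import pandas as pd

# Hypothetical sample data; column names "year" and "month" are assumed
df = pd.DataFrame({
    "year": [2021, 2020, 2021, 2020],
    "month": ["January", "December", "March", "February"],
})

# "January 2021" -> Timestamp for the first day of that month
df["date"] = pd.to_datetime(
    df["month"] + " " + df["year"].astype(str),
    format="%B %Y",
)

# Sorting on a real datetime column is chronological by construction
df = df.sort_values("date", ignore_index=True)
print(df)
```

A real datetime column is also handy afterwards if you want to plot or filter by date range.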
CodePudding user response:
Always try to post your code; that way we can figure out why your categorical sorting did not work. I suspect you forgot the ordered=True parameter.
Categorical ordering lets you sort according to a custom order, and it works well for this case. It also handles duplicated month values. Here is my code:
df["month"] = pd.Categorical(
    df["month"],
    categories=["January", "February", "March", "April", "May", "June", "July",
                "August", "September", "October", "November", "December"],
    ordered=True,
)
After that, we can call sort_values():
df = df.sort_values(["year", "month"], ignore_index=True)
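To show that repeated months are not a problem, here is a quick sketch on made-up rows where "December" appears twice. The data and the column names are assumptions, since the original frame is only available as an image.

```python
import pandas as pd

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Hypothetical sample data with a duplicated month value
df = pd.DataFrame({
    "year": [2021, 2020, 2021, 2020, 2020],
    "month": ["March", "December", "January", "February", "December"],
})

# Categories define the sort order; duplicates in the data are fine
df["month"] = pd.Categorical(df["month"], categories=MONTHS, ordered=True)

# Sort by year first, then by calendar position of the month
df = df.sort_values(["year", "month"], ignore_index=True)
print(df)
```

Note that any month value not listed in categories becomes NaN, so check the spelling of your month names if rows seem to go missing.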
Cheers and keep it up!