I wrote some code (in PySpark, applied to a pandas DataFrame) to plot a historical series: the quantity trend over the years. I put the month names on the x-axis. How can I order them chronologically rather than alphabetically?
Thanks
plt.figure()
pd_filter = df[df["date"] < pd.to_datetime("2021-12-01")].copy()
pd_new = pd.DataFrame(pd_filter.groupby(["Month","Year"])["quantity"].count()).reset_index()
ax = pd_new.set_index("Month").groupby("Year")["quantity"].plot(legend=True, figsize=(10,5), title = "Quantity per Year")
plt.xlabel("Months")
plt.ylabel("Quantity")
plt.show()
CodePudding user response:
You can map from month string to month number with a dictionary, then use the month number as the index and sort the index. Here is a small example showing this:
import pandas as pd
df = pd.DataFrame(
{"Month": ["Feb", "Jan", "Mar", "May", "Jun", "Jul", "Oct", "Apr", "Aug", "Sep", "Dec", "Nov"],
"Val": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]})
month_map = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6, "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10,
"Nov": 11, "Dec": 12}
df["Month_val"] = df["Month"].map(month_map)
df_new_index = df.set_index("Month_val").sort_index()
df_new_index.plot.line(x="Month", y="Val")
You will probably need to change the strings in the dictionary to the strings that you use for the months.
Output: (line plot of Val with the months in calendar order on the x-axis)
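Another option, sketched here under assumptions, is to turn the Month column into an ordered categorical so that pandas sorts it by calendar position instead of alphabetically. The sample data below is made up to mimic the shape of pd_new from the question (Month, Year, quantity), and the three-letter month abbreviations are an assumption; adjust month_order to whatever strings you actually use.

```python
import pandas as pd

# Hypothetical sample data shaped like pd_new in the question.
pd_new = pd.DataFrame({
    "Month": ["Mar", "Jan", "Feb", "Feb", "Mar", "Jan"],
    "Year": [2019, 2019, 2019, 2020, 2020, 2020],
    "quantity": [30, 10, 20, 25, 35, 15],
})

# Assumed month labels, listed in calendar order.
month_order = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# An ordered categorical sorts by its position in month_order,
# not by the alphabetical order of the strings.
pd_new["Month"] = pd.Categorical(pd_new["Month"], categories=month_order, ordered=True)

# Sort within each year by calendar month.
pd_sorted = pd_new.sort_values(["Year", "Month"]).reset_index(drop=True)
print(pd_sorted)
```

After sorting, the plotting line from the question (set_index("Month") followed by groupby("Year")["quantity"].plot(...)) should draw the months in calendar order, provided every year contains the same set of months.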