Sorting months correctly in a plot

Time:12-16

I wrote some code (in PySpark, applied to a pandas DataFrame) to plot a historical series: the quantity trend over the years. The x-axis shows the month names. How can I order them chronologically rather than alphabetically?

Thanks

plt.figure()
pd_filter = df[df["date"] < pd.to_datetime("2021-12-01")].copy()
pd_new = pd.DataFrame(pd_filter.groupby(["Month","Year"])["quantity"].count()).reset_index()
ax = pd_new.set_index("Month").groupby("Year")["quantity"].plot(legend=True, figsize=(10,5), title = "Quantity per Year")
plt.xlabel("Months")
plt.ylabel("Quantity")
plt.show()

CodePudding user response:

You can map from month string to month number with a dictionary, and then use month number as the index and sort the index. Here is a small example showing this:

import pandas as pd

df = pd.DataFrame(
    {"Month": ["Feb", "Jan", "Mar", "May", "Jun", "Jul", "Oct", "Apr", "Aug", "Sep", "Dec", "Nov"],
     "Val": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]})

month_map = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6, "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10,
             "Nov": 11, "Dec": 12}

df["Month_val"] = df["Month"].map(month_map)
df_new_index = df.set_index("Month_val").sort_index()
df_new_index.plot.line(x="Month", y="Val")

You will probably need to change the strings in the dictionary to the strings that you use for the months.
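
If you would rather not carry an extra numeric column, another option is to make the month column an ordered categorical, so that sorting follows calendar order. This is only a minimal sketch: the sample data is made up, and it assumes your months are stored as three-letter English abbreviations and that your grouped frame has the "Month", "Year" and "quantity" columns from the question.

```python
import pandas as pd

# Made-up sample standing in for the grouped frame from the question
# (assumed columns: "Month", "Year", "quantity").
pd_new = pd.DataFrame({
    "Month": ["Feb", "Jan", "Mar", "Jan", "Mar", "Feb"],
    "Year": [2019, 2019, 2019, 2020, 2020, 2020],
    "quantity": [5, 3, 8, 4, 9, 6],
})

# Calendar order; change these strings to match your own month labels.
month_order = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# An ordered categorical sorts by position in month_order, not alphabetically.
pd_new["Month"] = pd.Categorical(pd_new["Month"], categories=month_order, ordered=True)

# Sort by year first, then by month within each year.
ordered = pd_new.sort_values(["Year", "Month"]).reset_index(drop=True)
print(ordered)
```

From there, reshaping to one column per year and plotting should draw the months from left to right in calendar order.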

