I am trying to produce a stacked bar plot showing some categories. However, the current layout of the dataframe makes it difficult to stack the categories together. Also, some years have no count, and these should be removed. Any ideas on how to produce something like the picture below? I highly appreciate your time.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df=pd.read_csv(r"https://raw.githubusercontent.com/tuyenhavan/Course_Data/main/test_barplot.csv")
fig,ax=plt.subplots(figsize=(15,8))
ax.bar(df.Year,df.Number)
plt.xticks(np.arange(2000,2022),np.arange(2000,2022))
plt.xlabel("Year", fontsize=15)
plt.ylabel("Number", fontsize=15)
plt.show()
CodePudding user response:
You can reshape the data so that the stacked categories become columns, then use pandas plot.bar with stacked=True. reindex adds the missing years.
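fig, ax = plt.subplots(figsize=(15, 8))
# Reshape so that each category becomes a column, indexed by year
df_stack = df.pivot_table(index="Year",
                          columns="Category",
                          values="Number",
                          aggfunc="sum")
# Add the years that are missing from the data as empty rows
df_stack = df_stack.reindex(np.arange(2000, 2022))
df_stack.plot.bar(stacked=True, ax=ax)
plt.xlabel("Year", fontsize=15)
plt.ylabel("Number", fontsize=15)
plt.show()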
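Since the question asks for years without counts to be left out, here is a small sketch of what reindex does and how the empty years could be dropped again. The toy dataframe and the names toy, wide, full and only_counted are made up for illustration and are not taken from the CSV.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: no rows exist for 2001 or 2003
toy = pd.DataFrame({
    "Year": [2000, 2000, 2002],
    "Category": ["A", "B", "A"],
    "Number": [1, 2, 4],
})

# One column per category, one row per year that occurs in the data
wide = toy.pivot_table(index="Year", columns="Category",
                       values="Number", aggfunc="sum")

# reindex inserts all-NaN rows for the years that were missing
full = wide.reindex(np.arange(2000, 2004))

# dropna(how="all") removes the rows where every category is NaN
only_counted = full.dropna(how="all")

print(full.index.tolist())
print(only_counted.index.tolist())
```

Calling only_counted.plot.bar(stacked=True) should then draw bars only for the years that have data. If the empty years are not wanted at all, simply skipping the reindex step probably gives the same result.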
Agriculture shows up twice in the legend because the data contains the label once with and once without a trailing space.
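One way to merge the two labels is to strip the whitespace before pivoting. This is a minimal sketch on made-up data; the toy dataframe is an assumption, only the column names Year, Category and Number come from the code above.

```python
import pandas as pd

# Hypothetical toy data; note the trailing space in the second label
toy = pd.DataFrame({
    "Year": [2000, 2000, 2001],
    "Category": ["Agriculture", "Agriculture ", "Forest"],
    "Number": [3, 2, 5],
})

# Strip leading/trailing whitespace so both spellings end up in one column
toy["Category"] = toy["Category"].str.strip()

stacked = toy.pivot_table(index="Year", columns="Category",
                          values="Number", aggfunc="sum")
print(stacked.columns.tolist())
```

On the real data, the same line applied to df["Category"] before the pivot_table call should leave a single Agriculture entry in the legend.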