I have a dataframe that I want to group and then plot as a bar plot based on value_counts.
import pandas as pd
df = pd.DataFrame({'date': ['jan', 'jan', 'jan', 'jan', 'feb', 'feb', 'feb', 'feb'],
                   'value': ['low', 'low', 'high', 'medium', 'medium', 'high', 'high', 'medium']})
print(df)
I want to have months on the X-axis (in this case only jan and feb) and, for each month, one bar per value category. The height of each bar should depend on the count of that category.
For example, for 'jan' low=2, high=1, medium=1
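The expected counts can be checked with pandas before plotting. The following is a minimal sketch, assuming pd.crosstab is an acceptable stand-in for a grouped value_counts; the name counts and the explicit label order are illustrative choices, not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({'date': ['jan', 'jan', 'jan', 'jan', 'feb', 'feb', 'feb', 'feb'],
                   'value': ['low', 'low', 'high', 'medium', 'medium', 'high', 'high', 'medium']})

# Cross-tabulate: one row per month, one column per category, cells hold counts.
counts = pd.crosstab(df['date'], df['value'])

# crosstab sorts the labels alphabetically, so reorder them explicitly.
counts = counts.reindex(index=['jan', 'feb'], columns=['low', 'medium', 'high'])
print(counts)
```

Calling counts.plot.bar(rot=0) on this table should draw one group of bars per month through matplotlib, which is one way to get the plot without seaborn.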
I have tried:
sns.countplot(x = 'date', hue = 'value', data = df.melt())
But this is my error:
ValueError: Could not interpret input 'date'
This can be done with matplotlib or seaborn.
CodePudding user response:
You can use sns.countplot to count items directly from the original dataframe (no df.melt() needed). Use hue= to separate out the value column. order= fixes the order of the x-values. Similarly, hue_order= sets the order of the hue categories. (By default, the order of appearance in the dataframe is used.) palette= can be, among other things, a dictionary that assigns a specific color to each category.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
df = pd.DataFrame({'date': ['jan', 'jan', 'jan', 'jan', 'feb', 'feb', 'feb', 'feb'],
                   'value': ['low', 'low', 'high', 'medium', 'medium', 'high', 'high', 'medium']})
sns.set()
ax = sns.countplot(data=df, x='date', hue='value',
                   # order=['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec'],
                   order=['jan', 'feb'], hue_order=['low', 'medium', 'high'],
                   palette={'low': 'limegreen', 'medium': 'gold', 'high': 'crimson'})
ax.set_xlabel('') # optionally remove 'date' label, as it is clear from the ticklabels
plt.tight_layout()
plt.show()
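As for the ValueError in the question, it most likely comes from df.melt(): melting with no arguments turns every column into rows, so the result no longer has a date column for seaborn to look up. The sketch below illustrates this under one assumption: it passes value_name='melted_value' (a hypothetical name chosen here), because recent pandas versions may reject the default value_name when the frame already has a column called value.

```python
import pandas as pd

df = pd.DataFrame({'date': ['jan', 'jan', 'jan', 'jan', 'feb', 'feb', 'feb', 'feb'],
                   'value': ['low', 'low', 'high', 'medium', 'medium', 'high', 'high', 'medium']})

# Melt with no id_vars: every original column is stacked into long format.
# 'melted_value' is a hypothetical name used only to avoid clashing with 'value'.
melted = df.melt(value_name='melted_value')

# The original column names now live in the 'variable' column as data,
# so 'date' is no longer a column that x='date' could refer to.
print(melted.columns.tolist())
print('date' in melted.columns)
print(melted['variable'].unique())
```

Passing data=df instead of the melted frame keeps date and value as columns, which is what the countplot call above relies on.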