Say we have a df with a column defined as a category:
import pandas as pd
df = pd.DataFrame({'Color': ['Yellow', 'Blue', 'Red', 'Red']}, dtype='category') # data type is category
Now say we want to plot these data while removing one of the categorical levels:
# Exclude Yellow, save in new df
df2 = df.loc[df.Color != 'Yellow']
# Plot
df2.value_counts().plot(kind='bar')
Output:
Although the bar for Yellow is not displayed, the Yellow tick label is still visible.
My question: How do we completely remove Yellow from the plot?
I suspect the issue is that the data type is category. But I don't want to convert the data type, because the category type is sometimes useful, e.g., for reordering levels or other operations.
An ideal solution would also work with seaborn, where I found a similar issue:
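One way to check that suspicion is to look at the categories stored on the filtered column. The snippet below is only a minimal sketch that rebuilds the example and uses `Series.value_counts` on the single column (rather than the DataFrame-level call above); the variable names `remaining_categories` and `counts` are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Color': ['Yellow', 'Blue', 'Red', 'Red']}, dtype='category')
df2 = df.loc[df.Color != 'Yellow']

# Filtering rows does not drop the category itself from the dtype
remaining_categories = list(df2['Color'].cat.categories)

# value_counts on a categorical Series reports unused categories with a count of 0
counts = df2['Color'].value_counts()

print(remaining_categories)
print(counts)
```

If Yellow still shows up in the categories with a count of 0, that would explain why its tick label is drawn without a bar.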
# Remake a df based on the above and plot with seaborn
df2=pd.DataFrame(df2.value_counts()).reset_index()
import seaborn as sns
from matplotlib import pyplot as plt
sns.catplot(data=df2, x=0, y='Color', kind='bar')
plt.show()
Output:
Dani Mesejo's answer works, but only with histograms, I believe, and I specifically need bar plots.
CodePudding user response:
Note:
Don't forget to import seaborn
import seaborn as sns
CodePudding user response:
You can convert the categorical values to strings just for the plot. The conversion is not in place, so the data types in df2 stay the same:
df2 = df[df['Color'] != 'Yellow']
df2.Color.astype(str).value_counts().plot(kind='bar')
Or you can use hist for that:
from matplotlib import pyplot as plt
df2 = df[df['Color'] != 'Yellow']['Color']
plt.hist(df2)
plt.xlabel('Color')
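Another option that keeps the category dtype, as the question asks, is to drop the now-unused level after filtering with pandas' `Series.cat.remove_unused_categories()`. This is a minimal sketch on the question's example data, not a tested answer for every pandas or seaborn version; the `.copy()` is only there so the column can be reassigned without a chained-assignment warning.

```python
import pandas as pd

df = pd.DataFrame({'Color': ['Yellow', 'Blue', 'Red', 'Red']}, dtype='category')

# Filter out Yellow, working on a copy so the column can be reassigned safely
df2 = df.loc[df['Color'] != 'Yellow'].copy()

# Drop categories that no longer occur in the data; the dtype stays 'category'
df2['Color'] = df2['Color'].cat.remove_unused_categories()

# Counts now contain only the levels that are actually present
counts = df2['Color'].value_counts()
print(counts)
```

Calling `counts.plot(kind='bar')` on this result should then draw only the Blue and Red bars. Since seaborn, as far as I know, takes the order of a categorical axis from the column's categories, the same step may also help with the catplot case, though I have not verified that here.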