Trying to create a grouped bar chart but the legend isn't displaying every category

Time:10-17

I'm trying to make a simple categorical bar plot showing how many extra-marital affairs people had, with the happiness rating on the x-axis and the age group as the hue (shown in the legend). This is my code:

import pandas as pd
import seaborn as sns

url = 'https://vincentarelbundock.github.io/Rdatasets/csv/AER/Affairs.csv'

affairs = pd.read_csv(url)
age_categorical = []
for row in affairs['age']:
    if 0<row<30:
        age_categorical.append("Under 30")
    elif 30<=row<=40:
        age_categorical.append("30 to 40 Years Old")
    elif 40<row<=55:
        age_categorical.append("41 to 55 years old")
    else:
        age_categorical.append("Older than 55")

affairs['age_categorical'] = age_categorical
# count number of affairs for each happiness rating 
affairs_subset = affairs.copy()
affairs_subset = affairs_subset[affairs_subset["affairs_dummy"] != 0]
affairs_rating = affairs_subset.groupby('rating').size()
# create dataframe table for plot
happiness = pd.DataFrame({
    'Happiness rating' : [1, 2, 3, 4, 5],
    'Number of Affairees': [8, 33, 27, 48, 34]
})
# put in barplot
plot = sns.barplot(x = 'Happiness rating', y='Number of Affairees', hue=affairs['age_categorical'],data = happiness).set(title='Happiness and extra-marital affairs')

The resulting bar plot doesn't show all the categories in the legend.


CodePudding user response:

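The values need to be in the same DataFrame so that they line up correctly. In the question, hue is taken from affairs (one row per respondent) while data is the five-row happiness table, so the two do not match. Instead, pass hue as the name of a column in the DataFrame you plot. You can also build the table you want to plot directly with a groupby: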

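# keep only respondents who reported at least one affair
affairs_subset = affairs[affairs["affairs"] != 0]
# total the affairs column for each rating / age-group pair
df = affairs_subset.groupby(['rating','age_categorical'])['affairs'].sum().reset_index()
# hue is now a column of the same DataFrame that is passed as data
sns.barplot(data=df, x="rating", y="affairs", hue="age_categorical")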
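
Note that the groupby above sums the affairs column, so each bar is the total of the reported affairs values. If the goal is instead the number of people who had at least one affair (the "Number of Affairees" in the question), counting rows with size() may be closer to what is wanted. The following is only a sketch under that assumption: the toy rows are made up for illustration and are not values from the real dataset, and the names count_people_with_affairs and n_people are hypothetical.

```python
import pandas as pd

# Hypothetical toy rows standing in for the Affairs data (values are made up).
toy = pd.DataFrame({
    "affairs": [0, 3, 1, 0, 7, 2, 1, 12],
    "rating": [5, 4, 4, 3, 2, 2, 1, 4],
    "age_categorical": [
        "Under 30", "Under 30", "30 to 40 Years Old", "30 to 40 Years Old",
        "41 to 55 years old", "41 to 55 years old", "Older than 55",
        "30 to 40 Years Old",
    ],
})


def count_people_with_affairs(df):
    """Count respondents with at least one affair per rating / age group."""
    # Drop respondents who reported no affairs.
    had_affair = df[df["affairs"] != 0]
    # size() counts rows per group; sum() would total the affairs values instead.
    return (
        had_affair.groupby(["rating", "age_categorical"])
        .size()
        .reset_index(name="n_people")
    )


counts = count_people_with_affairs(toy)
print(counts)
```

The resulting table can be plotted the same way as in the answer, for example sns.barplot(data=counts, x="rating", y="n_people", hue="age_categorical"), because the hue column again lives in the DataFrame passed as data.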

