Grouping variables to plot multilevel bar chart

Time:07-06

Can you please help me with the following? I have a dataset with a variable for the number of products (Prod), which takes the discrete values 1, 2 and 3. I also have a variable (Gender) that is 1 for males and 0 for females. I want to plot a multilevel bar chart with the number of products (Prod) on the x-axis and, on the y-axis, the total for each number of products, grouped by Gender. To do this, I need a 'count' variable that counts how many observations of each 'Prod' value fall into each 'Gender' category. To group and plot the variables I use the following code (which does not work):

#Group the variables
grouped_gender['count'] = main_data.groupby(['Prod', 'Gender'])[['Prod']].count()
grouped_gender = pd.DataFrame(grouped_gender)

#Plot
fig, axes = plt.subplots(nrows=1, ncols=1, figsize=(10, 7))

barplot2 = sns.barplot(
    data=grouped_gender,
    x='Prod',
    y='count',
    hue='Gender',
    orient='v',
    ax = axes,
    ci=None,
    dodge=False
    )

Can you please help me to identify the problem?

CodePudding user response:

Use sns.countplot on the original dataset. It counts the observations itself, so you do not need the groupby step:

import seaborn as sns

# sample dataset
df = sns.load_dataset('tips')

# `day` plays `Prod`, `sex` plays `Gender`
sns.countplot(x='day', hue='sex', data=df)


Note: if you want the data, not just the plot, use:

import pandas as pd

counts = pd.crosstab(df['day'], df['sex'])

# then to plot bar chart
# counts.plot.bar()

which gives you:

sex   Male  Female
day               
Thur    30      32
Fri     10       9
Sat     59      28
Sun     58      18
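If you would rather keep the sns.barplot call from the question, the counts need to be in long form, with Prod, Gender and count as ordinary columns. As far as the question's snippet shows, grouped_gender is assigned into before it exists, and the groupby result keeps Prod and Gender in the index, so barplot cannot find them as columns. Below is a minimal sketch of one way to build that table; the small main_data frame is a made-up stand-in for the real data:

```python
import pandas as pd

# Hypothetical stand-in for main_data from the question
main_data = pd.DataFrame({
    "Prod":   [1, 1, 2, 2, 2, 3, 3, 1],
    "Gender": [1, 0, 1, 1, 0, 0, 1, 1],
})

# size() counts the rows in each (Prod, Gender) group;
# reset_index() turns the group keys back into ordinary columns
# and names the column that holds the counts
grouped_gender = (
    main_data.groupby(["Prod", "Gender"])
    .size()
    .reset_index(name="count")
)
print(grouped_gender)
```

The resulting frame has the three columns (Prod, Gender, count) that the barplot call in the question refers to via x, hue and y.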

CodePudding user response:

Assuming you can get your DataFrame into a shape similar to mine (one row per product, one column per gender), you can plot it directly with pandas:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

grouped_gender = pd.DataFrame(
    {
        "Man": [50, 70],
        "Woman": [90, 30]
    },
    index=["Product1", "Product2"]
)

grouped_gender.plot(kind="bar", stacked=True)

plt.title("Products sales")
plt.xlabel("Products")
plt.ylabel("Sales")
plt.show()

This produces a stacked bar chart with one bar per product, split into Man and Woman segments.
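To get from raw rows like those in the question to a table of that shape, pd.crosstab is one option. This is only a sketch: the main_data frame below is invented for illustration, and the Female/Male labels follow the 0/1 coding described in the question.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for main_data from the question
main_data = pd.DataFrame({
    "Prod":   [1, 1, 2, 2, 2, 3, 3, 1],
    "Gender": [1, 0, 1, 1, 0, 0, 1, 1],
})

# Rows are Prod values, columns are Gender values, cells are counts
counts = pd.crosstab(main_data["Prod"], main_data["Gender"])

# Replace the 0/1 codes with readable legend labels
counts = counts.rename(columns={0: "Female", 1: "Male"})

# One bar per Prod value, stacked by gender
ax = counts.plot(kind="bar", stacked=True)
ax.set_xlabel("Number of products")
ax.set_ylabel("Count")
```

Call plt.show() afterwards to display the figure, or pass stacked=False to get the bars side by side instead.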
