How to draw a graph on python where there are two x-variables, one of which needs to be calculated f

Time:11-20

I have a CSV file with the results of a survey in which each respondent gave their age (0 to 100) and their mood (0 = happy, 1 = mid, 2 = sad). I want to make a bar chart in Python, using matplotlib or any other plotting library, with age on the x-axis, the number of people on the y-axis, and three bars per age showing how many happy, mid and sad people there are at that age. The problem is that the CSV file has no column holding these counts directly: neither the total number of people of each age nor the number of happy, mid or sad people of each age. Any tips on how to tackle this would be very helpful. The table below shows a few lines of the CSV file. Thanks.

Age  Mood level
12   0
83   1
55   2

CodePudding user response:

Suppose we have the following dataframe:

import pandas as pd
from matplotlib import pyplot as plt

df = pd.DataFrame(
    {
        "Age": [20, 16, 16, 20, 20, 16, 18, 18, 18, 20, 16, 16, 18, 18, 18, 20, 20],
        "Mood Level": [0, 2, 1, 2, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 2, 2, 1],
    }
)
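For the survey file in the question, the frame would be read from the CSV rather than typed in. A minimal sketch, assuming the file is comma-separated with the headers `Age` and `Mood Level` (the sample in the question shows neither the real delimiter nor the exact header spelling); the in-memory text and the name `df_from_csv` are stand-ins, not part of the original answer:

```python
import io

import pandas as pd

# Stand-in for the survey file; pass the real file path to read_csv instead.
csv_text = "Age,Mood Level\n12,0\n83,1\n55,2\n"

df_from_csv = pd.read_csv(io.StringIO(csv_text))

# Strip stray whitespace from the headers so later column lookups do not fail.
df_from_csv.columns = df_from_csv.columns.str.strip()

print(df_from_csv)
```

If the real file is separated by whitespace or tabs, the `sep` argument of `read_csv` would need to be set accordingly.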

Next, we one-hot encode Mood Level: we create indicator columns Mood_Level_0, Mood_Level_1 and Mood_Level_2 that hold 1 where the row has that mood level and 0 otherwise. This can be done via:

# dtype=int keeps the indicators as 0/1; recent pandas versions default to True/False
df = pd.concat([df, pd.get_dummies(df["Mood Level"], prefix="Mood_Level", dtype=int)], axis=1)

and will result in:

    Age  Mood Level  Mood_Level_0  Mood_Level_1  Mood_Level_2
0    20           0             1             0             0
1    16           2             0             0             1
2    16           1             0             1             0
3    20           2             0             0             1
4    20           0             1             0             0
5    16           1             0             1             0
6    18           0             1             0             0
7    18           1             0             1             0
8    18           1             0             1             0
9    20           0             1             0             0
10   16           0             1             0             0
11   16           1             0             1             0
12   18           0             1             0             0
13   18           1             0             1             0
14   18           2             0             0             1
15   20           2             0             0             1
16   20           1             0             1             0

Finally, we group by Age and sum the 1s in each of the indicator columns created above, which gives the number of people with each mood level at each age:

grouped_per_age = df.groupby("Age").agg(
    mood_level_0=("Mood_Level_0", "sum"),
    mood_level_1=("Mood_Level_1", "sum"),
    mood_level_2=("Mood_Level_2", "sum"),
)

This will result in:

     mood_level_0  mood_level_1  mood_level_2
Age
16              1             3             1
18              2             3             1
20              3             1             2
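The same counts can also be obtained in one step with `pd.crosstab`, which tabulates how often each (Age, Mood Level) pair occurs. This is an alternative to the encode-then-sum approach above, not part of the original answer; the names `counts` and `totals` are illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Age": [20, 16, 16, 20, 20, 16, 18, 18, 18, 20, 16, 16, 18, 18, 18, 20, 20],
        "Mood Level": [0, 2, 1, 2, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 2, 2, 1],
    }
)

# Rows are ages, columns are mood levels, cells are the number of respondents.
counts = pd.crosstab(df["Age"], df["Mood Level"])

# Total number of respondents per age, summed across the mood columns.
totals = counts.sum(axis=1)

print(counts)
print(totals)
```

Here the columns are labelled 0, 1 and 2 rather than mood_level_0 and so on, and `counts.plot.bar(rot=0)` would draw the same grouped bars.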

Plotting the above dataframe:

ax = grouped_per_age.plot.bar(rot=0)
plt.xlabel("Age")
plt.ylabel("Count")
plt.legend()
plt.show()
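To make the legend read happy/mid/sad instead of the raw column names, the columns can be renamed before plotting. A small sketch, using the mood coding from the question (0 = happy, 1 = mid, 2 = sad) and rebuilding the grouped counts by hand so it runs on its own; `mood_names` and `labelled` are illustrative names, and the Agg backend is chosen only so the code runs without a display:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()

import pandas as pd

# Grouped counts per age, as produced by the group-by step above.
grouped_per_age = pd.DataFrame(
    {
        "mood_level_0": [1, 2, 3],
        "mood_level_1": [3, 3, 1],
        "mood_level_2": [1, 1, 2],
    },
    index=pd.Index([16, 18, 20], name="Age"),
)

# Mood coding taken from the question: 0 = happy, 1 = mid, 2 = sad.
mood_names = {"mood_level_0": "happy", "mood_level_1": "mid", "mood_level_2": "sad"}
labelled = grouped_per_age.rename(columns=mood_names)

ax = labelled.plot.bar(rot=0)
ax.set_xlabel("Age")
ax.set_ylabel("Count")
ax.legend(title="Mood")
```

In an interactive session, `plt.show()` displays the figure; `ax.get_figure().savefig(...)` writes it to a file instead.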

This produces a grouped bar chart with one cluster of three bars (one per mood level) for each age.
