I need to make a stacked bar plot using this dataset (head shown):
import pandas as pd

# First five rows of the dataset
data = {'model': ['A1', 'A6', 'A1', 'A4', 'A3'],
        'year': [2017, 2016, 2016, 2017, 2019],
        'price': [12500, 16500, 11000, 16800, 17300],
        'transmission': ['Manual', 'Automatic', 'Manual', 'Automatic', 'Manual'],
        'mileage': [15735, 36203, 29946, 25952, 1998],
        'fuelType': ['Petrol', 'Diesel', 'Petrol', 'Diesel', 'Petrol'],
        'tax': [150, 20, 30, 145, 145],
        'mpg': [55.4, 64.2, 55.4, 67.3, 49.6],
        'engineSize': [1.4, 2.0, 1.4, 2.0, 1.0]}
df = pd.DataFrame(data)
  model  year  price transmission  mileage fuelType  tax   mpg  engineSize
0    A1  2017  12500       Manual    15735   Petrol  150  55.4         1.4
1    A6  2016  16500    Automatic    36203   Diesel   20  64.2         2.0
2    A1  2016  11000       Manual    29946   Petrol   30  55.4         1.4
3    A4  2017  16800    Automatic    25952   Diesel  145  67.3         2.0
4    A3  2019  17300       Manual     1998   Petrol  145  49.6         1.0
I would like the years (1997-2021) on the x-axis and values from 0 to 100 on the y-axis, representing percentages. Finally, I would like the three fuelTypes (Petrol, Diesel and Hybrid) shown as yearly proportions.
I've already done the following calculation to get the percentage per fuelType per year, and now I need to put it on a graph:
fuel_percentage = round((my_data_frame.groupby(['year'])['fuelType'].value_counts()/my_data_frame.groupby('year')['fuelType'].count())*100, 2)
print(fuel_percentage)
Which gives me the following result:
year  fuelType
1997  Petrol      100.00
1998  Petrol      100.00
2002  Petrol      100.00
2003  Diesel       66.67
      Petrol       33.33
2004  Petrol       80.00
      Diesel       20.00
2005  Petrol       71.43
      Diesel       28.57
2006  Petrol       66.67
      Diesel       33.33
2007  Petrol       56.25
      Diesel       43.75
2008  Diesel       66.67
      Petrol       33.33
etc...
My main worry is that, since the object is not a DataFrame, I won't be able to use it to make a plot.
Here is an example of the kind of plot I would like (replace players with fuelTypes and y-axis with percentages):
Thanks for the help!
CodePudding user response:
- Tested in python 3.8.11, pandas 1.3.3, matplotlib 3.4.3
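The result of `groupby(...).value_counts()` is a pandas Series with a two-level index (year, then fuelType), so it can be plotted once the fuelType level is moved into columns with `.unstack()`. The following is a minimal sketch of that idea, not necessarily the only approach. It assumes only the five sample rows from the question (so just Petrol and Diesel appear), uses `value_counts(normalize=True)` as a shorter equivalent of the division in the question, and the variable name `wide` is made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sample rows from the question; the full dataset covers more years and fuel types.
data = {'year': [2017, 2016, 2016, 2017, 2019],
        'fuelType': ['Petrol', 'Diesel', 'Petrol', 'Diesel', 'Petrol']}
df = pd.DataFrame(data)

# Share of each fuelType within each year, as a percentage.
fuel_percentage = (df.groupby('year')['fuelType']
                     .value_counts(normalize=True)
                     .mul(100)
                     .round(2))

# Move the inner index level (fuelType) into columns:
# one row per year, one column per fuel type, 0 where a fuel type is absent.
wide = fuel_percentage.unstack(fill_value=0)

# Stacked bars: each year's segments add up to 100.
ax = wide.plot(kind='bar', stacked=True, figsize=(10, 5))
ax.set_xlabel('year')
ax.set_ylabel('percentage')
ax.set_ylim(0, 100)
ax.legend(title='fuelType', bbox_to_anchor=(1, 1), loc='upper left')
plt.tight_layout()
```

Call `plt.show()` afterwards to display the figure. With the full dataset the same steps should apply to `my_data_frame`, and a Hybrid column would appear automatically if that value is present. Years with no rows are simply missing from the x-axis; if every year from 1997 to 2021 should be shown, `wide.reindex(range(1997, 2022), fill_value=0)` before plotting is one option.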