I am trying to compare the sums of different complaint types for the top 5 cities by number of complaints. However, I can't seem to do it correctly. I managed to plot each complaint type individually per city, but my goal is to plot the different complaint types in a single bar chart. This is a sample of the data:
customer ID | city | num_money_complaint | num_food_complaint |
---|---|---|---|
0 | city0 | 0 | 2 |
1 | city1 | 2 | 1 |
2 | city0 | 4 | 1 |
3 | city2 | 1 | 1 |
4 | city2 | 2 | 3 |
5 | city2 | 0 | 1 |
6 | city3 | 3 | 4 |
7 | city4 | 3 | 1 |
This is not an accurate graph, just an illustration of how I want the plot to look.
First I tried groupby like this:
(df.
groupby(['num_money_complaint','num_food_complaint'])['city']
.sum()
.unstack()
).plot.bar(width=0.7)
But this put the complaint counts on the x-axis, instead of showing the sum of each complaint type as the y value with a separate bar per type.
Then I tried it this way with groupby:
df=df.groupby('city')['num_money_complaint','num_food_complaint'].sum()
f, ax = plt.subplots(figsize=(8,5))
df.nlargest(10 , columns=['num_money_complaint','num_food_complaint']).plot.bar()
This gave me a wrong result: it only showed the number of money complaints, as you can see in the graph.
I have also tried many other ways that did not work. Please help.
CodePudding user response:
Your approach is pretty close to a working solution. You were just grouping by the wrong column: group by city and sum the complaint columns.
Please try:
df.groupby('city').sum().plot.bar(width=0.7)
which will create the figure below:
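One caveat, as a minimal sketch rather than a definitive fix: if the customer ID is a regular numeric column and not the index, summing everything per city would also add up the IDs and plot them as a third bar. The example below rebuilds the sample rows from the question (the column name `customer_id` is an assumption) and selects the two complaint columns with a list before summing.

```python
import pandas as pd

# Sample rows from the question; "customer_id" is an assumed name for the ID column.
df = pd.DataFrame({
    "customer_id": [0, 1, 2, 3, 4, 5, 6, 7],
    "city": ["city0", "city1", "city0", "city2", "city2", "city2", "city3", "city4"],
    "num_money_complaint": [0, 2, 4, 1, 2, 0, 3, 3],
    "num_food_complaint": [2, 1, 1, 1, 3, 1, 4, 1],
})

# Select the complaint columns with a list (double brackets) before summing,
# so the ID column is not summed along with them.
per_city = df.groupby("city")[["num_money_complaint", "num_food_complaint"]].sum()
print(per_city)
```

Calling `per_city.plot.bar(width=0.7)` on this result should then draw one group of two bars per city.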
CodePudding user response:
Seaborn is pretty easy to use for such tasks, look in the
UPDATE
I had another look into plotting directly from df with matplotlib as the backend.
With less data transformation you can also do it like this (result is the same, the style is just a bit different)
(df
.groupby('city')
.sum()
.reset_index()
.plot(x='city', y=['num_money_complaint', 'num_food_complaint'], kind='bar', width=0.7)
)
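Neither snippet above limits the plot to the top 5 cities that the question asks about. A rough sketch of one way to do that, assuming "top" means the largest combined count over both complaint types: rank the cities by their row totals, keep the `n` largest, then plot. The helper name `top_cities_by_complaints` and the `customer_id` column are hypothetical, introduced only for this illustration.

```python
import pandas as pd

complaint_cols = ["num_money_complaint", "num_food_complaint"]

def top_cities_by_complaints(df, n=5):
    """Hypothetical helper: per-city sums for the n cities with the most complaints."""
    per_city = df.groupby("city")[complaint_cols].sum()
    # Rank cities by their combined complaint count and keep the n largest.
    top = per_city.sum(axis=1).nlargest(n).index
    return per_city.loc[top]

# Sample rows from the question; "customer_id" is an assumed column name.
df = pd.DataFrame({
    "customer_id": range(8),
    "city": ["city0", "city1", "city0", "city2", "city2", "city2", "city3", "city4"],
    "num_money_complaint": [0, 2, 4, 1, 2, 0, 3, 3],
    "num_food_complaint": [2, 1, 1, 1, 3, 1, 4, 1],
})

# One group of bars per city, one bar per complaint type.
ax = top_cities_by_complaints(df).plot.bar(width=0.7, figsize=(8, 5))
ax.set_ylabel("number of complaints")
```

If "top" should instead mean the top cities for a single complaint type, ranking by that one column (for example `per_city["num_money_complaint"].nlargest(n)`) would be the corresponding change.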