Pandas/Matplotlib: How do I plot in groups and color code based on another column?

Time:06-07

I hoped this would be very simple, but I wasted way too much time on this already. There has to be a simple way of doing this.

I have a very simple dataframe:

[image: screenshot of the dataframe]

I simply want to plot a bar chart grouped by the column "data_range", so that each group has three bars showing the mean values for the three "trade_types".

df.groupby('data_range')['mean'].plot(legend=True)

The closest I got to making this happen was with the code above. It returned this plot: [image: the resulting plot]

This is already close, except that I want bars, I want each group labeled with its data_range, and I want each trade_type to have the same color in every group (and to appear in the legend). If I use .bar after .plot, I get three separate plots instead of one. How do I create a single bar plot that shows each data_range group so that the groups can be compared?

CodePudding user response:

You can pivot your table first; the bar plot will then work the way you want.

import pandas as pd

#making a table like yours but with different values
df = pd.DataFrame({
    'data_range':['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'trade_type':['T1', 'T2', 'T3', 'T1', 'T2', 'T3', 'T1', 'T2', 'T3'],
    'mean':[17, 11, 18, 15, 15, 11, 11,  6, 16],
})

#pivot the table so each trade type is a column
piv_df = df.pivot(index='data_range',columns='trade_type',values='mean')

#print(piv_df) #this is what the pivoted table looks like
#trade_type  T1  T2  T3
#data_range
#A           17  11  18
#B           15  15  11
#C           11   6  16


piv_df.plot.bar()

[image: grouped bar chart with one group per data_range and one colored bar per trade_type]
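One caveat with the pivot approach: df.pivot raises a ValueError if the same (data_range, trade_type) pair appears more than once. If your real table has such duplicates, pivot_table can aggregate them first. The following is only a sketch: the sample values are invented, and averaging the duplicates is an assumption about what you would want. It also labels the axes and the legend, which the question asked for.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook
import pandas as pd

# invented data in which the pair ('A', 'T1') occurs twice
raw = pd.DataFrame({
    'data_range': ['A', 'A', 'A', 'B', 'B'],
    'trade_type': ['T1', 'T1', 'T2', 'T1', 'T2'],
    'mean': [10.0, 20.0, 11.0, 15.0, 15.0],
})

# raw.pivot(...) would raise ValueError here because of the duplicate pair;
# pivot_table averages the duplicates instead (assumed to be the desired behavior)
piv = raw.pivot_table(
    index='data_range', columns='trade_type', values='mean', aggfunc='mean'
)

# one group per data_range, one colored bar per trade_type
ax = piv.plot.bar(rot=0)  # rot=0 keeps the group labels horizontal
ax.set_xlabel('data_range')
ax.set_ylabel('mean')
ax.legend(title='trade_type')
```

In a script you would finish with matplotlib.pyplot.show() or ax.figure.savefig(...) to see the result.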

There is also a plotting library called seaborn, built on top of matplotlib, which offers more customization than the pandas built-in plots. It accepts the long-format table directly, so no pivot is needed. Here is an example of how the same plot could be made in seaborn:

import seaborn as sns
import pandas as pd

#making a table like yours but with different values
df = pd.DataFrame({
    'data_range':['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'trade_type':['T1', 'T2', 'T3', 'T1', 'T2', 'T3', 'T1', 'T2', 'T3'],
    'mean':[17, 11, 18, 15, 15, 11, 11,  6, 16],
})

sns.barplot(
    x = 'data_range',
    y = 'mean',
    hue = 'trade_type',
    data = df,
)
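For comparison, the same grouped bars can also be drawn with matplotlib directly, which shows what the pandas and seaborn calls handle for you: offsetting the bars within each group and calling bar once per trade_type so that each type gets a single color and legend entry. This is a rough sketch using the same made-up table; the total group width of 0.8 is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# same made-up table as above
df = pd.DataFrame({
    'data_range': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'trade_type': ['T1', 'T2', 'T3', 'T1', 'T2', 'T3', 'T1', 'T2', 'T3'],
    'mean': [17, 11, 18, 15, 15, 11, 11, 6, 16],
})

ranges = sorted(df['data_range'].unique())  # one group per data_range
types = sorted(df['trade_type'].unique())   # one bar per trade_type in each group
x = np.arange(len(ranges))                  # left position of each group
width = 0.8 / len(types)                    # 0.8 is an arbitrary total group width

fig, ax = plt.subplots()
for i, t in enumerate(types):
    # heights for this trade_type, ordered to match `ranges`
    heights = (
        df[df['trade_type'] == t]
        .set_index('data_range')['mean']
        .reindex(ranges)
        .to_numpy()
    )
    # one bar() call per trade_type gives it a single color and legend entry
    ax.bar(x + i * width, heights, width, label=t)

# center the tick labels under each group
ax.set_xticks(x + width * (len(types) - 1) / 2)
ax.set_xticklabels(ranges)
ax.set_xlabel('data_range')
ax.set_ylabel('mean')
ax.legend(title='trade_type')
```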