I want to plot my table as a horizontal bar chart using matplotlib's barh
function. The table is a pandas dataframe with the columns test, count, and species,
where test is the independent variable and count is the dependent variable.
I want the bars to be a different color for each species
(blue, green, orange, etc.), so I tried this code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np  # needed for np.linspace and np.random below
df = pd.DataFrame(data={
    'test': np.linspace(1, 16, 16).astype(int),
    'count': np.random.randint(30, 120, 16),
    'species': ['species 1', 'species 1', 'species 1', 'species 2',
                'species 1', 'species 1', 'species 2', 'species 2',
                'species 1', 'species 2', 'species 2', 'species 2',
                'species 1', 'species 1', 'species 1', 'species 1'],
})
plt.figure(figsize=(14,6))
plt.barh(data=df, y='test', width='count')
# plt.barh(data=df, y='test', width='count', color='species', color_map={'species 1':'b', 'species 2':'g'})
plt.xlabel('frequency')
plt.ylabel('test number')
plt.show()
But I can only get the bars to be one color.
CodePudding user response:
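Make species the index of the dataframe, then use .loc to select each species
and plot it with its own barh call. Each call takes the next color from
matplotlib's default color cycle, so the two species come out in different colors.
I also added a legend, but feel free to remove it: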
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df = pd.DataFrame(data={
    'test': np.linspace(1, 16, 16).astype(int),
    'count': np.random.randint(30, 120, 16),
    'species': ['species 1', 'species 1', 'species 1', 'species 2',
                'species 1', 'species 1', 'species 2', 'species 2',
                'species 1', 'species 2', 'species 2', 'species 2',
                'species 1', 'species 1', 'species 1', 'species 1'],
})
# Set the index to species
df.set_index('species', inplace=True)
# Separately graph each species
plt.barh(data=df.loc['species 1'], y='test', width='count', label='species 1')
plt.barh(data=df.loc['species 2'], y='test', width='count', label='species 2')
# Add a legend to the graph
plt.legend()
# plt.barh(data=df, y='test', width='count', color='species', color_map={'species 1':'b', 'species 2':'g'})
plt.xlabel('frequency')
plt.ylabel('test number')
plt.show()
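If you would rather pick the colors yourself and keep the dataframe as it is, one possible alternative is to map the species column to a color per row and pass that list to barh in a single call. This is only a sketch: the helper name barh_by_species and the color_map dictionary are made up for illustration, and it assumes every value in species has an entry in color_map. Because all bars are drawn in one call, the legend entries are built by hand from Patch objects.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.patches import Patch


def barh_by_species(df, color_map):
    """Draw one horizontal bar per row, colored by that row's species.

    barh_by_species and color_map are illustrative names, not matplotlib API.
    """
    # Look up a color for every row; assumes each species is a key in color_map.
    bar_colors = list(df['species'].map(color_map))
    fig, ax = plt.subplots(figsize=(14, 6))
    bars = ax.barh(df['test'], df['count'], color=bar_colors)
    # A single barh call has no per-species labels, so build legend handles manually.
    handles = [Patch(color=color, label=name) for name, color in color_map.items()]
    ax.legend(handles=handles)
    ax.set_xlabel('frequency')
    ax.set_ylabel('test number')
    return ax, bars


if __name__ == '__main__':
    df = pd.DataFrame(data={
        'test': np.linspace(1, 16, 16).astype(int),
        'count': np.random.randint(30, 120, 16),
        'species': ['species 1', 'species 1', 'species 1', 'species 2',
                    'species 1', 'species 1', 'species 2', 'species 2',
                    'species 1', 'species 2', 'species 2', 'species 2',
                    'species 1', 'species 1', 'species 1', 'species 1'],
    })
    barh_by_species(df, {'species 1': 'b', 'species 2': 'g'})
    plt.show()
```

Adding a third species then only requires another entry in the dictionary, for example 'species 3': 'orange'.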