I have 4 dataframes:
df1 = pd.DataFrame({'ID': [0, 0, 0, 0, 0, 0],
'value': [3.0, 3.5, 4.5, NaN, 7.0, 8.1]})
df2 = pd.DataFrame({'ID': [1, 1, 1, 1, 1, 1],
'value': [9.4, NaN, 4.5, 2.4, 4.0, 3.9]})
df3 = pd.DataFrame({'ID': [2, 2, 2],
'value': [1.0, 3.9, 4.1]})
df4 = pd.DataFrame({'ID': [3, 3, 3, 3],
'value': [NaN, NaN, 5.8, 3.0]})
I want to make a boxplot of the values in the value column of each dataframe.
I tried the following:
fig, ax2 = plt.subplots()
vec = [df1['value'].values,df2['value'].values,df3['value'].values,df4['value'].values]
labels = ['ID_0','ID_1', 'ID_2', 'ID_3']
ax2.boxplot(vec, labels = labels)
ax2.set_title('Values')
plt.show()
But it doesn't work and gives me an empty plot. Is there a better way to do this?
CodePudding user response:
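NaN is not a built-in Python name, so use np.nan instead (add import numpy as np if you have not already). You also need to call dropna() before plotting: boxplot does not skip missing values, so a column that contains NaN produces an empty box. Making the changes...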
df1 = pd.DataFrame({'ID': [0, 0, 0, 0, 0, 0], 'value': [3.0, 3.5, 4.5, np.nan, 7.0, 8.1]}).dropna()
df2 = pd.DataFrame({'ID': [1, 1, 1, 1, 1, 1], 'value': [9.4, np.nan, 4.5, 2.4, 4.0, 3.9]}).dropna()
df3 = pd.DataFrame({'ID': [2, 2, 2], 'value': [1.0, 3.9, 4.1]}).dropna()
df4 = pd.DataFrame({'ID': [3, 3, 3, 3],'value': [np.nan, np.nan, 5.8, 3.0]}).dropna()
fig, ax2 = plt.subplots()
vec = [df1['value'].values,df2['value'].values,df3['value'].values,df4['value'].values]
labels = ['ID_0','ID_1', 'ID_2', 'ID_3']
ax2.boxplot(vec, labels = labels)
ax2.set_title('Values')
plt.show()
gives you a plot with one box for each of ID_0 to ID_3.
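If you would rather keep the NaN rows in the original dataframes, a possible variation is to drop them only at plotting time. This is a minimal sketch, not the only way to do it: the names frames and result are my own, and I set the tick labels with set_xticklabels rather than the labels argument, which I believe has been renamed to tick_labels in recent Matplotlib releases.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df1 = pd.DataFrame({'ID': [0, 0, 0, 0, 0, 0],
                    'value': [3.0, 3.5, 4.5, np.nan, 7.0, 8.1]})
df2 = pd.DataFrame({'ID': [1, 1, 1, 1, 1, 1],
                    'value': [9.4, np.nan, 4.5, 2.4, 4.0, 3.9]})
df3 = pd.DataFrame({'ID': [2, 2, 2],
                    'value': [1.0, 3.9, 4.1]})
df4 = pd.DataFrame({'ID': [3, 3, 3, 3],
                    'value': [np.nan, np.nan, 5.8, 3.0]})

# Hypothetical helper list so the four frames can be handled in one loop
frames = [df1, df2, df3, df4]

# Drop NaN per column just before plotting; the dataframes stay unchanged
vec = [df['value'].dropna().to_numpy() for df in frames]

# Build the labels from the ID column instead of typing them by hand
labels = [f"ID_{df['ID'].iloc[0]}" for df in frames]

fig, ax2 = plt.subplots()
result = ax2.boxplot(vec)       # one box per array in vec
ax2.set_xticklabels(labels)     # label the four box positions
ax2.set_title('Values')
```

Call plt.show() afterwards to display the figure. The dataframes still contain their NaN rows, so any later analysis sees the full data.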