I want to add error bars to a line plot I made with pandas's .plot() function:
scores_xgb.plot(x='size', y='MSE_mean_tot', kind='line',logx=True,title='XGBoost 5 samples, 5 fold CV')
Running this gives me the following plot:
To plot the error bars, I chose to use .errorbar() from Matplotlib. When running the cell
plt.errorbar(x=scores_xgb.size, y=scores_xgb.MSE_mean_tot, yerr=std_xgb ,title='XGBoost 5 samples, 5 fold CV')
plt.show()
I receive the following error message:
ValueError: 'x' and 'y' must have the same size
This confuses me, as I use the same DataFrame in both examples, each time using the same variables for x and y respectively, so they have the same size (12) both times.
NB: yerr = std_xgb also has size 12.
CodePudding user response:
pandas.DataFrame objects have a property named size. It is a single number, equal to the number of elements in the DataFrame (the product of the values in df.shape). You're trying to access a column named size, but with attribute access (scores_xgb.size) pandas resolves the property named size before it looks for a column named size. Since that single number is treated as an array of length 1 while the columns in the DataFrame have a length of 12, you get a size mismatch.
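To see the name clash in isolation, here is a minimal sketch. The DataFrame df is a hypothetical stand-in for scores_xgb (its values are made up); only the column names are taken from the question.

```python
import pandas as pd

# Hypothetical stand-in for scores_xgb: 12 rows, one column named "size".
df = pd.DataFrame({
    "size": [2 ** i for i in range(12)],
    "MSE_mean_tot": [1.0 / (i + 1) for i in range(12)],
})

# Attribute access resolves to the built-in DataFrame.size property:
# rows * columns = 12 * 2 = 24, a single number rather than a column.
n_elements = df.size

# Bracket indexing always returns the column, a Series of length 12.
size_column = df["size"]

print(n_elements)        # 24
print(len(size_column))  # 12
```

The same clash applies to any column whose name matches an existing DataFrame attribute (shape, index, count and so on), which is why bracket indexing is the safer habit.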
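Instead, use strings to index the DataFrame and get the columns. Note that plt.errorbar() has no title keyword either, so set the title separately with plt.title():
plt.errorbar(x=scores_xgb['size'], y=scores_xgb['MSE_mean_tot'], yerr=std_xgb)
plt.title('XGBoost 5 samples, 5 fold CV')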
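For a complete picture, here is a self-contained sketch that also reproduces the logarithmic x-axis from the original pandas plot. The data in scores and std is invented for illustration (the real scores_xgb and std_xgb are not shown in the question), and the Agg backend plus the output file name are only assumptions so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data standing in for scores_xgb and std_xgb (12 points each).
scores = pd.DataFrame({
    "size": np.logspace(1, 4, 12),
    "MSE_mean_tot": np.linspace(1.0, 0.2, 12),
})
std = np.full(12, 0.05)

fig, ax = plt.subplots()

# Bracket indexing passes the 12-element columns, not the DataFrame.size property.
container = ax.errorbar(scores["size"], scores["MSE_mean_tot"], yerr=std, capsize=3)

ax.set_xscale("log")  # counterpart of logx=True in DataFrame.plot
ax.set_title("XGBoost 5 samples, 5 fold CV")  # the title is set on the axes, not in errorbar
ax.set_xlabel("size")
ax.set_ylabel("MSE_mean_tot")

fig.savefig("xgb_errorbar.png")  # hypothetical output file name
```

In an interactive session you would likely call plt.show() instead of saving the figure.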