I want to plot some statistics for each region. I have nested 'for' loops: the inner loop generates the statistics, and the outer loop selects the region and plots its results. I'm not sure why my code plots data from all regions into the same figure instead of one figure per region.
Yrstat = []
for j in Regionlist:
    for i in Yrlist:
        dfnew = df.loc[(df['Yr'] == i) & (df['Region'] == j)]
        if not dfnew.empty:
            # calculate the confidence interval and mean for data in each year
            CI = scipy.stats.norm.interval(alpha=0.95, loc=np.mean(dfnew['FluxTot']), scale=scipy.stats.sem(dfnew['FluxTot']))
            list(CI)
            mean = np.mean(dfnew['FluxTot'])
            Yrstat.append((i, mean, CI[0], CI[1]))
    # convert stats list to a dataframe
    yrfullinfo = pd.DataFrame(Yrstat, columns=['Yr', 'mean', 'CI-', 'CI+'])
    # making figures
    fig, ax = plt.subplots()
    ax.plot(yrfullinfo['Yr'], yrfullinfo['mean'], label='mean')
    ax.plot(yrfullinfo['Yr'], yrfullinfo['CI-'], label='95%CI')
    ax.plot(yrfullinfo['Yr'], yrfullinfo['CI+'], label='95%CI')
    ax.legend()
    # exporting figures
    filename = "C:/Users/Christina/Desktop/python test/Summary in {}.png".format(j)
    fig.savefig(filename)
    plt.close(fig)
CodePudding user response:
The figure code isn't the problem: the script correctly saves one PNG file per region, each with its own plot. The problem is your data.
You initialize Yrstat = [] outside both loops. You then append to it on every pass of the inner loop, across all iterations of the outer loop, and plot the "new" DataFrame yrfullinfo built from it. That DataFrame grows with each region, so every figure also contains the rows from all earlier regions.
You need a fresh list of values for each region. That's why I moved Yrstat = [] into the outer loop, so it is reinitialized for every region.
for j in Regionlist:
    Yrstat = []
    for i in Yrlist:
        dfnew = dfmerge.loc[(dfmerge['Yr'] == i) & (dfmerge['Region'] == j)]
        if not dfnew.empty:
            # get all statistics for data in each year
            CI = st.norm.interval(alpha=0.95, loc=np.mean(dfnew['FluxTot']), scale=st.sem(dfnew['FluxTot']))
            list(CI)
            mean = np.mean(dfnew['FluxTot'])
            Yrstat.append((j, i, mean, CI[0], CI[1]))
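To see the effect in isolation, here is a minimal sketch that uses only the standard library, so it runs without pandas or matplotlib. The sample rows, the region names, and the helper yearly_means are made up for illustration and are not part of the original script. It records what each region's figure would receive when the list is shared versus reset per region.

```python
from statistics import mean

# Hypothetical sample data: (region, year, flux) rows standing in for df.
rows = [
    ("North", 2019, 1.0), ("North", 2019, 3.0), ("North", 2020, 5.0),
    ("South", 2019, 10.0), ("South", 2020, 20.0), ("South", 2020, 40.0),
]
regions = ["North", "South"]
years = [2019, 2020]


def yearly_means(reset_per_region):
    """Return {region: list of (year, mean)} as it looks at plot time."""
    seen_at_plot_time = {}
    stats = []  # one shared list, as in the question
    for region in regions:
        if reset_per_region:
            stats = []  # fresh list for every region, as in the answer
        for year in years:
            values = [f for r, y, f in rows if r == region and y == year]
            if values:  # skip years with no data for this region
                stats.append((year, mean(values)))
        # Snapshot of what would be plotted for this region.
        seen_at_plot_time[region] = list(stats)
    return seen_at_plot_time


shared = yearly_means(reset_per_region=False)
fresh = yearly_means(reset_per_region=True)

print(len(shared["South"]))  # 4: South's figure also contains North's rows
print(len(fresh["South"]))   # 2: only South's rows
```

With the shared list, the second region is plotted with the first region's rows still in it. With the reset, each region only sees its own rows.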