I have a sample data frame created from the columns of two different data frames.
The code for that looks like this:
import pandas as pd
pvgis_df = pd.read_csv(pvgis_file)
month = pd.Series(pvgis_df["Month"],)
pvgis_generated = pd.Series(pvgis_df["Avg Monthly Energy Production"],)
pvoutput_generated = pd.Series(pvoutput_df["Generated (KWh)"],)
frame = {
    "Month": month,
    "PVGIS Generated": pvgis_generated,
    "PVOUTPUT Generated": pvoutput_generated,
}
joined_df = pd.DataFrame(frame)
And the output looks like this:
    Month  PVGIS Generated  PVOUTPUT Generated
0     1.0        107434.69        80608.001709
1     2.0        112428.41       106485.000610
2     3.0        153701.40       132772.003174
3     4.0        179380.47       148830.993652
4     5.0        200402.90       177705.001831
5     6.0        211507.83       173893.005371
6     7.0        233932.95       182261.993408
7     8.0        223986.41       174046.005249
8     9.0        178682.94       142970.993042
9    10.0        142141.02       107087.997437
10   11.0        108498.34        73358.001709
11   12.0        101886.06        73003.997803
Now I want to plot the other columns against Month, and my code looks like this:
from matplotlib import pyplot as plt
label = [
    df["Month"], df["PVGIS Generated"],
    df["PVOUTPUT Generated"]
]
figure_title = f"{plt.xlabel} VS {plt.ylabel}"
fig = plt.figure(figure_title)
fig.set_size_inches(13.6, 7.06)
plot_no = df.shape
filename = f"{folder}_joined"
color="blue"
plt.legend()
plt.xlabel("Month")
plt.ylabel("Generated")
plt.grid()
plt.margins(x=0)
plt.ticklabel_format(useOffset=False, axis="y", style="plain")
plt.bar(df[label[0]], df[label[1]])
plt.bar(df[label[0]], df[label[2]])
plt.show()
plt.close()
When I run it, I get a KeyError:
KeyError: "None of [Float64Index([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0], dtype='float64')] are in the [columns]"
I have tried reindexing and making the Month column the index, but I keep running into different versions of the KeyError.
What am I missing?
Does this mean the column is missing from the dataframe? If so, how come?
CodePudding user response:
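The error occurs because label holds the dataframe Series themselves rather than the column names. An expression such as df[label[0]] then passes the Month values (1.0, 2.0, ...) to df[...] as if they were column labels, and none of them exist, hence the KeyError. The columns are not missing. List only the column names instead: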
label = ["Month", "PVGIS Generated", "PVOUTPUT Generated"]
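
To illustrate the difference, here is a minimal sketch rather than a drop-in replacement for the original script. It assumes a small stand-in dataframe built from the first three rows of the output above (PVOUTPUT values rounded). The bar width and the side-by-side offset are hypothetical choices added for illustration and are not part of the question's code.

```python
import pandas as pd
from matplotlib import pyplot as plt

# Stand-in for joined_df: first three rows of the question's output (rounded).
df = pd.DataFrame({
    "Month": [1.0, 2.0, 3.0],
    "PVGIS Generated": [107434.69, 112428.41, 153701.40],
    "PVOUTPUT Generated": [80608.00, 106485.00, 132772.00],
})

# Indexing with a Series makes pandas look up its values as column labels,
# which reproduces the KeyError from the question.
try:
    df[df["Month"]]
    raised = False
except KeyError:
    raised = True

# Indexing with the column names selects the columns as intended.
label = ["Month", "PVGIS Generated", "PVOUTPUT Generated"]

fig, ax = plt.subplots()
width = 0.4  # hypothetical width so the two series sit side by side
ax.bar(df[label[0]] - width / 2, df[label[1]], width=width, label=label[1])
ax.bar(df[label[0]] + width / 2, df[label[2]], width=width, label=label[2])
ax.set_xlabel("Month")
ax.set_ylabel("Generated")
ax.legend()  # called after the bars are drawn, so there are labels to show
```

Call plt.show() afterwards to display the figure. Drawing both series at the same x positions, as in the question, also works once label holds names, but the second set of bars is then drawn on top of the first.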