I have a data set (CSV file) with monthly benzine prices in Norway from 1986 to 2022. I want to make a new data frame with an average benzine price per year. Furthermore, I managed to do it, but is there a more efficient way to do it? A link to the CSV file is given. Here is what I did,
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
bensin=pd.read_csv("C:/Users/.../BensinPris.csv",sep=',', delimiter=';')
bensinC=bensin.copy()
#Replace comma with dot in bensin price
bensinC.apply(lambda x: x.str.replace(',','.'))
bensinC= bensinC.stack().str.replace(',','.').unstack()
#Convert prices to floats from strings
bensinC['Priser (kr per liter)']=[float(x) for x in bensinC['Priser (kr per liter)']]
# Make a list of average bensin price each year
average=[]
yrs=np.arange(1986,2023,dtype='int')
for i in yrs:
av=bensinC[[x[0:4]==str(i) for x in bensinC['ÅrMåned']]]['Priser (kr per liter)'].mean()
average.append(np.round(av,2))
# Make a data frame for average bensin price
zipped = list(zip(yrs,average))
benisin_average=pd.DataFrame(zipped,columns=['Year','Average bensin price'])
benisin_average
# ploting
X=benisin_average['Year']
y=benisin_average['Average bensin price']
plt.plot(X[:],y[:])
plt.xlabel('Time(Year)')
plt.ylabel('Average bensin price (Norwegian krone/Liter)')
plt.show()