I have a pandas dataframe that you can see in this screenshot:
It has a time resolution of 1 minute. I would like to aggregate the first 5 columns and calculate the rolling average and for the 5th column ("Availability of the EV") I would just like to take the value every 5 minutes. For this I have the following code:
import pandas as pd
originalData = pd.read_csv('C:/Users/Desktop/Test_1Minute.csv', sep =";")
originalData['Time'] = pd.to_datetime(originalData['Time'], format = '%d.%m.%Y %H:%M')
adjustedData = originalData.set_index('Time').asfreq('5Min')
#Aggregate
adjustedData = originalData['Space Heating [W]'].resample('5Min').mean()
adjustedData['Electricity [W]'] = adjustedData['Electricity [W]'].resample('5Min').mean()
adjustedData['PV [nominal]'] = adjustedData['PV [nominal]'].resample('5Min').mean()
adjustedData['DHW [W]'] = adjustedData['DHW [W]'].resample('5Min').mean()
#Write new dataframe to csv
adjustedData.to_csv('C:/Users/Desktop/Test_5Minutes.csv', sep =";")
When I run this code I get the following error
"TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'"
Do you know why this issue occurs and how to solve it?
CodePudding user response:
The problem with your provided code is you set the index on a new dataframe adjustedData
, but then resample the originalData
. Below is the original resampled code (which is not a rolling mean!), that aggregates all columns; and an actual rolling mean for the first columns, retaining the original values for the last column.
import pandas as pd
originalData = pd.read_csv('C:/Users/Desktop/Test_1mintue.csv', sep =";"), sep =";")
originalData['Time'] = pd.to_datetime(originalData['Time'], format = '%d.%m.%Y %H:%M')
# grouping by 5 min windows:
adjustedData = originalData.set_index('Time').resample("5Min").mean()
# using rolling mean for specific columns
mean_cols = ['Space Heating [W]', 'Electricity [W]','PV [nominal]', 'DHW [W]']
adjustedData = originalData.set_index('Time').assign(**{col: lambda dataf: dataf[col].rolling('5Min').mean() for col in mean_cols})
#Write new dataframe to csv
adjustedData.to_csv('C:/Users/Desktop/Test_5mintues.csv', sep =";")