How to aggregate values in Pandas?-CodePudding

I have a pandas dataframe that you can see in this screenshot:

It has a time resolution of 1 minute. I would like to aggregate the first 5 columns and calculate the rolling average and for the 5th column ("Availability of the EV") I would just like to take the value every 5 minutes. For this I have the following code:

import pandas as pd


originalData = pd.read_csv('C:/Users/Desktop/Test_1Minute.csv', sep =";")
originalData['Time'] = pd.to_datetime(originalData['Time'], format = '%d.%m.%Y %H:%M')

adjustedData = originalData.set_index('Time').asfreq('5Min')
#Aggregate
adjustedData = originalData['Space Heating [W]'].resample('5Min').mean()
adjustedData['Electricity [W]'] = adjustedData['Electricity [W]'].resample('5Min').mean()
adjustedData['PV [nominal]'] = adjustedData['PV [nominal]'].resample('5Min').mean()
adjustedData['DHW [W]'] = adjustedData['DHW [W]'].resample('5Min').mean()

#Write new dataframe to csv 
adjustedData.to_csv('C:/Users/Desktop/Test_5Minutes.csv', sep =";")

When I run this code I get the following error

"TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'RangeIndex'"

Do you know why this issue occurs and how to solve it?

CodePudding user response：

The problem with your provided code is you set the index on a new dataframe adjustedData, but then resample the originalData. Below is the original resampled code (which is not a rolling mean!), that aggregates all columns; and an actual rolling mean for the first columns, retaining the original values for the last column.

import pandas as pd


originalData = pd.read_csv('C:/Users/Desktop/Test_1mintue.csv', sep =";"), sep =";")
originalData['Time'] = pd.to_datetime(originalData['Time'], format = '%d.%m.%Y %H:%M')

# grouping by 5 min windows:
adjustedData = originalData.set_index('Time').resample("5Min").mean()

# using rolling mean for specific columns
mean_cols = ['Space Heating [W]', 'Electricity [W]','PV [nominal]', 'DHW [W]']
adjustedData = originalData.set_index('Time').assign(**{col: lambda dataf: dataf[col].rolling('5Min').mean() for col in mean_cols})

#Write new dataframe to csv 
adjustedData.to_csv('C:/Users/Desktop/Test_5mintues.csv', sep =";")