Home > Back-end >  Getting the average value of each hour for specific columns in data frame
Getting the average value of each hour for specific columns in data frame

Time:05-14

I have a data frame with the date/time passed as "parse_dates" and then set as the index column for the data frame.

                      Flow  Enter  Leave    kWE          kW
Time                                                       
2021-08-30 08:50:00  24.22   14.0    5.7  66.56  619.478012
2021-08-30 08:51:00  24.22   14.0    5.7  66.56  619.478012

I want to get the average values of columns 'kWE' and 'Flow' per hour and day though i'm unsure of how to extract the data when the date/time is set as the index.

CodePudding user response:

You can use resample with hourly (h) frequency:

cols = ['Flow', 'kWE']
df.resample('h')[cols].mean()

output:

                      Flow    kWE
Time                             
2021-08-30 08:00:00  24.22  66.56

Or using origin='start' to use the first value as starting point.

Note the different index. Here the result is the same but this might give a different output on larger data depending on how rows will cluster:

df.resample('h', origin='start')[cols].mean()

output:

                      Flow    kWE
Time                             
2021-08-30 08:50:00  24.22  66.56

CodePudding user response:

Use:

 #Preparing data

string = """Time  Flow  Enter  Leave  kWE  kW
2021-08-30 08:50:00  24.22  14.0  5.7  66.56  619.478012
2021-08-30 08:51:00  24.22  14.0  5.7  66.56  619.478012"""

data = [x.split('  ') for x in string.split('\n')]
df = pd.DataFrame(data[1:], columns = data[0])
df['Time']=pd.to_datetime(df['Time'])
df = df.set_index('Time')

#Solution, daily and hourly averages:
df.groupby(df.index.day)[['kWE', 'Flow']].mean()
df.groupby(df.index.hour)[['kWE', 'Flow']].mean()
  • Related