How can I transform this data so that the Pm 2.5 and Pm 10 columns hold the average for the whole day? My data (sample below) has one reading every 15 minutes.
   Pm 2.5  Pm 10        Created At
0    6.00  19.20  2021-06-21 19:00
1    4.70  17.00  2021-06-21 19:15
2    4.80  16.70  2021-06-21 19:30
3    5.10  12.10  2021-06-21 19:45
4    7.90  19.10  2021-06-21 20:00
CodePudding user response:
Let's resample the dataframe:
df['Created At'] = pd.to_datetime(df['Created At'])
df.resample('D', on='Created At').mean()
            Pm 2.5  Pm 10
Created At
2021-06-21     5.7  16.82
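For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the five sample rows from the question and applies the same resample call. The variable name daily is only an illustration and is not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Pm 2.5": [6.00, 4.70, 4.80, 5.10, 7.90],
    "Pm 10": [19.20, 17.00, 16.70, 12.10, 19.10],
    "Created At": [
        "2021-06-21 19:00", "2021-06-21 19:15", "2021-06-21 19:30",
        "2021-06-21 19:45", "2021-06-21 20:00",
    ],
})
df["Created At"] = pd.to_datetime(df["Created At"])

# One row per calendar day, holding the mean of that day's readings.
# "daily" is an illustrative name, not from the original answer.
daily = df.resample("D", on="Created At").mean()
print(daily)
```

Note that resample emits a row for every day between the first and last timestamp, so days without any readings show up as NaN; calling daily.dropna(how="all") removes them if that is not wanted.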
CodePudding user response:
You can use pd.Grouper and then transform if you want to preserve the dataframe shape:
df['Created At'] = pd.to_datetime(df['Created At'])
df[['Pm 2.5', 'Pm 10']] = df.groupby(pd.Grouper(key='Created At', freq='D'))\
[['Pm 2.5', 'Pm 10']].transform('mean')
Output:
   Pm 2.5  Pm 10          Created At
0     5.7  16.82 2021-06-21 19:00:00
1     5.7  16.82 2021-06-21 19:15:00
2     5.7  16.82 2021-06-21 19:30:00
3     5.7  16.82 2021-06-21 19:45:00
4     5.7  16.82 2021-06-21 20:00:00
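The assignment above overwrites the 15-minute readings. If you would rather keep them, one possible variant (a sketch, not the answer's exact method) is to join the daily means as extra columns. The " daily mean" suffix and the names daily_mean and result are made up for this example.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Pm 2.5": [6.00, 4.70, 4.80, 5.10, 7.90],
    "Pm 10": [19.20, 17.00, 16.70, 12.10, 19.10],
    "Created At": [
        "2021-06-21 19:00", "2021-06-21 19:15", "2021-06-21 19:30",
        "2021-06-21 19:45", "2021-06-21 20:00",
    ],
})
df["Created At"] = pd.to_datetime(df["Created At"])

cols = ["Pm 2.5", "Pm 10"]

# transform returns a frame with the same index as df, so every row
# receives the mean of the day it belongs to.
daily_mean = df.groupby(pd.Grouper(key="Created At", freq="D"))[cols].transform("mean")

# Hypothetical column suffix; the original readings stay untouched.
result = df.join(daily_mean.add_suffix(" daily mean"))
print(result)
```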
CodePudding user response:
Here is one way to do it: convert the column with to_datetime, take the date part, and compute the mean of the two Pm columns per date.
df.groupby(pd.to_datetime(df['Created At']).dt.date)[['Pm 2.5', 'Pm 10']].mean().reset_index()
   Created At  Pm 2.5  Pm 10
0  2021-06-21     5.7  16.82
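One thing to be aware of: .dt.date produces plain Python date objects, so the resulting Created At column has object dtype. If you need it to stay a datetime column (for plotting or further resampling, say), a possible alternative is .dt.normalize(), which sets the time to midnight but keeps the datetime64 dtype. This is a sketch of that variant, not the approach from the answer above.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Pm 2.5": [6.00, 4.70, 4.80, 5.10, 7.90],
    "Pm 10": [19.20, 17.00, 16.70, 12.10, 19.10],
    "Created At": [
        "2021-06-21 19:00", "2021-06-21 19:15", "2021-06-21 19:30",
        "2021-06-21 19:45", "2021-06-21 20:00",
    ],
})
df["Created At"] = pd.to_datetime(df["Created At"])

# Group on midnight-normalized timestamps so the key stays datetime64.
daily = (
    df.groupby(df["Created At"].dt.normalize())[["Pm 2.5", "Pm 10"]]
      .mean()
      .reset_index()
)
print(daily.dtypes)
```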