My DataFrame looks like this:
timestamp power
0 2022-01-01 00:00:00 100.000000
1 2022-01-01 00:00:01 100.004526
2 2022-01-01 00:00:02 100.009053
3 2022-01-01 00:00:03 100.013579
4 2022-01-01 00:00:04 100.018105
... ... ...
31535995 2022-12-31 23:59:55 136.750000
31535996 2022-12-31 23:59:56 136.560000
31535997 2022-12-31 23:59:57 136.440000
31535998 2022-12-31 23:59:58 136.380000
31535999 2022-12-31 23:59:59 136.530000
[31536000 rows x 2 columns]
I have a super simple script:
import pandas as pd

directory = 'data/peak_shaving/20220803_132445'
df = pd.read_csv(f'{directory}/demand_profile_simulation.csv')
df['timestamp'] = pd.to_datetime(df['timestamp'])
df = df.groupby(pd.PeriodIndex(df['timestamp'], freq="15min"))['power'].mean()
The result is:
timestamp
2022-01-01 00:00 100.133526
2022-01-01 00:01 100.405105
2022-01-01 00:02 100.676684
2022-01-01 00:03 100.948263
2022-01-01 00:04 101.219842
...
2022-12-31 23:55 153.952833
2022-12-31 23:56 150.040333
2022-12-31 23:57 146.124167
2022-12-31 23:58 142.225833
2022-12-31 23:59 138.318167
Freq: 15T, Name: power, Length: 525600, dtype: float64
As you can see, it is grouped by minute, not by 15-minute intervals.
When I try another frequency, such as one day, it works perfectly:
2022-01-01 120.291041
2022-01-02 126.085428
2022-01-03 120.840020
2022-01-04 124.335800
2022-01-05 119.230694
...
2022-12-27 125.802254
2022-12-28 123.833951
2022-12-29 126.609810
2022-12-30 123.971885
2022-12-31 122.798069
Freq: D, Name: power, Length: 365, dtype: float64
I also tested hours and many other frequencies and they work, but I cannot make it work for 15-minute intervals. Is there an issue in my code? Thanks
CodePudding user response:
Your solution works correctly for me. Here is an alternative with Series.dt.to_period:
df = pd.read_csv(f'{directory}/demand_profile_simulation.csv', parse_dates=['timestamp'])
df = df.groupby(df['timestamp'].dt.to_period('15Min'))['power'].mean()
Other solutions:
df = pd.read_csv(f'{directory}/demand_profile_simulation.csv', parse_dates=['timestamp'])
df = df.groupby(pd.Grouper(key='timestamp', freq="15min"))['power'].mean()
# alternative with resample:
# df = df.resample("15min", on='timestamp')['power'].mean()
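To check the Grouper and resample variants without the original CSV, here is a minimal sketch on synthetic data. The `demo` frame and its values are made up for illustration (one hour of per-second readings, with `power` simply counting up from 0), so only the shape of the result carries over to the real file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: one hour of per-second readings.
timestamps = pd.Timestamp("2022-01-01") + pd.to_timedelta(np.arange(3600), unit="s")
demo = pd.DataFrame({"timestamp": timestamps, "power": np.arange(3600, dtype=float)})

# Both calls bin rows into fixed 15-minute windows anchored on the clock
# (00:00, 00:15, 00:30, 00:45) and average 'power' within each window.
by_grouper = demo.groupby(pd.Grouper(key="timestamp", freq="15min"))["power"].mean()
by_resample = demo.resample("15min", on="timestamp")["power"].mean()

print(by_grouper)
```

One hour of data should give four rows, one per 15-minute window, and both variants should agree.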
CodePudding user response:
Have a look at the pandas.date_range documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.date_range.html
I think it may help.
For example:
# pandas >= 1.4 uses inclusive='left'; older versions use closed='left'
pd.Series(pd.date_range(
    '1/1/2020', '1/2/2020', freq='15min', inclusive='left')).dt.time
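One way to use date_range for this problem is as a sanity check on the grouped result. The sketch below builds the 15-minute bin starts you would expect for 2022 and counts them. The name `expected_bins` is just illustrative. The output in the question has Length: 525600, which is 365 * 24 * 60, i.e. one group per minute, whereas a true 15-minute grouping should have far fewer rows.

```python
import pandas as pd

# Expected left edges of every 15-minute bin in 2022.
# Slicing off the last element drops the 2023-01-01 00:00 endpoint, which
# avoids depending on the closed=/inclusive= keyword (it differs by pandas version).
expected_bins = pd.date_range("2022-01-01", "2023-01-01", freq="15min")[:-1]

print(len(expected_bins))  # 35040 = 365 days * 96 intervals per day
print(expected_bins[:3])
```

If the grouped Series has 35040 rows and its index steps in 15-minute increments, the binning is doing what was intended.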