I have a pandas dataframe of energy demand vs. time:
0 1
0 20201231T23-07 39815
1 20201231T22-07 41387
2 20201231T21-07 42798
3 20201231T20-07 44407
4 20201231T19-07 45612
5 20201231T18-07 44920
6 20201231T17-07 42617
7 20201231T16-07 41454
8 20201231T15-07 41371
9 20201231T14-07 41793
10 20201231T13-07 42298
11 20201231T12-07 42740
12 20201231T11-07 43185
13 20201231T10-07 42999
14 20201231T09-07 42373
15 20201231T08-07 41273
16 20201231T07-07 38909
17 20201231T06-07 37099
18 20201231T05-07 36022
19 20201231T04-07 35880
20 20201231T03-07 36305
21 20201231T02-07 36988
22 20201231T01-07 38166
23 20201231T00-07 40167
24 20201230T23-07 42624
25 20201230T22-07 44777
26 20201230T21-07 46205
27 20201230T20-07 47324
28 20201230T19-07 48011
29 20201230T18-07 46995
30 20201230T17-07 44902
31 20201230T16-07 44134
32 20201230T15-07 44228
33 20201230T14-07 44813
34 20201230T13-07 45187
35 20201230T12-07 45622
36 20201230T11-07 45831
37 20201230T10-07 45832
38 20201230T09-07 45476
39 20201230T08-07 44145
40 20201230T07-07 41650
I need to convert the time column into hourly data. I know that Python has some tools that can convert dates directly. Is there one I could use here, or will I need to do it manually?
CodePudding user response:
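Well, just to obtain a time string you could use str.replace (recent pandas versions need regex=True for the pattern to be treated as a regular expression):
df["time"] = df["0"].str.replace(r'^\d{8}T(\d{2})-(\d{2})$', r'\1:\2', regex=True)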
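As a rough, self-contained sketch of that replacement, here it is applied to three values copied from the question. The column labels "0" and "1" are an assumption about how the frame was loaded; if the labels are integers, use df[0] instead.

```python
import pandas as pd

# Three sample rows from the question. The string labels "0" and "1"
# are assumed; the real frame may use integer labels instead.
df = pd.DataFrame({
    "0": ["20201231T23-07", "20201231T22-07", "20201231T21-07"],
    "1": [39815, 41387, 42798],
})

# Capture the two digits after "T" and the two digits after "-",
# then join them with a colon.
df["time"] = df["0"].str.replace(
    r"^\d{8}T(\d{2})-(\d{2})$", r"\1:\2", regex=True
)

print(df["time"].tolist())  # ['23:07', '22:07', '21:07']
```

Note that this puts the "07" suffix in the minutes position. If the suffix is something other than minutes, it may be better to drop it and keep only the hour.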
CodePudding user response:
Assuming the time column is currently a string, you could convert it to a datetime using pd.to_datetime and then extract the hour.
If you want to calculate, say, the average demand for each hour, you could then use groupby:
df['time'] = pd.to_datetime(df['time'], format="%Y%m%dT%H-%M").dt.hour
df_demand_by_hour = df.groupby('time').mean()
print(df_demand_by_hour)
demand
time
0 40167.0
1 38166.0
2 36988.0
3 36305.0
4 35880.0
5 36022.0
6 37099.0
7 40279.5
8 42709.0
9 43924.5
10 44415.5
11 44508.0
12 44181.0
13 43742.5
14 43303.0
15 42799.5
16 42794.0
17 43759.5
18 45957.5
19 46811.5
20 45865.5
21 44501.5
22 43082.0
23 41219.5
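For anyone who wants to reproduce this, here is a minimal sketch on four rows taken from the question (hours 7 and 8 on both days). The column names "time" and "demand" are assumed, as in the snippet above, and this variant drops the "-07" suffix before parsing rather than reading it as minutes.

```python
import pandas as pd

# Four rows from the question: hours 07 and 08 on both days.
# The column names "time" and "demand" are assumptions.
df = pd.DataFrame({
    "time": ["20201231T08-07", "20201231T07-07",
             "20201230T08-07", "20201230T07-07"],
    "demand": [41273, 38909, 44145, 41650],
})

# Keep only "YYYYMMDDTHH" (the first 11 characters) and parse it.
parsed = pd.to_datetime(df["time"].str[:11], format="%Y%m%dT%H")

# Store the hour of day in a separate column so the original string survives.
df["hour"] = parsed.dt.hour

# Average demand per hour of day across both dates.
demand_by_hour = df.groupby("hour")["demand"].mean()
print(demand_by_hour)
```

The two means printed here (40279.5 for hour 7 and 42709.0 for hour 8) match the corresponding rows in the output above.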
CodePudding user response:
I don't know exactly what the -07 means, but you can drop it and turn the rest of the string into a datetime by doing:
import pandas as pd
# keep only "YYYYMMDDTHH" (first 11 characters), parse it, then format as a time string
df['0'] = pd.to_datetime(df['0'].str[:11], format='%Y%m%dT%H').dt.strftime('%H:%M:%S')
df
0 1
0 23:00:00 39815
1 22:00:00 41387
2 21:00:00 42798
3 20:00:00 44407
4 19:00:00 45612
...
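If the -07 turns out to be a UTC offset (that is only a guess, the question does not say), one possible sketch is to pad it to "-0700" so that the %z directive can read it, and then convert to UTC. The series name raw is made up for illustration.

```python
import pandas as pd

# Two sample values from the question.
raw = pd.Series(["20201231T23-07", "20201231T22-07"])

# ASSUMPTION: "-07" is a UTC offset in hours. Appending "00" turns it
# into "-0700", a form that the %z directive accepts.
utc_times = pd.to_datetime(raw + "00", format="%Y%m%dT%H%z", utc=True)

print(utc_times)
```

Under that assumption, 23:00 at UTC-7 on 31 December 2020 becomes 06:00 UTC on 1 January 2021, so the hour of day (and here even the date) shifts after conversion.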