I have two different data frames.
df1=
index Time
0 2009-09-13 01:17:00
1 2009-09-13 02:17:00
2 2009-09-13 03:17:00
3 2009-09-13 04:17:00
4 2009-09-13 05:17:00
............
This is a time-series data with hourly interval spanning in total 10 days. and
df2=
Report Time x
0 2009-09-13 01:17:00 1
1 2009-09-13 02:20:00 27
2 2009-09-13 02:25:00 1
3 2009-09-13 05:33:00 100
..............
I want to iterate the "df1" hourly time data through "df2" and do the summation of "x" to its hourly value. Then The hourly data will store the x value.
The output should look like this:
index Time x
0 2009-09-13 01:17:00 1
1 2009-09-13 02:17:00 28
2 2009-09-13 03:17:00 0
3 2009-09-13 04:17:00 0
4 2009-09-13 05:17:00 100
Note that all of the x value should be added together within the same hour for the dataset "df1".
I have tried to implement for loop but couldn't complete it.
CodePudding user response:
Use DatetimeIndex.floor
to get same time between the 2 dataframes:
# Ensure the columns have datetime64 dtype
df1['Time'] = pd.to_datetime(df1['Time'])
df2['Report Time'] = pd.to_datetime(df2['Report Time'])
out = (
df1.merge(df2.groupby(df2['Report Time'].dt.floor('H')).sum(),
left_on=df1['Time'].dt.floor('H'), right_index=True, how='left')
.fillna(0)
)
Output:
>>> out
Time x
0 2009-09-13 01:17:00 1.0
1 2009-09-13 02:17:00 28.0
2 2009-09-13 03:17:00 0.0
3 2009-09-13 04:17:00 0.0
4 2009-09-13 05:17:00 100.0
Details:
>>> df2.groupby(df2['Report Time'].dt.floor('H')).sum()
x
Report Time
2009-09-13 01:00:00 1
2009-09-13 02:00:00 28
2009-09-13 05:00:00 100
>>> df1['Time'].dt.floor('H')
0 2009-09-13 01:00:00
1 2009-09-13 02:00:00
2 2009-09-13 03:00:00
3 2009-09-13 04:00:00
4 2009-09-13 05:00:00
Name: Time, dtype: datetime64[ns]
CodePudding user response:
You need to perform a merge_asof
:
df1['Time'] = pd.to_datetime(df1['Time'])
df2['Report Time'] = pd.to_datetime(df2['Report Time'])
(pd
.merge_asof(df2, df1, left_on='Report Time', right_on='Time')
.groupby('Time')
.agg({'x': 'sum'})
.reindex(df1['Time'], fill_value=0)
.reset_index()
)
Output:
Time x
0 2009-09-13 01:17:00 1
1 2009-09-13 02:17:00 28
2 2009-09-13 03:17:00 0
3 2009-09-13 04:17:00 0
4 2009-09-13 05:17:00 100