How can I average the values in input2 based on the overlap between datetime in input2 with input1 datetime window.
input1
2013-01-01 02:00:00 2013-01-02 03:00:00
2013-01-01 04:00:00 2013-01-01 04:30:00
2013-01-02 04:00:00 2013-01-01 04:30:00
input2
2013-01-01 02:01:00 10
2013-01-01 02:02:00 20
2013-01-01 02:03:00 30
2013-01-01 02:04:00 40
2013-01-02 02:04:00 40
Expected output
2013-01-01 02:00:00 2013-01-02 03:00:00 25
CodePudding user response:
NB. I changed a bit your example for clarity.
You can use an IntervalIndex
to map date ranges for groupby.mean
:
idx = pd.IntervalIndex.from_arrays(pd.to_datetime(df1['date1']),
pd.to_datetime(df1['date2']))
mapper = pd.Series(idx, index=idx)
df2.groupby(mapper[pd.to_datetime(df2['date'])].values)['value'].mean()
output:
(2013-01-01 02:00:00, 2013-01-02 03:00:00] 25.0
(2013-01-02 04:00:00, 2013-01-02 04:30:00] 40.0
Name: value, dtype: float64
Used inputs:
# df1
date1 date2
0 2013-01-01 02:00:00 2013-01-02 03:00:00
1 2013-01-01 04:00:00 2013-01-01 04:30:00
2 2013-01-02 04:00:00 2013-01-02 04:30:00
# df2
date value
0 2013-01-01 02:01:00 10
1 2013-01-01 02:02:00 20
2 2013-01-01 02:03:00 30
3 2013-01-01 02:04:00 40
4 2013-01-02 04:04:00 40