I have two dataframes.
df1=pandas.DataFrame([1,2,3],index=['2020-12-31 23:00:00','2021-01-01 00:00:00','2021-01-01 01:00:00'])
df2=pandas.DataFrame([['2020-01-01 00:00:00','2021-01-01 00:00:00',70],['2021-01-01 00:00:00','2022-01-01 00:00:00',65]],columns=['From','To','Value'])
I then want to get this:
pandas.DataFrame([[1,70],[2,65],[3,65]],index=['2020-12-31 23:00:00','2021-01-01 00:00:00','2021-01-01 01:00:00'])
In other words for all dates in df1.index in want to get the 'Value' from df2 based on the condition that df1.index>=df2[From] & df1.index<df2[To]
Thanks
CodePudding user response:
Use pd.IntervalIndex
import pandas as pd
df1=pd.DataFrame([1,2,3],index=['2020-12-31 23:00:00','2021-01-01
00:00:00','2021-01-01 01:00:00'])
df2=pd.DataFrame([['2020-01-01 00:00:00','2021-01-01 00:00:00',70],['2021-01-01 00:00:00','2022-01-01 00:00:00',65]],columns=['From','To','Value'])
df1.index = pd.to_datetime(df1.index)
df2.From = pd.to_datetime(df2.From)
df2.To = pd.to_datetime(df2.To)
df2.index = pd.IntervalIndex.from_arrays(df2.From,df2.To,"left")
df1[1] = df2.loc[df1.index].Value.to_numpy()
df1
Output:
0 1
2020-12-31 23:00:00 1 70
2021-01-01 00:00:00 2 65
2021-01-01 01:00:00 3 65