I have the following two DataFrames:
import pandas as pd
lst=[['2021-01-01','A'],['2021-01-01','B'],['2021-02-01','A'],['2021-02-01','B'],['2021-03-01','A'],['2021-03-01','B']]
df1=pd.DataFrame(lst,columns=['Date','Pf'])
lst=[['2021-02-01','A','New']]
df22=pd.DataFrame(lst,columns=['Date','Pf','Status'])
I would like to merge them in order to obtain the df below:
lst=[['2021-01-01','A','NaN'],['2021-01-01','B','NaN'],['2021-02-01','A','New'],['2021-02-01','B','NaN'],['2021-03-01','A','New'],['2021-03-01','B','NaN']]
df3=pd.DataFrame(lst,columns=['Date','Pf','Status'])
For the date 2021-02-01 alone, a plain merge on Date and Pf would work. However, I would like the status "New" to be carried forward: once a Pf from df22 appears, every row for that Pf with a date equal to or later than 2021-02-01 should get the same status.
Do you have any idea how I could solve this? Thank you for your help.
CodePudding user response:
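Use merge_asof with the default direction='backward'. For each row of df1 it picks the most recent row of df22 with the same Pf whose Date is less than or equal to that row's Date. Both DataFrames must be sorted by Date: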
df1['Date'] = pd.to_datetime(df1['Date'])
df22['Date'] = pd.to_datetime(df22['Date'])
df = pd.merge_asof(df1, df22, on='Date', by='Pf')
print(df)
        Date Pf Status
0 2021-01-01  A    NaN
1 2021-01-01  B    NaN
2 2021-02-01  A    New
3 2021-02-01  B    NaN
4 2021-03-01  A    New
5 2021-03-01  B    NaN
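For convenience, here is the same approach as one self-contained script that can be pasted and run. It is only a sketch built from the sample data in the question. The explicit sort_values calls are my addition: the sample data is already sorted by Date, but merge_asof raises an error if either frame is not, so sorting first is a safe habit.

```python
import pandas as pd

# Sample data from the question
df1 = pd.DataFrame(
    [['2021-01-01', 'A'], ['2021-01-01', 'B'],
     ['2021-02-01', 'A'], ['2021-02-01', 'B'],
     ['2021-03-01', 'A'], ['2021-03-01', 'B']],
    columns=['Date', 'Pf'],
)
df22 = pd.DataFrame([['2021-02-01', 'A', 'New']],
                    columns=['Date', 'Pf', 'Status'])

# merge_asof needs a numeric or datetime key, not strings
df1['Date'] = pd.to_datetime(df1['Date'])
df22['Date'] = pd.to_datetime(df22['Date'])

# Added as a precaution (assumption: real data may arrive unsorted);
# a stable sort keeps the original A/B order within each date
df1 = df1.sort_values('Date', kind='stable').reset_index(drop=True)
df22 = df22.sort_values('Date', kind='stable').reset_index(drop=True)

# For each df1 row, take the latest df22 row with the same Pf
# and Date <= the df1 row's Date (direction='backward' is the default)
df = pd.merge_asof(df1, df22, on='Date', by='Pf')
print(df)
```

Rows for Pf A dated before 2021-02-01 stay NaN because there is no earlier match in df22, and Pf B stays NaN throughout because it never appears in df22.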