Merge dataframes by dates

Time:03-04

I have the following two DataFrames:

import pandas as pd

lst=[['2021-01-01','A'],['2021-01-01','B'],['2021-02-01','A'],['2021-02-01','B'],['2021-03-01','A'],['2021-03-01','B']]
df1=pd.DataFrame(lst,columns=['Date','Pf'])

lst=[['2021-02-01','A','New']]
df22=pd.DataFrame(lst,columns=['Date','Pf','Status'])

I would like to merge them in order to obtain the df below:

lst=[['2021-01-01','A','NaN'],['2021-01-01','B','NaN'],['2021-02-01','A','New'],['2021-02-01','B','NaN'],['2021-03-01','A','New'],['2021-03-01','B','NaN']]
df3=pd.DataFrame(lst,columns=['Date','Pf','Status'])

For the date 2021-02-01 a plain merge would work. However, I would like the status "New" to carry forward: as soon as a Pf from df22 appears, every row of df1 with that Pf and a date equal to or later than 2021-02-01 should get the same status.
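To illustrate the gap, here is a minimal sketch of the plain merge, assuming a left join on both Date and Pf (the variable name exact is only for illustration). It fills the status for the exact date only, so the 2021-03-01 row for A stays empty:

```python
import pandas as pd

df1 = pd.DataFrame(
    [['2021-01-01', 'A'], ['2021-01-01', 'B'],
     ['2021-02-01', 'A'], ['2021-02-01', 'B'],
     ['2021-03-01', 'A'], ['2021-03-01', 'B']],
    columns=['Date', 'Pf'],
)
df22 = pd.DataFrame([['2021-02-01', 'A', 'New']],
                    columns=['Date', 'Pf', 'Status'])

# Exact-match left join: a row gets a Status only if both Date and Pf match
exact = df1.merge(df22, on=['Date', 'Pf'], how='left')
print(exact)
```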

Do you have any idea how I could solve this? Thank you for your help.

CodePudding user response:

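Use merge_asof with the default direction='backward'. For each row of df1 it takes the last row of df22 with the same Pf whose Date is equal to or earlier than the row's Date. Both DataFrames must be sorted by Date: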

df1['Date'] = pd.to_datetime(df1['Date'])
df22['Date'] = pd.to_datetime(df22['Date'])

df = pd.merge_asof(df1, df22, on='Date', by='Pf')
print(df)
        Date Pf Status
0 2021-01-01  A    NaN
1 2021-01-01  B    NaN
2 2021-02-01  A    New
3 2021-02-01  B    NaN
4 2021-03-01  A    New
5 2021-03-01  B    NaN
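
For completeness, a self-contained sketch of the same approach that can be run as is. The explicit sort_values calls are an addition, not part of the answer above: merge_asof raises an error when the on column is not sorted, so sorting first is a cheap safeguard if the real data is not already ordered by date. The names left, right and result are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame(
    [['2021-01-01', 'A'], ['2021-01-01', 'B'],
     ['2021-02-01', 'A'], ['2021-02-01', 'B'],
     ['2021-03-01', 'A'], ['2021-03-01', 'B']],
    columns=['Date', 'Pf'],
)
df22 = pd.DataFrame([['2021-02-01', 'A', 'New']],
                    columns=['Date', 'Pf', 'Status'])

# merge_asof needs real datetimes (or numbers) in the `on` column, not strings
df1['Date'] = pd.to_datetime(df1['Date'])
df22['Date'] = pd.to_datetime(df22['Date'])

# Assumption: the input may be unsorted, so sort both frames by the `on` key.
# kind='stable' keeps the original order of rows that share a date.
left = df1.sort_values('Date', kind='stable')
right = df22.sort_values('Date', kind='stable')

# direction='backward' is the default; it is spelled out here for clarity.
# by='Pf' restricts matches to rows with the same Pf.
result = pd.merge_asof(left, right, on='Date', by='Pf', direction='backward')
print(result)
```

If df22 later contains several status rows for the same Pf, each row of df1 picks up the most recent one dated on or before it.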