Find most recent date from different dataframe

Time:06-17

I have a data frame (df1) and, for each ID, I want to get the most recent survey_date before its start_date, along with the associated score, from another data frame (df2).


import pandas as pd

df1 = pd.DataFrame({'ID' : [1,2],
                   'start_date':['2018-08-04','2018-08-09']})
df1


df2 = pd.DataFrame({'ID' : [1,1,2,2],
                   'survey_date':['2018-08-01','2018-08-05','2018-08-08','2018-08-10'],
                   'score':[200,100, 400, 800]})
df2 

Desired output:

ID  start_date  prev_survey_date  score
1   2018-08-04  2018-08-01        200
2   2018-08-09  2018-08-08        400

How can I do this in Python?

Answer:

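You can use pd.merge_asof. It needs datetime (or numeric) keys, so convert the date strings first, and both frames must be sorted by their date column (they already are here). By default it matches the last survey_date on or before start_date; pass allow_exact_matches=False if a survey taken on the start date itself should not count.

df1['start_date'] = pd.to_datetime(df1['start_date'])
df2['survey_date'] = pd.to_datetime(df2['survey_date'])

out = pd.merge_asof(df1, df2, by = 'ID', left_on = 'start_date', right_on = 'survey_date')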
Out[366]: 
   ID start_date survey_date  score
0   1 2018-08-04  2018-08-01    200
1   2 2018-08-09  2018-08-08    400
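
For completeness, here is a self-contained sketch of the same approach that also renames the matched column to prev_survey_date, as in the desired output. It assumes "previous" means strictly before start_date (hence allow_exact_matches=False, which is my reading of the question, not something the asker stated), and it sorts both frames explicitly in case real data is not already ordered.

```python
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2],
                    'start_date': ['2018-08-04', '2018-08-09']})
df2 = pd.DataFrame({'ID': [1, 1, 2, 2],
                    'survey_date': ['2018-08-01', '2018-08-05',
                                    '2018-08-08', '2018-08-10'],
                    'score': [200, 100, 400, 800]})

# merge_asof requires datetime (or numeric) keys, not strings
df1['start_date'] = pd.to_datetime(df1['start_date'])
df2['survey_date'] = pd.to_datetime(df2['survey_date'])

out = pd.merge_asof(
    df1.sort_values('start_date'),    # both sides must be sorted by the key
    df2.sort_values('survey_date'),
    by='ID',                          # only match rows with the same ID
    left_on='start_date',
    right_on='survey_date',
    direction='backward',             # look for the latest earlier survey
    allow_exact_matches=False,        # assumption: same-day surveys do not count
).rename(columns={'survey_date': 'prev_survey_date'})

print(out)
```

If an ID has no survey before its start_date, prev_survey_date and score come back as missing values (NaT/NaN) for that row rather than raising an error.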