I have two dataframes which are:
            Value
Date
2010-06-29      3
2010-06-30      1
2010-07-01      5
2010-07-02      4
2010-07-03      9
2010-07-04      7
2010-07-05      2
2010-07-06      3

            Value
Date
2010-06-29      6
2010-07-03      1
2010-07-06      4
The first dataframe could be created with the Python code:
import pandas as pd
df = pd.DataFrame(
{
'Date': ['2010-06-29', '2010-06-30', '2010-07-01', '2010-07-02', '2010-07-03', '2010-07-04', '2010-07-05', '2010-07-06'],
'Value': [3, 1, 5, 4, 9, 7, 2, 3]
}
)
df['Date'] = pd.to_datetime(df['Date']).dt.date
df = df.set_index('Date')
and the second dataframe:
df2 = pd.DataFrame(
{
'Date': ['2010-06-29', '2010-07-03', '2010-07-06'],
'Value': [6, 1, 4]
}
)
df2['Date'] = pd.to_datetime(df2['Date']).dt.date
df2 = df2.set_index('Date')
I want to add a second column to the first dataframe. For each Date in the first dataframe, the new column should hold the value from the second dataframe at the latest Date that is equal to or earlier than that Date.
So, the output is:
            Value  Value_2
Date
2010-06-29      3        6
2010-06-30      1        6
2010-07-01      5        6
2010-07-02      4        6
2010-07-03      9        1
2010-07-04      7        1
2010-07-05      2        1
2010-07-06      3        4
Also, I would strongly prefer a solution that does not use any for-loops.
How can I do this?
CodePudding user response:
pd.merge_asof should suffice for this:
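# merge_asof needs numeric or datetime-like keys, so convert the datetime.date index first
df.index = pd.to_datetime(df.index)
df2.index = pd.to_datetime(df2.index)
# the default direction='backward' picks the last df2 row whose Date is <= each Date in df
pd.merge_asof(df, df2, on='Date')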
        Date  Value_x  Value_y
0 2010-06-29        3        6
1 2010-06-30        1        6
2 2010-07-01        5        6
3 2010-07-02        4        6
4 2010-07-03        9        1
5 2010-07-04        7        1
6 2010-07-05        2        1
7 2010-07-06        3        4
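If you want the exact layout from the question (Date kept as the index, columns named Value and Value_2), one possible variation is to join on the indexes and pass suffixes. This is only a sketch under a few assumptions: both indexes are sorted DatetimeIndexes, and the names result and alt are just illustrative. The reindex line at the end is a different technique (a forward fill of df2 onto df's dates), included only as a cross-check.

```python
import pandas as pd

# Rebuild the two frames from the question with a real DatetimeIndex.
# (Assumption: the index is datetime64, not datetime.date objects.)
df = pd.DataFrame(
    {"Value": [3, 1, 5, 4, 9, 7, 2, 3]},
    index=pd.to_datetime(
        ["2010-06-29", "2010-06-30", "2010-07-01", "2010-07-02",
         "2010-07-03", "2010-07-04", "2010-07-05", "2010-07-06"]
    ).rename("Date"),
)
df2 = pd.DataFrame(
    {"Value": [6, 1, 4]},
    index=pd.to_datetime(["2010-06-29", "2010-07-03", "2010-07-06"]).rename("Date"),
)

# Join on both indexes; direction defaults to "backward", i.e. the last
# df2 row whose Date is equal to or earlier than each Date in df.
# suffixes=("", "_2") leaves the left column as "Value" and names the right one "Value_2".
result = pd.merge_asof(
    df, df2, left_index=True, right_index=True, suffixes=("", "_2")
)

# Cross-check with a forward-filled reindex of df2 onto df's dates.
alt = df2["Value"].reindex(df.index, method="ffill")
assert result["Value_2"].tolist() == alt.tolist()

print(result)
```

Neither version uses a for-loop. Both rely on the dates being sorted in ascending order.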