I have 2 dataframes I wish to merge:
df1 looks like this:
Date Col1 Col 2 Col 3 Col 4
2016-03 27.57 0.93 28.7 1.57
2016-04 25.83 0.23 28.34 0.84
2016-05 24.55 0.27 27.11 0.03
df2 looks like this:
Date ColA
2016-03-21 7.640769230769231
2016-03-22 7.739720279720279
2016-03-23 7.577311827956988
2016-03-24 7.745416666666666
As you can see, df1 is monthly data and df2 is daily data. I want to merge them at daily frequency (following df2), but with df1 lagged (lag = -30).
This is my desired output:
Output:
Date ColA Col1 Col 2 Col 3 Col 4
2016-03-21 7.640769230769231 25.83 0.23 28.34 0.84
2016-03-22 7.739720279720279 25.83 0.23 28.34 0.84
2016-03-23 7.577311827956988 25.83 0.23 28.34 0.84
2016-03-24 7.745416666666666 25.83 0.23 28.34 0.84
....2016-04-01 xxxxxxxx 24.55 0.27 27.11 0.03
I tried this, but the dataframes were simply merged and the lag was not applied:
out = df2.merge(df1.shift(-30), on='Date')
CodePudding user response:
IIUC, you can use:
df1['Date'] = pd.to_datetime(df1['Date'])
df2['Date'] = pd.to_datetime(df2['Date'])
out = (df2.merge(df1.assign(Date=df1['Date'].sub(pd.DateOffset(months=1))
                                            .dt.to_period('M')),
                 left_on=df2['Date'].dt.to_period('M'), right_on='Date',
                 how='left', suffixes=(None, '_df1'))
      )
or:
out = (df2.merge(df1.assign(Date=df1['Date'].dt.to_period('M').sub(1)),
                 left_on=df2['Date'].dt.to_period('M'), right_on='Date',
                 how='left', suffixes=(None, '_df1'))
      )
output:
Date ColA Date_df1 Col1 Col 2 Col 3 Col 4
0 2016-03-21 7.640769 2016-03 25.83 0.23 28.34 0.84
1 2016-03-22 7.739720 2016-03 25.83 0.23 28.34 0.84
2 2016-03-23 7.577312 2016-03 25.83 0.23 28.34 0.84
3 2016-03-24 7.745417 2016-03 25.83 0.23 28.34 0.84
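For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea using the sample rows from the question. It is a variation rather than the exact code above: it merges on an explicit helper column (called month here, a name picked only for illustration) and drops it afterwards, which avoids the extra Date_df1 column. The variable names monthly and daily are also just assumptions for this sketch.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "Date": ["2016-03", "2016-04", "2016-05"],
    "Col1": [27.57, 25.83, 24.55],
    "Col 2": [0.93, 0.23, 0.27],
    "Col 3": [28.7, 28.34, 27.11],
    "Col 4": [1.57, 0.84, 0.03],
})
df2 = pd.DataFrame({
    "Date": ["2016-03-21", "2016-03-22", "2016-03-23", "2016-03-24"],
    "ColA": [7.640769230769231, 7.739720279720279,
             7.577311827956988, 7.745416666666666],
})

df1["Date"] = pd.to_datetime(df1["Date"], format="%Y-%m")
df2["Date"] = pd.to_datetime(df2["Date"])

# Monthly frame: relabel each row with the previous month,
# so the 2016-04 values are keyed as 2016-03 ("month" is a hypothetical helper column)
monthly = df1.drop(columns="Date").assign(
    month=df1["Date"].dt.to_period("M") - 1
)

# Daily frame: key each day by the calendar month it falls in
daily = df2.assign(month=df2["Date"].dt.to_period("M"))

# Left merge keeps every daily row, then the helper key is dropped
out = daily.merge(monthly, on="month", how="left").drop(columns="month")
print(out)
```

With how='left', any daily row whose month has no shifted monthly counterpart keeps its ColA value and gets NaN in the df1 columns.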
CodePudding user response:
If a lag of 30 means a shift of one month, you can convert the dates to monthly periods, subtract 1, and then merge with merge_asof:
df1['Date'] = pd.to_datetime(df1['Date']).dt.to_period('M').sub(1).dt.to_timestamp()
df2['Date'] = pd.to_datetime(df2['Date'])
df = pd.merge_asof(df2, df1, on='Date')
print(df)
Date ColA Col1 Col 2 Col 3 Col 4
0 2016-03-21 7.640769 25.83 0.23 28.34 0.84
1 2016-03-22 7.739720 25.83 0.23 28.34 0.84
2 2016-03-23 7.577312 25.83 0.23 28.34 0.84
3 2016-03-24 7.745417 25.83 0.23 28.34 0.84
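A runnable sketch of the merge_asof approach, in case it helps to see it with data. It uses the monthly rows from the question plus a 2016-04-01 daily row to show the rollover into the next month. The ColA value for 2016-04-01 is not given in the question, so NaN is used as a placeholder. The explicit astype calls and the sort_values calls are precautions I added (merge_asof expects sorted keys of matching dtype), not part of the code above.

```python
import pandas as pd

# Monthly data from the question (only Col1 and Col 4 kept for brevity)
df1 = pd.DataFrame({
    "Date": ["2016-03", "2016-04", "2016-05"],
    "Col1": [27.57, 25.83, 24.55],
    "Col 4": [1.57, 0.84, 0.03],
})
# Two daily rows from the question plus 2016-04-01;
# its ColA is unknown in the question, so NaN stands in for it
df2 = pd.DataFrame({
    "Date": ["2016-03-21", "2016-03-24", "2016-04-01"],
    "ColA": [7.640769230769231, 7.745416666666666, float("nan")],
})

# Shift each monthly row back one month and use the month start as its timestamp
df1["Date"] = (pd.to_datetime(df1["Date"], format="%Y-%m")
                 .dt.to_period("M").sub(1)
                 .dt.to_timestamp()
                 .astype("datetime64[ns]"))
df2["Date"] = pd.to_datetime(df2["Date"]).astype("datetime64[ns]")

# merge_asof needs both frames sorted by the key; by default it searches backward,
# so each daily row takes the latest monthly row whose Date is <= the daily Date
df = pd.merge_asof(df2.sort_values("Date"), df1.sort_values("Date"), on="Date")
print(df)
```

Because the match is "latest key not after the daily date", no separate month column is needed: every day in March picks up the row stamped 2016-03-01, and 2016-04-01 picks up the row stamped 2016-04-01.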