I have the following dataframes:
assets = pd.DataFrame(columns = ['date','asset'], data = [[datetime.date(2022,10,21),'SPY'],[datetime.date(2022,10,21),'FTSE'], [datetime.date(2022,11,12),'SPY'],[datetime.date(2022,11,12),'FTSE']])
prices = pd.DataFrame(columns = ['date','asset', 'price'], data = [[datetime.date(2022,10,11),'SPY',10],[datetime.date(2022,10,11),'FTSE',5],[datetime.date(2022,11,8),'SPY',100],[datetime.date(2022,11,8),'FTSE',50]]
for each asset
I want to get the as-of price
(at the nearest date
). How to do so?
If I have only one asset it is easy:
assets_spy = assets#.loc[assets['asset']=='SPY']
prices_spy = prices#.loc[prices['asset']=='SPY']
assets_spy.index = pd.to_datetime(assets_spy['date'])
prices_spy.index = pd.to_datetime(prices_spy['date'])
merged = pd.merge_asof(assets_spy.sort_index(),
prices_spy.sort_index(),
direction='nearest',right_index=True,left_index=True)
but if I follow the same logic for multiple assets, it won't match.
CodePudding user response:
The function pandas.merge_asof
has an optional parameter named by
that you can use for matching using a column or list of columns before performing merge operation. Therefore, you could adapt your code like this:
import pandas as pd
import datetime
assets = pd.DataFrame(columns = ['date', 'asset'],
data = [[datetime.date(2022, 10, 21), 'SPY'],
[datetime.date(2022, 10, 21), 'FTSE'],
[datetime.date(2022, 11, 12), 'SPY'],
[datetime.date(2022, 11, 12), 'FTSE']])
prices = pd.DataFrame(columns = ['date', 'asset', 'price'],
data = [[datetime.date(2022, 10, 11), 'SPY', 10],
[datetime.date(2022, 10, 11), 'FTSE', 5],
[datetime.date(2022, 11, 8), 'SPY', 100],
[datetime.date(2022, 11, 8), 'FTSE', 50]])
merged = pd.merge_asof(
assets.astype({'date': 'datetime64[ns]'}).sort_values('date').convert_dtypes(),
prices.astype({'date': 'datetime64[ns]'}).sort_values('date').convert_dtypes(),
direction = 'nearest',
on = 'date',
by = 'asset',
)
merged
# Returns:
#
# date asset price
# 0 2022-10-21 SPY 10 <-- Merged using SPY price from '2022-10-11'
# 1 2022-10-21 FTSE 5 <-- Merged using FTSE price from '2022-10-11'
# 2 2022-11-12 SPY 100 <-- Merged using SPY price from '2022-11-08'
# 3 2022-11-12 FTSE 50 <-- Merged using FTSE price from '2022-11-08'
Output Screenshot: