I have df_deals as deals with time and prices:
df_deals = pd.DataFrame({'time': ['2022-07-29 13:36:16.059565 00:00',
'2022-07-29 13:36:49.843361 00:00'],
'price': [592, 593]})
df_deals = df_deals.set_index('time')
|time |price |
|----------------------------------|-----------|
|2022-07-29 13:36:16.059565 00:00 |592 |
|2022-07-29 13:36:49.843361 00:00 |593 |
and I have df_depths as depths with 5 seconds interval:
df_depths = pd.DataFrame({'time': ['2022-07-29 13:36:10.059565 00:00',
'2022-07-29 13:36:15.059565 00:00',
'2022-07-29 13:36:45.059565 00:00',
'2022-07-29 13:36:50.059565 00:00'],
'asks': [594, 595, 596, 597],
'bids': [591, 950, 592, 590]})
df_depths = df_depths.set_index('time')
|time |asks |bids|
|-----------------------------------|-------|----|
|2022-07-29 13:36:10.059565 00:00 |594 |591 |
|2022-07-29 13:36:15.059565 00:00 |595 |590 |
|2022-07-29 13:36:45.059565 00:00 |596 |592 |
|2022-07-29 13:36:50.059565 00:00 |597 |590 |
I would like to have as the result next DataFrame, where ask and bid valid values at the time of the transaction:
|time |price | ask | bid|
|-----------------------------------|-------|-----|----|
|2022-07-29 13:36:16.059565 00:00 |592 |595 |590 |
|2022-07-29 13:36:49.843361 00:00 |593 |596 |592 |
I don't have any idea how to do this by pandas.
CodePudding user response:
You can use merge_asof
:
Note: do not set_index('time')
for your dataframes before this:
df_deals['time'] = pd.to_datetime(df_deals['time'])
df_depths['time'] = pd.to_datetime(df_depths['time'])
df_out = pd.merge_asof(df_deals, df_depths, on='time').set_index('time')
output:
price asks bids
time
2022-07-29 13:36:16.059565 00:00 592 595 950
2022-07-29 13:36:49.843361 00:00 593 596 592
CodePudding user response:
Can be done with pandas.merge_asof
:
# dataframe df_deals
df_deals = pd.DataFrame({'time': ['2022-07-29 13:36:16.059565 00:00', '2022-07-29 13:36:49.843361 00:00'], 'price': [592, 593]})
df_deals.time = pd.to_datetime(df_deals.time)
df_deals = df_deals.set_index('time')
# dataframe df_depths
df_depths = pd.DataFrame({'time': ['2022-07-29 13:36:10.059565 00:00', '2022-07-29 13:36:15.059565 00:00', '2022-07-29 13:36:45.059565 00:00', '2022-07-29 13:36:50.059565 00:00'], 'asks': [594, 595, 596, 597], 'bids': [591, 950, 592, 590]})
df_depths.time = pd.to_datetime(df_depths.time)
df_depths = df_depths.set_index('time')
# merge_asof
pd.merge_asof(
df_deals,
df_depths,
left_index=True,
right_index=True,
direction='backward',
tolerance=pd.Timedelta('20 seconds')
)
Output:
price asks bids
time
2022-07-29 13:36:16.059565 00:00 592 595 950
2022-07-29 13:36:49.843361 00:00 593 596 592