Home > Back-end >  pandas time frames issue
pandas time frames issue

Time:08-03

I have df_deals as deals with time and prices:

df_deals = pd.DataFrame({'time': ['2022-07-29 13:36:16.059565 00:00', 
                                  '2022-07-29 13:36:49.843361 00:00'],
                             'price': [592, 593]})
    df_deals = df_deals.set_index('time')


|time                              |price      |
|----------------------------------|-----------|
|2022-07-29 13:36:16.059565 00:00  |592        |
|2022-07-29 13:36:49.843361 00:00  |593        |

and I have df_depths as depths with 5 seconds interval:

df_depths = pd.DataFrame({'time': ['2022-07-29 13:36:10.059565 00:00', 
                                   '2022-07-29 13:36:15.059565 00:00', 
                                   '2022-07-29 13:36:45.059565 00:00', 
                                   '2022-07-29 13:36:50.059565 00:00'],
                         'asks': [594, 595, 596, 597],
                         'bids': [591, 950, 592, 590]})

df_depths = df_depths.set_index('time')


|time                               |asks   |bids|
|-----------------------------------|-------|----|
|2022-07-29 13:36:10.059565 00:00   |594    |591 |
|2022-07-29 13:36:15.059565 00:00   |595    |590 |
|2022-07-29 13:36:45.059565 00:00   |596    |592 |
|2022-07-29 13:36:50.059565 00:00   |597    |590 |

I would like to have as the result next DataFrame, where ask and bid valid values at the time of the transaction:

|time                               |price  | ask | bid|
|-----------------------------------|-------|-----|----|
|2022-07-29 13:36:16.059565 00:00   |592    |595  |590 |
|2022-07-29 13:36:49.843361 00:00   |593    |596  |592 |

I don't have any idea how to do this by pandas.

CodePudding user response:

You can use merge_asof:

Note: do not set_index('time') for your dataframes before this:

df_deals['time'] = pd.to_datetime(df_deals['time'])
df_depths['time'] = pd.to_datetime(df_depths['time'])
df_out = pd.merge_asof(df_deals, df_depths, on='time').set_index('time')

output:

                                  price  asks  bids
time                                               
2022-07-29 13:36:16.059565 00:00    592   595   950
2022-07-29 13:36:49.843361 00:00    593   596   592

CodePudding user response:

Can be done with pandas.merge_asof:

# dataframe df_deals
df_deals = pd.DataFrame({'time': ['2022-07-29 13:36:16.059565 00:00', '2022-07-29 13:36:49.843361 00:00'], 'price': [592, 593]})
df_deals.time = pd.to_datetime(df_deals.time)
df_deals = df_deals.set_index('time')

# dataframe df_depths
df_depths = pd.DataFrame({'time': ['2022-07-29 13:36:10.059565 00:00', '2022-07-29 13:36:15.059565 00:00', '2022-07-29 13:36:45.059565 00:00', '2022-07-29 13:36:50.059565 00:00'], 'asks': [594, 595, 596, 597], 'bids': [591, 950, 592, 590]})
df_depths.time = pd.to_datetime(df_depths.time)
df_depths = df_depths.set_index('time')

# merge_asof
pd.merge_asof(
    df_deals, 
    df_depths, 
    left_index=True, 
    right_index=True, 
    direction='backward', 
    tolerance=pd.Timedelta('20 seconds')
)

Output:

                                  price  asks  bids
time                                               
2022-07-29 13:36:16.059565 00:00    592   595   950
2022-07-29 13:36:49.843361 00:00    593   596   592
  • Related