How to pairwise the buy days with the next sell day in a financial asset DataFrame

Time:03-28

I have a DataFrame with closing prices and buy/sell signals for a financial asset. My goal is to create a new dataframe with the pairs of the buy and sell days.


Currently I create this new DataFrame by iterating over the original DataFrame and keeping track of the purchase price and purchase day.

import pandas as pd

df = pd.DataFrame({
    'close': [30.0,29.39,29.24,22.2,19.01,26.9,13.92,5.05,13.11,14.94,16.33,14.57,15.91,21.06,22.05,
              24.66,18.96,6.6,5.35,7.76],
    'buy_signal': [False,False,True,False,False,False,False,False,True,True,False,False,False,False,
                   False,False,True,False,False,True],
    'sell_signal': [True,False,False,False,True,True,True,False,False,False,False,False,False,True,
                    False,False,False,False,False,False],
})

df['date'] = ['2022-02-28','2022-03-01','2022-03-02','2022-03-03','2022-03-04','2022-03-07',
              '2022-03-08','2022-03-09','2022-03-10','2022-03-11','2022-03-14','2022-03-15',
              '2022-03-16','2022-03-17','2022-03-18','2022-03-21','2022-03-22','2022-03-23',
              '2022-03-24','2022-03-25',]

df = df.set_index('date')



def get_positions(df):
    positions = {
        'buy_price': [],
        'sell_price': [],
        
        'buy_date': [],
        'sell_date': [],
    }

    buying = False

    for row in df.itertuples():
        if buying is False and row.buy_signal is True:
            buying = True
            positions['buy_date'].append(row.Index)
            positions['buy_price'].append(row.close)
        
        if buying is True and row.sell_signal is True:
            buying = False
            positions['sell_date'].append(row.Index)
            positions['sell_price'].append(row.close)


    positions['buy_price'] = positions['buy_price'][:len(positions['sell_price'])]
    positions['buy_date'] = positions['buy_date'][:len(positions['sell_date'])]

    positions = pd.DataFrame(positions)
    positions['profit'] = positions['sell_price'] - positions['buy_price']

    return positions


positions = get_positions(df)
positions

Although this approach works, I've found that iterating over a DataFrame is an anti-pattern and is very slow for large DataFrames.

So I would like to know if there is another way to do these buy and sell day pairs.

CodePudding user response:

I think you can split the DataFrame into the rows with a buy signal (df_buy in the code below) and the rows with a sell signal (df_sell), merge them using pd.merge_asof with direction='forward', and then filter out the rows with NaN.

def get_positions(df):
    # Work on a copy so the caller's DataFrame is not modified
    df = df.copy()
    df.index = pd.to_datetime(df.index)
    df['date_col'] = df.index
    df_buy = df.loc[df['buy_signal']]
    df_sell = df.loc[df['sell_signal']]

    # For each buy row, attach the first sell row on or after that date
    df_positions = pd.merge_asof(left=df_buy, right=df_sell, right_index=True, left_index=True, direction='forward')
    # Keep only the first buy for each sell date
    df_positions.drop_duplicates(subset=['date_col_y'], keep='first', inplace=True)
    # Drop buys that have no later sell
    df_positions.dropna(inplace=True)

    positions = pd.DataFrame({
        'buy_price': df_positions['close_x'],
        'sell_price': df_positions['close_y'],
        'buy_date': df_positions['date_col_x'],
        'sell_date': df_positions['date_col_y'],
        'profit': df_positions['close_y'] - df_positions['close_x'] })

    return positions
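
To see what the forward merge does on its own, here is a minimal sketch that uses only the buy and sell dates from the question's sample data. The column names buy_date and sell_date are chosen for illustration and are not part of the function above.

```python
import pandas as pd

# Dates where buy_signal / sell_signal is True in the question's sample data
buys = pd.DataFrame({'buy_date': pd.to_datetime(
    ['2022-03-02', '2022-03-10', '2022-03-11', '2022-03-22', '2022-03-25'])})
sells = pd.DataFrame({'sell_date': pd.to_datetime(
    ['2022-02-28', '2022-03-04', '2022-03-07', '2022-03-08', '2022-03-17'])})

# direction='forward': for each buy date, take the first sell date
# that falls on or after it (both key columns must be sorted)
pairs = pd.merge_asof(buys, sells, left_on='buy_date', right_on='sell_date',
                      direction='forward')
print(pairs)
```

The buys on 2022-03-10 and 2022-03-11 both land on the 2022-03-17 sell, and the last two buys have no later sell, so their sell_date is NaT.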

If you also want to keep the buy dates that share the same sell date as an earlier buy (2022-03-11 in your example data), you can remove the line:

df_positions.drop_duplicates(subset=['date_col_y'], keep='first', inplace=True)
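
One way to make that choice explicit is a flag on the function. The sketch below is only an illustration run against the question's sample data: the function name pair_buys_with_next_sell, the keep_all_buys parameter, and the _buy/_sell suffixes are hypothetical names, not part of the original answer.

```python
import pandas as pd

dates = ['2022-02-28', '2022-03-01', '2022-03-02', '2022-03-03', '2022-03-04',
         '2022-03-07', '2022-03-08', '2022-03-09', '2022-03-10', '2022-03-11',
         '2022-03-14', '2022-03-15', '2022-03-16', '2022-03-17', '2022-03-18',
         '2022-03-21', '2022-03-22', '2022-03-23', '2022-03-24', '2022-03-25']
close = [30.0, 29.39, 29.24, 22.2, 19.01, 26.9, 13.92, 5.05, 13.11, 14.94,
         16.33, 14.57, 15.91, 21.06, 22.05, 24.66, 18.96, 6.6, 5.35, 7.76]
buy_rows = {2, 8, 9, 16, 19}    # positions where buy_signal is True
sell_rows = {0, 4, 5, 6, 13}    # positions where sell_signal is True

df = pd.DataFrame({
    'close': close,
    'buy_signal': [i in buy_rows for i in range(len(dates))],
    'sell_signal': [i in sell_rows for i in range(len(dates))],
}, index=pd.Index(dates, name='date'))


def pair_buys_with_next_sell(df, keep_all_buys=False):
    """Hypothetical helper: pair each buy with the next sell via merge_asof."""
    df = df.copy()                       # leave the caller's frame untouched
    df.index = pd.to_datetime(df.index)
    df['date_col'] = df.index
    df_buy = df.loc[df['buy_signal'], ['close', 'date_col']]
    df_sell = df.loc[df['sell_signal'], ['close', 'date_col']]

    merged = pd.merge_asof(df_buy, df_sell, left_index=True, right_index=True,
                           direction='forward', suffixes=('_buy', '_sell'))
    # Buys with no later sell have NaT in the sell columns
    merged = merged.dropna(subset=['date_col_sell'])
    if not keep_all_buys:
        # Keep only the earliest buy for each sell date
        merged = merged.drop_duplicates(subset='date_col_sell', keep='first')

    positions = pd.DataFrame({
        'buy_price': merged['close_buy'].to_numpy(),
        'sell_price': merged['close_sell'].to_numpy(),
        'buy_date': merged['date_col_buy'].to_numpy(),
        'sell_date': merged['date_col_sell'].to_numpy(),
    })
    positions['profit'] = positions['sell_price'] - positions['buy_price']
    return positions


first_buy_only = pair_buys_with_next_sell(df)
every_buy = pair_buys_with_next_sell(df, keep_all_buys=True)
print(first_buy_only)
print(every_buy)
```

With the default, the result has the same two pairs as the loop in the question; with keep_all_buys=True, the 2022-03-11 buy appears as a third row paired with the 2022-03-17 sell.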