Home > Enterprise >  How to remove duplicated buy signal and keep initial signal in pandas dataframe?
How to remove duplicated buy signal and keep initial signal in pandas dataframe?

Time:01-29

I'm backtesting my trading strategy in pandas dataframe

Here is the sample dataframe:

open    high    low close   long    short   position_long
datetime

                        
2023-01-07 16:00:00 16940.60    16940.61    16916.00    16922.79    False   False   0
2023-01-07 17:00:00 16922.79    16935.87    16918.85    16928.85    False   False   0
2023-01-07 18:00:00 16928.85    16940.00    16928.22    16938.12    False   True    -1
2023-01-07 19:00:00 16938.12    16946.13    16935.23    16944.05    False   True    -1
2023-01-07 20:00:00 16944.05    16951.92    16937.04    16945.77    False   False   0
2023-01-07 21:00:00 16945.44    16950.25    16937.00    16939.31    True    False   1
2023-01-07 22:00:00 16939.06    16941.69    16929.46    16936.31    False   False   0
2023-01-07 23:00:00 16936.31    16948.04    16931.05    16943.57    False   False   0
2023-01-08 00:00:00 16943.83    16955.00    16911.00    16926.66    False   False   0
2023-01-08 01:00:00 16926.66    16936.00    16915.25    16929.82    False   False   0
2023-01-08 02:00:00 16929.82    16934.00    16919.07    16931.18    False   True    -1
2023-01-08 03:00:00 16931.46    16944.62    16926.17    16944.02    False   True    -1
2023-01-08 04:00:00 16944.02    16944.99    16934.40    16934.75    True    False   1
2023-01-08 05:00:00 16934.95    16944.21    16933.54    16943.73    False   True    -1
2023-01-08 06:00:00 16943.73    16951.97    16933.26    16950.92    False   True    -1
2023-01-08 07:00:00 16950.92    16957.12    16943.16    16953.77    False   False   0
2023-01-08 08:00:00 16953.77    16956.01    16940.01    16946.07    False   False   0
2023-01-08 09:00:00 16945.91    16967.67    16925.05    16935.60    False   False   0
2023-01-08 10:00:00 16935.60    16940.00    16933.14    16936.41    False   False   0
2023-01-08 11:00:00 16936.41    16954.00    16926.30    16927.42    True    False   1
2023-01-08 12:00:00 16927.42    16932.51    16921.40    16923.75    False   False   0
2023-01-08 13:00:00 16924.01    16938.49    16920.37    16933.03    False   False   0
2023-01-08 14:00:00 16933.32    16954.30    16931.25    16952.44    False   True    -1
2023-01-08 15:00:00 16952.44    17019.79    16938.00    17000.43    False   False   0
2023-01-08 16:00:00 17000.43    17022.08    16916.10    16936.87    True    False   1
2023-01-08 17:00:00 16936.87    16952.03    16920.30    16928.50    False   False   0
2023-01-08 18:00:00 16928.89    16936.49    16919.94    16929.84    False   False   0
2023-01-08 19:00:00 16930.03    16939.45    16923.84    16927.25    False   False   0
2023-01-08 20:00:00 16927.25    16982.43    16924.41    16966.67    False   False   0
2023-01-08 21:00:00 16966.95    16988.02    16945.50    16957.46    False   False   0
2023-01-08 22:00:00 16957.28    16976.53    16952.67    16963.14    False   False   0
2023-01-08 23:00:00 16963.27    17176.99    16962.95    17127.83    False   False   0
2023-01-09 00:00:00 17127.83    17199.00    17104.66    17184.51    False   True    -1
2023-01-09 01:00:00 17185.28    17230.00    17145.52    17175.02    False   False   0
2023-01-09 02:00:00 17175.02    17252.60    17161.70    17191.47    False   False   0
2023-01-09 03:00:00 17191.47    17217.34    17176.72    17197.00    False   False   0
2023-01-09 04:00:00 17197.00    17253.19    17194.32    17226.49    False   False   0
2023-01-09 05:00:00 17227.11    17262.56    17200.60    17205.95    True    False   1
2023-01-09 06:00:00 17205.95    17220.59    17180.00    17189.24    False   False   0
2023-01-09 07:00:00 17189.25    17215.48    17183.92    17198.48    False   False   0
2023-01-09 08:00:00 17198.01    17240.00    17186.95    17191.68    True    False   1
2023-01-09 09:00:00 17191.68    17270.00    17189.53    17238.38    False   False   0
2023-01-09 10:00:00 17238.38    17285.48    17235.59    17264.50    False   False   0
2023-01-09 11:00:00 17264.10    17278.87    17236.21    17238.93    True    False   1
2023-01-09 12:00:00 17239.47    17254.97    17228.85    17249.06    False   False   0
2023-01-09 13:00:00 17249.40    17295.55    17190.74    17226.90    False   False   0
2023-01-09 14:00:00 17227.30    17276.13    17206.98    17264.25    False   True    -1

In position_long column, 1 is buy signal and -1 is sell signal. 0 is neither both.

So for instance, at 2023-01-08 03:00:00 row, selling twice in a row doesn't make sense. In this case, I want to keep the initial sell signal only and replace second signal with 0.

And at 2023-01-09 08:00:00 row and 2023-01-09 11:00:00, position_long column should be 0 since at 2023-01-09 05:00:00 buying signal is already occurred.

here is the desired output:

open    high    low close   long    short   position_long
datetime


2023-01-07 16:00:00 16940.60    16940.61    16916.00    16922.79    False   False   0
2023-01-07 17:00:00 16922.79    16935.87    16918.85    16928.85    False   False   0
2023-01-07 18:00:00 16928.85    16940.00    16928.22    16938.12    False   True    0
2023-01-07 19:00:00 16938.12    16946.13    16935.23    16944.05    False   True    0
2023-01-07 20:00:00 16944.05    16951.92    16937.04    16945.77    False   False   0
2023-01-07 21:00:00 16945.44    16950.25    16937.00    16939.31    True    False   1
2023-01-07 22:00:00 16939.06    16941.69    16929.46    16936.31    False   False   0
2023-01-07 23:00:00 16936.31    16948.04    16931.05    16943.57    False   False   0
2023-01-08 00:00:00 16943.83    16955.00    16911.00    16926.66    False   False   0
2023-01-08 01:00:00 16926.66    16936.00    16915.25    16929.82    False   False   0
2023-01-08 02:00:00 16929.82    16934.00    16919.07    16931.18    False   True    -1
2023-01-08 03:00:00 16931.46    16944.62    16926.17    16944.02    False   True    0
2023-01-08 04:00:00 16944.02    16944.99    16934.40    16934.75    True    False   1
2023-01-08 05:00:00 16934.95    16944.21    16933.54    16943.73    False   True    -1
2023-01-08 06:00:00 16943.73    16951.97    16933.26    16950.92    False   True    0
2023-01-08 07:00:00 16950.92    16957.12    16943.16    16953.77    False   False   0
2023-01-08 08:00:00 16953.77    16956.01    16940.01    16946.07    False   False   0
2023-01-08 09:00:00 16945.91    16967.67    16925.05    16935.60    False   False   0
2023-01-08 10:00:00 16935.60    16940.00    16933.14    16936.41    False   False   0
2023-01-08 11:00:00 16936.41    16954.00    16926.30    16927.42    True    False   1
2023-01-08 12:00:00 16927.42    16932.51    16921.40    16923.75    False   False   0
2023-01-08 13:00:00 16924.01    16938.49    16920.37    16933.03    False   False   0
2023-01-08 14:00:00 16933.32    16954.30    16931.25    16952.44    False   True    -1
2023-01-08 15:00:00 16952.44    17019.79    16938.00    17000.43    False   False   0
2023-01-08 16:00:00 17000.43    17022.08    16916.10    16936.87    True    False   1
2023-01-08 17:00:00 16936.87    16952.03    16920.30    16928.50    False   False   0
2023-01-08 18:00:00 16928.89    16936.49    16919.94    16929.84    False   False   0
2023-01-08 19:00:00 16930.03    16939.45    16923.84    16927.25    False   False   0
2023-01-08 20:00:00 16927.25    16982.43    16924.41    16966.67    False   False   0
2023-01-08 21:00:00 16966.95    16988.02    16945.50    16957.46    False   False   0
2023-01-08 22:00:00 16957.28    16976.53    16952.67    16963.14    False   False   0
2023-01-08 23:00:00 16963.27    17176.99    16962.95    17127.83    False   False   0
2023-01-09 00:00:00 17127.83    17199.00    17104.66    17184.51    False   True    -1
2023-01-09 01:00:00 17185.28    17230.00    17145.52    17175.02    False   False   0
2023-01-09 02:00:00 17175.02    17252.60    17161.70    17191.47    False   False   0
2023-01-09 03:00:00 17191.47    17217.34    17176.72    17197.00    False   False   0
2023-01-09 04:00:00 17197.00    17253.19    17194.32    17226.49    False   False   0
2023-01-09 05:00:00 17227.11    17262.56    17200.60    17205.95    True    False   1
2023-01-09 06:00:00 17205.95    17220.59    17180.00    17189.24    False   False   0
2023-01-09 07:00:00 17189.25    17215.48    17183.92    17198.48    False   False   0
2023-01-09 08:00:00 17198.01    17240.00    17186.95    17191.68    True    False   0
2023-01-09 09:00:00 17191.68    17270.00    17189.53    17238.38    False   False   0
2023-01-09 10:00:00 17238.38    17285.48    17235.59    17264.50    False   False   0
2023-01-09 11:00:00 17264.10    17278.87    17236.21    17238.93    True    False   0
2023-01-09 12:00:00 17239.47    17254.97    17228.85    17249.06    False   False   0
2023-01-09 13:00:00 17249.40    17295.55    17190.74    17226.90    False   False   0
2023-01-09 14:00:00 17227.30    17276.13    17206.98    17264.25    False   True    -1

And here is the reproducible sample code for test:

import pandas as pd

data = {
    'open': [16940.60, 16922.79, 16928.85, 16938.12, 16944.05, 16945.44, 16939.06, 16936.31, 16943.83, 16926.66, 16929.82, 16931.46, 16944.02, 16934.95, 16943.73],
    'high': [16940.61, 16935.87, 16940.00, 16946.13, 16951.92, 16950.25, 16941.69, 16948.04, 16955.00, 16936.00, 16934.00, 16944.62, 16944.99, 16944.21, 16951.97],
    'low': [16916.00, 16918.85, 16928.22, 16935.23, 16937.04, 16937.00, 16929.46, 16931.05, 16911.00, 16915.25, 16919.07, 16926.17, 16934.40, 16933.54, 16933.26],
    'close': [16922.79, 16928.85, 16938.12, 16944.05, 16945.77, 16939.31, 16936.31, 16943.57, 16926.66, 16929.82, 16931.18, 16944.02, 16934.75, 16943.73, 16950.92],
    'long': [False, False, False, False, False, True, False, False, False, False, False, False, True, False, False],
    'short': [False, False, True, True, False, False, False, False, False, False, True, True, False, True, True],
    'position_long': [0, 0, -1, -1, 0, 1, 0, 0, 0, 0, -1, -1, 1, -1, -1]
}

df = pd.DataFrame(data, columns=['open', 'high', 'low', 'close', 'long', 'short', 'position_long'])
df['datetime'] = pd.date_range(start='2023-01-07 16:00:00', periods=15, freq='1H')
df = df[['datetime', 'open', 'high', 'low', 'close', 'long', 'short', 'position_long']]

print(df)

CodePudding user response:

How about forward filling the zeros such that you have full sequences of alternating -1, 1, and then only keeping the first of each such sequence:

import pandas as pd

s = pd.Series([0, 0, -1, -1, 0, 1, 0, 0, 0, 0, -1, -1, 1, -1, -1])

filled = s.mask(s == 0).ffill().fillna(0)

out = filled.where(filled != filled.shift(), 0)

out:

>>> out
0     0.0
1     0.0
2    -1.0
3     0.0
4     0.0
5     1.0
6     0.0
7     0.0
8     0.0
9     0.0
10   -1.0
11    0.0
12    1.0
13   -1.0
14    0.0
dtype: float64
  • Related