I have a list of symbols in a single df:
index date symbol stock_id open high low close volume vwap
0 0 2021-10-11 BVN 13 7.69 7.98 7.5600 7.61 879710 7.782174
1 1 2021-10-12 BVN 13 7.67 8.08 7.5803 8.02 794436 7.967061
2 2 2021-10-13 BVN 13 8.12 8.36 8.0900 8.16 716012 8.231286
3 3 2021-10-14 BVN 13 8.26 8.29 8.0500 8.28 586091 8.185899
4 4 2021-10-15 BVN 13 8.18 8.44 8.0600 8.44 1278409 8.284539
... ... ... ... ... ... ... ... ... ... ...
227774 227774 2022-10-04 ERIC 11000 6.27 6.32 6.2400 6.29 14655189 6.280157
227775 227775 2022-10-05 ERIC 11000 6.17 6.31 6.1500 6.29 10569193 6.219965
227776 227776 2022-10-06 ERIC 11000 6.20 6.25 6.1800 6.22 7918812 6.217198
227777 227777 2022-10-07 ERIC 11000 6.17 6.19 6.0800 6.10 9671252 6.135976
227778 227778 2022-10-10 ERIC 11000 6.13 6.15 6.0200 6.04 6310661 6.066256
[227779 rows x 10 columns]
I want to be able to get each unique stock and return a new column of true / false depending on if the last two days were red, e.g:
index date symbol stock_id open high low close volume vwap is_downtrending
0 0 2021-10-11 BVN 13 7.69 7.98 7.5600 7.61 879710 7.782174 False
1 1 2021-10-12 BVN 13 7.67 8.08 7.5803 8.02 794436 7.967061 False
2 2 2021-10-13 BVN 13 8.12 8.36 8.0900 8.16 716012 8.231286 False
3 3 2021-10-14 BVN 13 8.26 8.29 8.0500 8.28 586091 8.185899 True
4 4 2021-10-15 BVN 13 8.18 8.44 8.0600 8.44 1278409 8.284539 True
... ... ... ... ... ... ... ... ... ... ...
227774 227774 2022-10-04 ERIC 11000 6.27 6.32 6.2400 6.29 14655189 6.280157 True
227775 227775 2022-10-05 ERIC 11000 6.17 6.31 6.1500 6.29 10569193 6.219965 True
227776 227776 2022-10-06 ERIC 11000 6.20 6.25 6.1800 6.22 7918812 6.217198 True
227777 227777 2022-10-07 ERIC 11000 6.17 6.19 6.0800 6.10 9671252 6.135976 False
227778 227778 2022-10-10 ERIC 11000 6.13 6.15 6.0200 6.04 6310661 6.066256 True
[227779 rows x 10 columns]
Something like:
ticker_group=df.groupby(['stock_id'])
final_df = pd.DataFrame()
def is_downtrending(close):
out = np.full(close.shape, False)
for i in range(close.shape[0]):
# if we've had two consecutive red days
if (close[i] < close[i - 1]) and (close[i - 1] < close[i - 2]):
out[i] = True
else:
out[i] = False
return out
for index, ohlc in ticker_group:
ohlc['is_downtrending'] = is_downtrending(ohlc['close'])
final_df = ohlc
How do I return a new column of true / false values?
CodePudding user response:
You can try a groupby
rolling
:
N = 2
df['is_downtrending'] = (df.groupby('symbol')['close']
.apply(lambda s: s.diff().lt(0).rolling(N).sum().eq(N))
)
Output:
index date symbol stock_id open high low close volume vwap is_downtrending
0 0 2021-10-11 BVN 13 7.69 7.98 7.5600 7.61 879710 7.782174 False
1 1 2021-10-12 BVN 13 7.67 8.08 7.5803 8.02 794436 7.967061 False
2 2 2021-10-13 BVN 13 8.12 8.36 8.0900 8.16 716012 8.231286 False
3 3 2021-10-14 BVN 13 8.26 8.29 8.0500 8.28 586091 8.185899 False
4 4 2021-10-15 BVN 13 8.18 8.44 8.0600 8.44 1278409 8.284539 False
227774 227774 2022-10-04 ERIC 11000 6.27 6.32 6.2400 6.29 14655189 6.280157 False
227776 227776 2022-10-06 ERIC 11000 6.20 6.25 6.1800 6.22 7918812 6.217198 False
227777 227777 2022-10-07 ERIC 11000 6.17 6.19 6.0800 6.10 9671252 6.135976 True
227778 227778 2022-10-10 ERIC 11000 6.13 6.15 6.0200 6.04 6310661 6.066256 True