Home > database >  check if dataframe column is increasing strictly
check if dataframe column is increasing strictly

Time:11-09

I am fetching timeseries data from crypto exchange and store them in a dataframe. I want to use this dataframe in backtesting and live trading. dataframe has a "close" price column, I want to check if the last 4 "close" prices are strictly increasing.

So if the given dataframe is:

index  time           open     high     low      close    volume
1      1618618500000  61648.9  61695.3  61188.4  61333.2  72.375605
2      1618619400000  61333.1  61396.4  61144.2  61200.0  52.882392
3      1618620300000  61200.0  61509.4  61199.9  61446.2  48.429485
4      1618621200000  61446.2  61764.7  61446.2  61647.4  83.822974
5      1635909300000  63006.2  63087.2  62935.0  63081.9  35.265568
6      1635910200000  63081.9  63214.5  62950.1  63084.0  41.213263
7      1635911100000  63084.0  63236.0  63027.6  63213.9  32.429295
8      1635912000000  63213.8  63213.8  63021.5  63024.1  47.032509
9      1635912900000  63024.1  63091.4  62852.1  62970.7  84.098123
10     1635904800000  63133.8  63133.8  62744.1  62874.7  85.604461
11     1635905700000  62874.7  62970.8  62853.0  62945.5  56.390176
12     1635906600000  62942.1  63089.9  62935.0  63089.9  44.340149
13     1635907500000  63089.9  63217.0  63013.6  63156.2  50.565914
14     1635908400000  63156.2  63156.2  62994.7  63006.2  60.634036
15     1635909300000  63006.2  63087.2  62935.0  63081.9  35.265568
16     1635910200000  63081.9  63214.5  62950.1  63084.0  41.213263
17     1635911100000  63084.0  63236.0  63027.6  63213.9  32.429295
18     1635912000000  63213.8  63213.8  63021.5  63024.1  47.032509
19     1635912900000  63024.1  63091.4  62852.1  62970.7  84.098123

Then I want to make

index  time           open     high     low      close    volume       monotonic_inc
1      1618618500000  61648.9  61695.3  61188.4  61333.2  72.375605    NAN
2      1618619400000  61333.1  61396.4  61144.2  61200.0  52.882392    NAN 
3      1618620300000  61200.0  61509.4  61199.9  61446.2  48.429485    NAN
4      1618621200000  61446.2  61764.7  61446.2  61647.4  83.822974    NAN
5      1635909300000  63006.2  63087.2  62935.0  63081.9  35.265568    False
6      1635910200000  63081.9  63214.5  62950.1  63084.0  41.213263    True
7      1635911100000  63084.0  63236.0  63027.6  63213.9  32.429295    True
8      1635912000000  63213.8  63213.8  63021.5  63024.1  47.032509    False
9      1635912900000  63024.1  63091.4  62852.1  62970.7  84.098123    False
10     1635904800000  63133.8  63133.8  62744.1  62874.7  85.604461    False
11     1635905700000  62874.7  62970.8  62853.0  62945.5  56.390176    False
12     1635906600000  62942.1  63089.9  62935.0  63089.9  44.340149    False
13     1635907500000  63089.9  63217.0  63013.6  63156.2  50.565914    True
14     1635908400000  63156.2  63156.2  62994.7  63006.2  60.634036    False
15     1635909300000  63006.2  63087.2  62935.0  63081.9  35.265568    False
16     1635910200000  63081.9  63214.5  62950.1  63084.0  41.213263    False
17     1635911100000  63084.0  63236.0  63027.6  63213.9  32.429295    True
18     1635912000000  63213.8  63213.8  63021.5  63024.1  47.032509    False
19     1635912900000  63024.1  63091.4  62852.1  62970.7  84.098123    False

I do not want to use .iloc or any index referencing and prefer to use vectorized operation Since I want to use my program in a live trading environment, speed and memory efficient is important.

Thanks for your time, appreciate your effort.

I tested a simple version to check with previous closed price:

dataframe['increasing'] = dataframe.apply(lambda x : 1 if dataframe['close'] > dataframe['close'].shift() else 0)

but this returned an error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

CodePudding user response:

You could compute the diff and use a rolling/sum on the booleans:

df['increasing'] = df['close'].diff().gt(0).rolling(3).sum().eq(3)

NB. I am not sure of your example output

CodePudding user response:

You could use a window function and check for monotonicity on a window -

def check_list_monotonic_increase(lst):
    lst = list(lst)
    return all(lst[i] < lst[i   1] for i in range(len(lst) - 1))

s = pd.Series(np.random.rand(10))
#0    0.963020
#1    0.121435
#2    0.694150
#3    0.269287
#4    0.390505
#5    0.641838
#6    0.771216
#7    0.909652
#8    0.791609
#9    0.382845

s.rolling(4).agg(check_list_monotonic_increase)
#0    NaN
#1    NaN
#2    NaN
#3    0.0
#4    0.0
#5    0.0
#6    1.0
#7    1.0
#8    0.0
#9    0.0

CodePudding user response:

df['increasing'] = np.where(df['close'] > df['close'].shift(), 1, 0)

This code will create a new column called "increasing". It will indicate 1 where today's "close" price is greater than yesterday's "close" price, and if not, indicate 0.

From the numpy.where documentation:

Where True, yield x, otherwise yield y.

  • Related