I'm testing my stock trading logic, and I made a position column to check the buy/sell signals:
df = pd.DataFrame({'position': [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, 1.0, 0.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]})
I want to replace any 1.0 that occurs between a 1.0 and the next -1.0 with 0.0, and any -1.0 that occurs between a -1.0 and the next 1.0 with 0.0.
Here is the desired output:
df = pd.DataFrame({'position': [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, 1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]})
NOTE: the output keeps only the first 1.0 or -1.0 of each run of repeated signals.
CodePudding user response:
Here is a basic implementation based on the loop approach described in the other answer:
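lastseen = 0
# iterate over a snapshot so the column can be updated safely inside the loop
for n, el in enumerate(df["position"].tolist()):
    if lastseen == 0 and el == -1:
        raise Exception("Inconsistent data")
    if (el in [1, -1] and el != lastseen) or lastseen == 0:
        lastseen = el
    else:
        # positional write; avoids chained assignment like df["position"][n] = 0
        df.iloc[n, df.columns.get_loc("position")] = 0.0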
I added the first check (raising when the data starts with -1) based on the domain you described. If it's not correct for your problem, feel free to remove it.
CodePudding user response:
Can you show us what you tried and what didn't work, so we can help? Anyway, maybe start with a simple solution:
- loop over the array
- keep track of the signal you saw most recently: -1 or 1
- set every entry that matches the most recent signal to 0
- deal with edge cases (e.g. loop only from the first non-zero to the last non-zero entry)
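The steps above can be sketched as a small pure-Python function. This is only a rough illustration: the name keep_first_signal is made up, it returns a new list instead of modifying anything in place, and it assumes that a leading -1 is a valid signal rather than an error.

```python
def keep_first_signal(values):
    """Return a copy of values where repeated signals are replaced by 0.0."""
    last_seen = 0  # most recent non-zero signal; 0 means "none seen yet"
    out = []
    for v in values:
        if v != 0 and v != last_seen:
            last_seen = v      # direction changed: keep this signal
            out.append(v)
        else:
            out.append(0.0)    # a zero, or a repeat of the latest signal
    return out


# Data from the question
position = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0,
            1.0, 0.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]
cleaned = keep_first_signal(position)
print(cleaned)
```

Leading and trailing zeros need no special handling here, because a zero never updates last_seen. With pandas you could write the result back with something like df["position"] = keep_first_signal(df["position"].tolist()).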
CodePudding user response:
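A vectorized solution that relies on pandas operations instead of a loop:
import pandas as pd
s = pd.Series([1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, 1.0, 0.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0])
# turn zeros into NaN, then carry the last non-zero signal forward
s_ = s.mask(s == 0).ffill()
# keep a value only where the carried signal differs from the previous row
result = s.where(s_ != s_.shift(), 0)
print(pd.DataFrame({'input': s, 'result': result}))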
Output:
    input  result
0     1.0     1.0
1     0.0     0.0
2     0.0     0.0
3     1.0     0.0
4     0.0     0.0
5     0.0     0.0
6     0.0     0.0
7     0.0     0.0
8     0.0     0.0
9     0.0     0.0
10   -1.0    -1.0
11    1.0     1.0
12    0.0     0.0
13   -1.0    -1.0
14   -1.0     0.0
15    0.0     0.0
16    0.0     0.0
17    0.0     0.0
18    0.0     0.0
19    0.0     0.0
20    1.0     1.0
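If it helps to see why this works, here is a sketch that splits the same idea into named steps and compares the result with the desired output from the question. The variable names (last_signal, is_first_of_run, expected) are only for illustration.

```python
import pandas as pd

s = pd.Series([1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0,
               1.0, 0.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0])
# Desired output copied from the question
expected = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0,
            1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]

# Step 1: hide the zeros (NaN) and carry the last non-zero signal forward,
# so every row knows which signal is currently "active".
last_signal = s.mask(s == 0).ffill()

# Step 2: a row starts a new run when its active signal differs from the
# previous row's. The first row compares against NaN, so it counts as new.
is_first_of_run = last_signal != last_signal.shift()

# Step 3: keep the original value at the start of each run, 0 elsewhere.
result = s.where(is_first_of_run, 0)

print(result.tolist() == expected)  # True
```

Zeros that come before the first signal stay 0, since s.where keeps the original value there and the original value is already 0.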