I have a pandas Series of m*n elements of the following form, where m=5 and n=3:
A: [1 1 1 1 1 0 1 1 0 0 0 0 0 1 1]
I need a result series as follows:
B: [1 0 0 0 0 0 1 0 0 0 0 0 0 1 0]
m and n can take any values.
I also have supplemental data that might help.
One such supplemental series is:
HORIZON: [0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]
The original series of 0s and 1s is derived from real data, which looks like this:
CUSIP: [CUSIP1 CUSIP1 CUSIP1 CUSIP1 CUSIP1 np.nan CUSIP2 CUSIP2 ... CUSIP3]
What I have thought about so far: shifting the series A to the right and XORing it with A. This idea doesn't seem to lead anywhere, since there are edge cases it wouldn't solve anyway.
It's pretty straightforward with a standard for loop, but we have largely moved to vectorized operations, so I would really prefer a vectorized way to do this.
Thanks.
EDIT:
The proposed solution works. The result is:
A: [1 1 1 1 1 0 1 1 0 0 0 0 0 1 1]
A': [nan 1 1 1 1 1 0 1 1 0 0 0 0 0 1] (A shifted)
A.where(A.ne(A.shift()) & A.eq(1),0)
B : [1 0 0 0 0 0 1 0 0 0 0 0 0 1 0]
FURTHER EDIT:
There is an edge case for which the solution doesn't work. The modified solution is:
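import numpy
import pandas
a = pandas.Series([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
x = a.shift()
# Blank out the previous value at the start of every block of m=5 elements
x.iloc[::5] = numpy.nan
b = a.where(a.ne(x) & a.eq(1),0)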
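The modified solution above can be wrapped in a small helper so that the block length m is not hard-coded. This is only a sketch: the function name first_ones_per_block is made up for illustration, and it assumes the edge case is a run of 1s that continues across a block boundary and that the series length is a multiple of m. The second variant assumes HORIZON is 0 exactly at the start of each block, as in the example data.

```python
import numpy as np
import pandas as pd


def first_ones_per_block(a: pd.Series, m: int) -> pd.Series:
    """Mark the first 1 of every run of 1s, restarting at each block of m elements.

    Hypothetical helper name; assumes len(a) is a multiple of m.
    """
    prev = a.shift()
    # Forget the previous value at each block start so a run never
    # carries over from one block into the next.
    prev.iloc[::m] = np.nan
    return a.where(a.ne(prev) & a.eq(1), 0)


# All-ones edge case from the question, m=5 and n=3
a = pd.Series([1] * 15)
b = first_ones_per_block(a, m=5)

# Variant using the supplemental HORIZON series instead of positions
# (assumes HORIZON == 0 marks the first element of each block)
horizon = pd.Series(np.tile(np.arange(5), 3))
b_from_horizon = (a.eq(1) & (a.ne(a.shift()) | horizon.eq(0))).astype(int)
```

Both variants should mark positions 0, 5 and 10 for the all-ones input, and give B for the original A.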
Answer:
You can compare the series with its shifted copy using shift together with ne and eq:
a.where(a.ne(a.shift()) & a.eq(1),0)
Out[32]:
0 1
1 0
2 0
3 0
4 0
5 0
6 1
7 0
8 0
9 0
10 0
11 0
12 0
13 1
14 0
dtype: int64
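For completeness, here is a self-contained version of the snippet above, using the series A from the question as input. The variable names starts and b are chosen for illustration. Because A contains only 0s and 1s, casting the boolean mask to int should give the same values as a.where(mask, 0).

```python
import pandas as pd

# Series A from the question
a = pd.Series([1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1])

# True where the value differs from the previous element AND is 1,
# i.e. at the first element of every run of 1s. The first element is
# compared against NaN, so ne() reports it as different.
starts = a.ne(a.shift()) & a.eq(1)

# Keep a where the mask holds, otherwise write 0
b = a.where(starts, 0)

# Equivalent for 0/1 data: cast the mask directly
b_alt = starts.astype(int)
```

Note that this version does not restart at block boundaries, so a run of 1s that spans two blocks is marked only once.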