Pandas vectorized way to tag first occurring value(m elements) in a series of m*n elements

Time:05-16

I have a pandas series of m*n elements of the following form where m=5 and n=3 :
A: [1 1 1 1 1 0 1 1 0 0 0 0 0 1 1]

I need a result series as follows :
B: [1 0 0 0 0 0 1 0 0 0 0 0 0 1 0]

m and n can be any values.

I also have supplemental data that might help.
One such supplemental series is:
HORIZON: [0 1 2 3 4 0 1 2 3 4 0 1 2 3 4]

The original series of 0 and 1 values is derived from the real data, which is:
CUSIP: [CUSIP1 CUSIP1 CUSIP1 CUSIP1 CUSIP1 np.nan CUSIP2 CUSIP2 ... CUSIP3]
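A minimal sketch of how such a 0/1 series could be derived, assuming that 1 simply means "CUSIP is present" and 0 means "CUSIP is missing". The values below are hypothetical stand-ins, since the real column is not shown in full:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in values; the real CUSIP column is not shown in full.
cusip = pd.Series(["CUSIP1", "CUSIP1", np.nan, "CUSIP2", np.nan, np.nan])

# Assumption: 1 marks a row that has a CUSIP, 0 marks a missing one.
a = cusip.notna().astype(int)
print(a.tolist())
```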

What I have thought about so far: shifting the series A right and XORing it with A. But this idea doesn't seem to lead anywhere, since there are edge cases it wouldn't solve anyway.

It's pretty straightforward with a standard for loop, but we have largely moved to vectorized operations, so I would really prefer a vectorized way to do this.

thanks.

EDIT:
The solution proposed works and the result is :
A:  [1   1 1 1 1 0 1 1 0 0 0 0 0 1 1]
A': [nan 1 1 1 1 1 0 1 1 0 0 0 0 0 1] (A shifted)
A.where(A.ne(A.shift()) & A.eq(1),0)
B : [1   0 0 0 0 0 1 0 0 0 0 0 0 1 0] 

FURTHER EDIT:
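There is an edge case for which the solution doesn't work: when one block ends with 1 and the next block starts with 1, the shifted value carries over across the block boundary and the start of the new block is not tagged. The modified solution is:
import numpy
import pandas
a = pandas.Series([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
x = a.shift()
x.iloc[::5] = numpy.nan  # forget the previous value at the start of every block of m=5
b = a.where(a.ne(x) & a.eq(1), 0)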
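The same idea can be wrapped up for an arbitrary block length. This is only a sketch: the function name `tag_run_starts` and the parameter `m` are made up for illustration, and it assumes the series is laid out as consecutive blocks of exactly `m` rows.

```python
import numpy as np
import pandas as pd

def tag_run_starts(a: pd.Series, m: int) -> pd.Series:
    """Keep a 1 only where it starts a run of 1s; runs restart at every block of m rows."""
    prev = a.shift()
    # Rows 0, m, 2m, ... open a new block, so they must not see the previous block's last value.
    prev.iloc[::m] = np.nan
    return a.where(a.ne(prev) & a.eq(1), 0)

all_ones = pd.Series([1] * 15)
original = pd.Series([1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1])

print(tag_run_starts(all_ones, m=5).tolist())
print(tag_run_starts(original, m=5).tolist())
```

For the all-ones edge case this tags positions 0, 5 and 10, and for the original A it reproduces B.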

CodePudding user response:

You can combine shift with ne and eq:

a.where(a.ne(a.shift()) & a.eq(1),0)
Out[32]: 
0     1
1     0
2     0
3     0
4     0
5     0
6     1
7     0
8     0
9     0
10    0
11    0
12    0
13    1
14    0
dtype: int64
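Note that the shift-based expression tags the start of every run of 1s, so a block such as [1 0 1 0 0] would get two tags. If only the first 1 per block should be tagged, one possible alternative (not part of the answer above) is a per-block running count. The sketch below assumes the HORIZON series from the question resets to 0 at the start of each block:

```python
import pandas as pd

a = pd.Series([1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1])
horizon = pd.Series([0, 1, 2, 3, 4] * 3)

# Assumption: a new block starts wherever HORIZON resets to 0.
block_id = horizon.eq(0).cumsum()

# Running count of 1s inside each block; the first 1 is where the count reaches 1.
ones_so_far = a.groupby(block_id).cumsum()
first_one = (a.eq(1) & ones_so_far.eq(1)).astype(int)
print(first_one.tolist())
```

On the data from the question this gives the same B as the shift-based version; the two only differ when a block contains more than one run of 1s.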