Create a dataset from another based on the first occurrence of some number

I have a dataset that looks like [3,4,5,-5,4,5,6,3,2-6,6]. I want to create a second dataset that has 0 at every index belonging to the initial run of positive numbers in the first dataset, and 1 at every remaining index.

So for a = [3,4,5,-5,4,5,6,3,2-6,6] it should be
       b = [0,0,0, 1,1,1,1,1,1,1]

How can I produce b from a using pandas and Python?

CodePudding user response：

Since you tagged pandas, here is a solution using a Series:

import pandas as pd

s = pd.Series([3, 4, 5, -5, 4, 5, 6, 3, 2 - 6, 6])

# find the first index whose value is NOT greater than zero
# ((s > 0) is False there, and idxmin returns the first False)
idx = (s > 0).idxmin()

# set every position before that index to 0 and every other position to 1
res = pd.Series(s.index >= idx, dtype=int)
print(res)

Output

0    0
1    0
2    0
3    1
4    1
5    1
6    1
7    1
8    1
9    1
dtype: int64

If you prefer a one-liner:

res = pd.Series(s.index >= (s > 0).idxmin(), dtype=int)
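
One caveat with the idxmin approach: if the series contains no non-positive value, (s > 0) is all True and idxmin returns the first label, so every row would end up as 1. A minimal sketch of a guard for that case follows. The function name flag_after_first_nonpositive is hypothetical, and like the snippet above it assumes a sorted, comparable index such as the default RangeIndex.

```python
import pandas as pd


def flag_after_first_nonpositive(s: pd.Series) -> pd.Series:
    """Return 0 up to the first non-positive value, 1 from there on."""
    positive = s > 0
    if positive.all():
        # No non-positive value: idxmin() would return the first label
        # and wrongly flag every row, so return all zeros instead.
        return pd.Series(0, index=s.index, dtype=int)
    first_bad = positive.idxmin()  # label of the first False
    return pd.Series((s.index >= first_bad).astype(int), index=s.index)


print(flag_after_first_nonpositive(pd.Series([3, 4, 5, -5, 4, 5, 6, 3, 2 - 6, 6])).tolist())
print(flag_after_first_nonpositive(pd.Series([1, 2, 3])).tolist())
```

The second call is the case the guard exists for: without it, the all-positive input would produce only ones.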

CodePudding user response：

You can use a cummax on the boolean series:

import pandas as pd

s = pd.Series([3, 4, 5, -5, 4, 5, 6, 3, 2 - 6, 6])

out = s.lt(0).cummax().astype(int)

Output:

0    0
1    0
2    0
3    1
4    1
5    1
6    1
7    1
8    1
9    1
dtype: int64
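
To see why cummax works here, it may help to split the chain into its steps. This is only a sketch of the same expression with hypothetical intermediate names (is_negative, seen_negative); it does not change the behavior.

```python
import pandas as pd

s = pd.Series([3, 4, 5, -5, 4, 5, 6, 3, 2 - 6, 6])

# True only at positions whose own value is negative
is_negative = s.lt(0)

# Running maximum of booleans: once a True appears, it stays True
seen_negative = is_negative.cummax()

# Convert False/True to 0/1
out = seen_negative.astype(int)

print(is_negative.tolist())
print(out.tolist())
```

Note that s.lt(0) does not switch on a value of exactly 0, whereas the (s > 0).idxmin() approach does. If zeros should also end the initial run, s.le(0) could be used instead.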

If you are actually working with lists, pandas is not needed and NumPy should be more efficient:

import numpy as np

a = [3,4,5,-5,4,5,6,3,2-6,6]
b = np.maximum.accumulate(np.array(a)<0).astype(int).tolist()

Output: [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

And if the list is small, pure Python should be preferred:

from itertools import accumulate

b = list(accumulate((int(x<0) for x in a), max))

Output: [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
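
If accumulate with max reads as too terse, the same running-flag idea can be written as an explicit loop. This is a sketch only; the function name flag_from_first_negative is hypothetical.

```python
def flag_from_first_negative(values):
    """Return 0 for each item before the first negative value, 1 afterwards."""
    seen = False  # becomes True at the first negative value and never resets
    result = []
    for x in values:
        seen = seen or x < 0
        result.append(int(seen))
    return result


a = [3, 4, 5, -5, 4, 5, 6, 3, 2 - 6, 6]
b = flag_from_first_negative(a)
print(b)
```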