Get first occurrence of 1 in column2 based on column 1 condition-CodePudding

There is a dataframe df with 2 columns col1 and col2. Both columns have randomly spread 0s and 1s. More zeros than ones. If col1 has a 1 on an index, program should be able to look for next first 1 in col2 and get the difference of indices of both rows.

Everytime this distribution is different also the sequence length.

CodePudding user response：

Try with idxmax

id1 = df.col1.idxmax()
id2 = df.loc[id1:,'col2'].idxmax()
id2-id1
2
id2
4
id1
2

CodePudding user response：

I cannot see your posted image.

How about this.

import random

import pandas as pd


numrows = 10

df = pd.DataFrame({'c1': [random.randint(0, 1) for _ in range(numrows)], 'c2': [random.randint(0, 1) for _ in range(numrows)]})

print(df)

col1_index = None

for index, row in df.iterrows():
    if col1_index is not None:
        if row['c2'] == 1:            
            diff = col1_index - index
            print(f'first occurrence of 1 at c2 is at index {index}, the index diff is {diff}')
            col1_index = None
    elif row['c1'] == 1:
        col1_index = index
        print(f'this index {index} has value 1 at c1')

Typical output

   c1  c2
0   1   0
1   0   0
2   0   0
3   1   1
4   0   1
5   0   0
6   1   1
7   0   1
8   0   1
9   1   1
this index 0 has value 1 at c1
first occurrence of 1 at c2 is at index 3, the index diff is -3
this index 6 has value 1 at c1
first occurrence of 1 at c2 is at index 7, the index diff is -1
this index 9 has value 1 at c1