Finding first occurrence of negative value in a row

Time:10-22

Say I have the following DataFrame:

import pandas as pd
df = pd.DataFrame({'a': [12, 34, -45], 'b': [-24, 36, 48], 'c': [28, -14, 68]})

df:

 a      b     c
 12   -24    28
 34    36   -14
-45    48    68

I am looking to return the index (+1, i.e. 1-based) of the first column that contains a negative number within each row, so for the example I would produce:

 a      b     c    first_neg_col
 12   -24    28                2
 34    36   -14                3
-45    48    68                1

I have a way of achieving this:

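def first_negval(val_list):
    # Return the 1-based position of the first negative value (None if there is none)
    for idx, val in enumerate(val_list):
        if val < 0:
            return idx + 1

# Store each row as a Python list, then scan each list
df['first_neg_col'] = df[:].values.tolist()

df.first_neg_col = df['first_neg_col'].apply(lambda x: first_negval(x))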

But this seems cumbersome and inefficient. Is there a more vectorized approach, or some way of using a list comprehension?

CodePudding user response:

If every row always contains at least one negative value, use numpy.argmax on the boolean mask of values less than 0 to get the position of the first negative value:

import numpy as np
df['first_neg_col'] = np.argmax(df.lt(0).to_numpy(), axis=1) + 1
print (df)
    a   b   c  first_neg_col
0  12 -24  28              2
1  34  36 -14              3
2 -45  48  68              1
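
As a minimal sketch of why this works, and where it stops working: np.argmax on a boolean array returns the position of the first True in each row, but for a row with no True at all it returns 0. The frame `demo` below is hypothetical data, not from the question; its second row has no negative value.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second row contains no negative value.
demo = pd.DataFrame({'a': [12, 1], 'b': [-24, 8], 'c': [28, 8]})

mask = demo.lt(0).to_numpy()          # True where a value is negative
positions = np.argmax(mask, axis=1)   # position of the first True in each row

# The second row has no True, so argmax falls back to 0 and "+ 1" reports 1,
# which is indistinguishable from "first column is negative".
print(positions + 1)                  # [2 1]
```

So without an extra check, a row with no negatives would be reported as if column 1 were negative.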

In general, it is necessary to test whether each row contains at least one negative value, using DataFrame.any, and to set the result to 0 with numpy.where for rows that contain none:

df = pd.DataFrame({'a': [12, 34, -45, 1], 'b':[-24, 36, 48, 8], 'c':[28, -14, 68, 8]})

m = df.lt(0)
df['first_neg_col'] = np.where(m.any(axis=1), np.argmax(m.to_numpy(), axis=1) + 1, 0)
print (df)
    a   b   c  first_neg_col
0  12 -24  28              2
1  34  36 -14              3
2 -45  48  68              1
3   1   8   8              0
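
The same logic can be wrapped in a small helper, sketched below. The function name `first_negative_position` is hypothetical, chosen for illustration. Since the question also asked about a list comprehension, the sketch compares the vectorized result against a pure-Python list comprehension built on `next()`; that comparison is only a cross-check, not part of the answer above.

```python
import numpy as np
import pandas as pd

def first_negative_position(frame: pd.DataFrame) -> pd.Series:
    """1-based position of the first negative column per row, 0 if none."""
    mask = frame.lt(0)
    first = np.argmax(mask.to_numpy(), axis=1) + 1
    # Rows without any negative value get 0 instead of argmax's fallback.
    return pd.Series(np.where(mask.any(axis=1), first, 0), index=frame.index)

df = pd.DataFrame({'a': [12, 34, -45, 1], 'b': [-24, 36, 48, 8], 'c': [28, -14, 68, 8]})
result = first_negative_position(df)

# Pure-Python cross-check: next() yields the first match, or 0 as the default.
expected = [next((i + 1 for i, v in enumerate(row) if v < 0), 0)
            for row in df.to_numpy().tolist()]

print(result.tolist())    # [2, 3, 1, 0]
print(expected)           # [2, 3, 1, 0]
```

If the frame also holds columns that should not be scanned, pass only the relevant ones, for example `first_negative_position(df[['a', 'b', 'c']])`.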