This is my pandas dataframe:
import pandas as pd
df = pd.DataFrame({'a': [20, 21, 333, 444], 'b': [20, 20, 20, 20]})
I want to create column c by using this mask:
mask = (df.a >= df.b)
I want to find the second row that meets this condition and set column c to 'x' on that row only.
The output that I want looks like this:
     a   b    c
0   20  20  NaN
1   21  20    x
2  333  20  NaN
3  444  20  NaN
This is my attempt, but it puts 'x' on every matching row after the first one:
df.loc[mask.cumsum().ne(1) & mask, 'c'] = 'x'
CodePudding user response:
Your condition marks every matching row except the first one, because mask.cumsum().ne(1) is False only where the running count of matches is exactly 1.
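To see why, it can help to print the intermediate values. This is a small sketch using the data from the question; the names count and attempt are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [20, 21, 333, 444], 'b': [20, 20, 20, 20]})
mask = df.a >= df.b

# Running count of matching rows seen so far
count = mask.cumsum()

# The original attempt: matching rows whose running count is not 1
attempt = count.ne(1) & mask

print(count.tolist())    # [1, 2, 3, 4]
print(attempt.tolist())  # [False, True, True, True]
```

Every row here satisfies the mask, so the running count reaches 2, 3 and 4 on the later rows, and ne(1) keeps all of them.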
If you want to mark only the second match, use:
mask = (df.a >= df.b)
df.loc[mask.cumsum().eq(2) & mask, 'c'] = 'x'
Output:
     a   b    c
0   20  20  NaN
1   21  20    x
2  333  20  NaN
3  444  20  NaN
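If you need a match other than the second, the same idea can be wrapped in a small helper. This is only a sketch: mark_nth_match and its parameters are hypothetical names, not part of pandas, and the sample data is changed slightly from the question so that one row fails the mask.

```python
import pandas as pd

def mark_nth_match(df, mask, n, column='c', value='x'):
    """Set `value` in `column` on the n-th row (1-based) where `mask` is True."""
    out = df.copy()
    # cumsum() counts the matches seen so far; "& mask" drops non-matching
    # rows that merely carry the same running count forward.
    out.loc[mask.cumsum().eq(n) & mask, column] = value
    return out

# Illustrative data (assumption): row 1 does not satisfy a >= b
df = pd.DataFrame({'a': [20, 5, 21, 333], 'b': [20, 20, 20, 20]})
result = mark_nth_match(df, df.a >= df.b, n=2)
print(result)
```

The "& mask" part matters here: row 1 does not match but still has a running count of 1, so without it, asking for n=1 would mark both row 0 and row 1.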