I have the following DataFrame:
| y |
|---|
| NaN |
| NaN |
| 5 |
| NaN |
| 7 |
I would like to write a function that will return the number of NaN values before the first non-NaN value. Given the above example, the function should return the value 2.
I tried to solve my problem using this question, but it did not help me much.
Edit: The values always start with a NaN. If the column is all NaN, the function should return the column length.
CodePudding user response:
You could use isna to get True/1 on the NaN values and cumprod to zero out everything that follows the first non-NaN value. Then sum:
df['y'].isna().cumprod().sum()
output: 2
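As a minimal sketch, the one-liner above could be wrapped in a function as the question asks. The name count_leading_nans is hypothetical, and the DataFrame is rebuilt here from the question's example; the all-NaN case from the edit is included as a check.

```python
import numpy as np
import pandas as pd


def count_leading_nans(s: pd.Series) -> int:
    """Count the NaN values before the first non-NaN value (hypothetical helper)."""
    # isna() marks NaN as True; cumprod() stays 1 only until the first False.
    return int(s.isna().cumprod().sum())


# Example column from the question.
df = pd.DataFrame({"y": [np.nan, np.nan, 5, np.nan, 7]})
leading = count_leading_nans(df["y"])

# All-NaN column: every flag stays 1, so the sum equals the column length.
all_nan = count_leading_nans(pd.Series([np.nan, np.nan, np.nan]))

print(leading, all_nan)
```

For the example column this gives 2, and for the all-NaN column it gives the length, 3.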
CodePudding user response:
You can use first_valid_index.
df.y.first_valid_index()
> 2
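This grabs the index label of the first non-NaN value. With the default integer index (0, 1, 2, ...), that label equals the number of leading NaN values, so there is no need to sum. Note that it returns None when the column is all NaN.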
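Because first_valid_index returns an index label rather than a count, and None when every value is NaN, a wrapper along these lines may be safer. This is only a sketch: the function name is hypothetical, and it assumes the index labels are unique so that Index.get_loc returns a single integer position.

```python
import numpy as np
import pandas as pd


def leading_nans_via_first_valid(s: pd.Series) -> int:
    """Hypothetical helper built on first_valid_index; assumes a unique index."""
    label = s.first_valid_index()
    if label is None:
        # No valid value at all: the whole column is NaN.
        return len(s)
    # Convert the label to a positional offset so non-default indexes work.
    return int(s.index.get_loc(label))


# Same values as the question, but with a non-default index.
s = pd.Series([np.nan, np.nan, 5, np.nan, 7], index=[10, 20, 30, 40, 50])
label = s.first_valid_index()               # the label 30, not a count
count = leading_nans_via_first_valid(s)     # positional count
all_nan = leading_nans_via_first_valid(pd.Series([np.nan, np.nan]))

print(label, count, all_nan)
```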
CodePudding user response:
Use Series.isna with Series.cummin and count the True values with sum:
s = df['y'].isna().cummin().sum()
print(s)
2
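To see why this works, the intermediate steps can be printed one by one. The sketch below rebuilds the question's column; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"y": [np.nan, np.nan, 5, np.nan, 7]})

# Step 1: flag NaN values.
is_nan = df["y"].isna()

# Step 2: the running minimum turns False at the first non-NaN value
# and stays False afterwards, so later NaN values are ignored.
leading_mask = is_nan.cummin()

# Step 3: count the remaining True flags.
s = int(leading_mask.sum())

print(is_nan.tolist())
print(leading_mask.tolist())
print(s)
```

The NaN in the fourth row is flagged in step 1 but dropped in step 2, which is what separates this from a plain isna().sum().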