Home > Blockchain >  Pandas and Numpy consecutive non Nan values
Pandas and Numpy consecutive non Nan values

Time:11-20

I'm trying to use np.where to count consecutive non-NaN values longer than a certain length as shown below:

e.g. If there are more than 3 consecutive non-NaN values then return True.

Would appreciate any help!

value consecutive
nan False
nan False
1 False
1 False
nan False
4 True
2 True
3 True
nan False
nan False
1 True
3 True
3 True
5 True

CodePudding user response:

The idea is to create groups by testing missing values and mapping using Series.map with Series.value_counts to have only rows with non NaNs filtered by inverted mask ~m:

#convert values to numeric
df['value'] = df['value'].astype(float)

m = df['value'].isna()
s = m.cumsum()

N = 3
df['new'] = s.map(s[~m].value_counts()).ge(N) & ~m
print (df)
    value  consecutive    new
0     NaN        False  False
1     NaN        False  False
2     1.0        False  False
3     1.0        False  False
4     NaN        False  False
5     4.0         True   True
6     2.0         True   True
7     3.0         True   True
8     NaN        False  False
9     NaN        False  False
10    1.0         True   True
11    3.0         True   True
12    3.0         True   True
13    5.0         True   True
  • Related