I have the following pandas DataFrame. It contains a lot of NaN values (I skipped most of the NaN rows to make it look shorter).
0 NaN
...
26 NaN
27 357.0
28 357.0
29 357.0
30 NaN
...
246 NaN
247 357.0
248 357.0
249 357.0
250 NaN
...
303 NaN
304 58.0
305 58.0
306 58.0
307 58.0
308 58.0
309 58.0
310 58.0
311 58.0
312 58.0
313 58.0
314 58.0
315 58.0
316 NaN
...
333 NaN
334 237.0
I would like to drop all the NaN values and, within each run of consecutive non-NaN values, keep only the first one (e.g. indices 27-29 hold three values; I would like to keep the value at index 27 and skip those at 28 and 29). The target result should be as follows:
27 357.0
247 357.0
304 58.0
334 237.0
I am not sure how I could keep only the first value. Thanks in advance.
CodePudding user response:
Keep only the rows whose value is not NaN and whose previous value is NaN:
df = df[df.col1.notna() & df.col1.shift().isna()]
Output:
col1
27 357.0
247 357.0
304 58.0
334 237.0
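To try the mask above without the full dataset, here is a minimal sketch on a small made-up frame. The data is an assumption that only mirrors the shape of the question (runs of values separated by NaN), and the names is_value, prev_is_nan and first_of_run are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: runs of equal values separated by NaN, as in the question.
df = pd.DataFrame(
    {"col1": [np.nan, 357.0, 357.0, np.nan, np.nan, 58.0, 58.0, 58.0, np.nan, 237.0]}
)

# Row holds a value.
is_value = df["col1"].notna()
# shift() moves every value down one row, so this is True where the
# previous row is NaN (or where there is no previous row at all).
prev_is_nan = df["col1"].shift().isna()

# A row starts a run when both conditions hold.
first_of_run = df[is_value & prev_is_nan]
print(first_of_run)
```

Because shift() puts NaN in the first row, a value sitting at the very first index would also be kept as the start of a run.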
Assuming all values are greater than 0 and the first row is NaN (diff() always returns NaN for the first row), we could also do:
df = df.fillna(0).diff()
df = df[df.col1.gt(0)]
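The intermediate steps of the diff variant are easier to follow on a toy frame. The sketch below uses made-up data and illustrative names (filled, steps, starts), and it relies on the same assumptions as the snippet above: positive values, runs separated by at least one NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical sample, not the asker's real data.
df = pd.DataFrame({"col1": [np.nan, 357.0, 357.0, np.nan, 58.0, 58.0, np.nan, 237.0]})

# NaN becomes 0, so the start of each run shows up as a positive jump.
filled = df.fillna(0)
# Row-to-row difference; the first row is NaN because it has no predecessor.
steps = filled.diff()

# Positive jumps mark the first row of each run. The jump equals the
# original value only because the previous (filled) value is 0.
starts = steps[steps["col1"].gt(0)]
print(starts)

# Selecting from the original frame avoids relying on that coincidence.
print(df[steps["col1"].gt(0)])
```

If two different runs touch without a NaN between them, the jump no longer equals the value itself, which is why indexing the original frame with the mask is the safer choice.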
CodePudding user response:
You can take the index of the non-NaN rows and use diff on it to find where each run of consecutive indices starts:
m = (df['col'].dropna()
     .index.to_series()
     .diff().fillna(2).gt(1)
     .reindex(range(df.index.max() + 1))
     .fillna(False))
out = df[m]
print(out)
col
27 357.0
247 357.0
304 58.0
334 237.0
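Broken into steps on a small made-up frame, the index-based idea looks roughly like this. It is a sketch that assumes a default integer index. As a variation on the snippet above, it aligns the mask with reindex(df.index, fill_value=False) rather than reindexing over a range and calling fillna. The names idx, gaps, starts and mask are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with a default RangeIndex.
df = pd.DataFrame(
    {"col": [np.nan, 357.0, 357.0, np.nan, np.nan, 58.0, 58.0, 58.0, np.nan, 237.0]}
)

# Index labels of the rows that hold a value: 1, 2, 5, 6, 7, 9.
idx = df["col"].dropna().index.to_series()

# Distance to the previous surviving label; 1 means "same run".
gaps = idx.diff()

# The first label has no predecessor, so treat it as a run start (gap of 2).
starts = gaps.fillna(2).gt(1)

# Align the mask with the full frame; dropped (NaN) rows become False.
mask = starts.reindex(df.index, fill_value=False)

out = df[mask]
print(out)
```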