How to filter a pandas dataframe till it finds a value in NaN column?

Time:07-19

I have a data frame like this:

df:
number score
12      NaN
13      NaN
14      NaN
15      NaN
16      10
17      NaN
18      NaN
  1. I want to filter this data frame from the start to the row where it finds a number in the score column. So, after filtering the data frame should look like this:
new_df:
number    score
12        NaN
13        NaN
14        NaN
15        NaN
16        10
  2. I want to filter this data frame from the row where it finds a number in the score column to the end of the data frame. So, after filtering the data frame should look like this:
new_df:
number   score
16       10
17        NaN
18        NaN

How do I filter this data frame? Kindly help.

CodePudding user response:

For the first case, you can use a reversed cummax and boolean indexing:

new_df = df[df['score'].notna()[::-1].cummax()]

Output:

   number  score
0      12    NaN
1      13    NaN
2      14    NaN
3      15    NaN
4      16   10.0

For the second one, a simple cummax:

new_df = df[df['score'].notna().cummax()]

Output:

   number  score
4      16   10.0
5      17    NaN
6      18    NaN
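
A minimal, self-contained sketch of both masks, assuming the data frame from the question is rebuilt by hand (the names has_value, up_to_last and from_first are illustrative, not from the answer). The reversed mask is flipped back with a second [::-1] so that its index order matches df; the result is the same as above, and the mask can be inspected row by row:

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (assumed default integer index).
df = pd.DataFrame({
    "number": [12, 13, 14, 15, 16, 17, 18],
    "score": [np.nan, np.nan, np.nan, np.nan, 10, np.nan, np.nan],
})

# True where score holds a value, False where it is NaN.
has_value = df["score"].notna()

# Forward cummax: False until the first value, True from there on.
from_first = has_value.cummax()

# Reversed cummax, flipped back to the original row order:
# True up to and including the last value, False afterwards.
up_to_last = has_value[::-1].cummax()[::-1]

head = df[up_to_last]   # rows from the start through the value
tail = df[from_first]   # rows from the value through the end

print(head)
print(tail)
```

Note that the reversed cummax keeps rows up to the last non-NaN score. With a single value, as in the question, that is the same row as the first one.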

CodePudding user response:

You can use pd.Series.last_valid_index and pd.Series.first_valid_index like this:

df.loc[df['score'].first_valid_index():]

Output:

   number  score
4      16   10.0
5      17    NaN
6      18    NaN

And, for the rows from the start up to the value:

df.loc[:df['score'].last_valid_index()]

Output:

   number  score
0      12    NaN
1      13    NaN
2      14    NaN
3      15    NaN
4      16   10.0

And, if you want to clip both the leading and the trailing NaN rows, you can combine the two:

df.loc[df['score'].first_valid_index():df['score'].last_valid_index()]

Output:

   number  score
4      16   10.0
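
When the score column holds more than one value, "up to the first value" and "up to the last value" are no longer the same slice. The sketch below uses a hypothetical frame df2 with two scores (not from the question) to show the difference. The guard for an all-NaN column is also an assumption about how you might want to handle that case, since both methods return None when no valid value exists:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with two non-NaN scores, for illustration only.
df2 = pd.DataFrame({
    "number": [12, 13, 14, 15, 16],
    "score": [np.nan, 10, np.nan, 20, np.nan],
})

first = df2["score"].first_valid_index()  # index label of the first non-NaN score
last = df2["score"].last_valid_index()    # index label of the last non-NaN score

if first is None:
    # No valid score at all: decide explicitly what to return.
    up_to_first = up_to_last = from_first = df2.iloc[0:0]
else:
    # .loc slices by label and includes both endpoints.
    up_to_first = df2.loc[:first]  # start through the first value
    up_to_last = df2.loc[:last]    # start through the last value
    from_first = df2.loc[first:]   # first value through the end

print(up_to_first)
print(up_to_last)
print(from_first)
```

For the question as worded ("till it finds a value"), df.loc[:first] is probably the closer match when several values are present.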