I have a data frame like this:
df:
number score
12 NaN
13 NaN
14 NaN
15 NaN
16 10
17 NaN
18 NaN
- I want to filter this data frame from the start to the row where it finds a number in the score column. So, after filtering the data frame should look like this:
new_df:
number score
12 NaN
13 NaN
14 NaN
15 NaN
16 10
- I want to filter this data frame from the row where it finds a number in the score column to the end of the data frame. So, after filtering the data frame should look like this:
new_df:
number score
16 10
17 NaN
18 NaN
How do I filter this data frame? Kindly help
CodePudding user response:
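You can use a reversed cummax with boolean indexing. Reversing the mask back afterwards keeps its index in the same order as df, which avoids pandas' "Boolean Series key will be reindexed" warning:
new_df = df[df['score'].notna()[::-1].cummax()[::-1]]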
Output:
number score
0 12 NaN
1 13 NaN
2 14 NaN
3 15 NaN
4 16 10.0
For the second one, a simple cummax is enough:
new_df = df[df['score'].notna().cummax()]
Output:
number score
4 16 10.0
5 17 NaN
6 18 NaN
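To see why these masks work, here is a minimal self-contained sketch. It assumes the frame from the question is rebuilt by hand, and the names valid, from_first and up_to_last are mine, chosen only for illustration. The reversed mask is flipped back before indexing so that it lines up with df row by row.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (assumed construction)
df = pd.DataFrame({
    "number": [12, 13, 14, 15, 16, 17, 18],
    "score": [np.nan, np.nan, np.nan, np.nan, 10, np.nan, np.nan],
})

# True only where score holds a number
valid = df["score"].notna()

# Forward cummax: False until the first valid score, True from there on
from_first = valid.cummax()

# Reverse cummax: flip, accumulate, then flip back to the original row order
up_to_last = valid[::-1].cummax()[::-1]

head_part = df[up_to_last]   # start of the frame up to the last valid score
tail_part = df[from_first]   # first valid score to the end of the frame

print(head_part)
print(tail_part)
```

If the column held several numbers, from_first would start at the first of them and up_to_last would stop at the last of them.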
CodePudding user response:
You can use pd.Series.first_valid_index and pd.Series.last_valid_index like this:
df.loc[df['score'].first_valid_index():]
Output:
number score
4 16 10.0
5 17 NaN
6 18 NaN
And, for the rows from the start up to the number:
df.loc[:df['score'].last_valid_index()]
Output:
number score
0 12 NaN
1 13 NaN
2 14 NaN
3 15 NaN
4 16 10.0
And, if you want to clip both the leading and the trailing NaN rows, you can combine the two:
df.loc[df['score'].first_valid_index():df['score'].last_valid_index()]
Output:
number score
4 16 10.0
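One edge case worth keeping in mind: first_valid_index and last_valid_index return None when the column has no valid values, and a .loc slice with None bounds selects the whole frame. Below is a rough sketch of a guard for that case. The helper name trim_to_valid and the choice to return an empty frame are my assumptions, not part of the answer above.

```python
import numpy as np
import pandas as pd

def trim_to_valid(frame, column):
    """Hypothetical helper: keep rows from the first to the last non-NaN value of `column`."""
    first = frame[column].first_valid_index()
    last = frame[column].last_valid_index()
    if first is None:
        # No valid values at all: return an empty frame with the same columns
        # (an assumed policy; returning the frame unchanged is another option)
        return frame.iloc[0:0]
    # .loc slicing is label-based and includes both endpoints
    return frame.loc[first:last]

# Sample frame rebuilt from the question (assumed construction)
df = pd.DataFrame({
    "number": [12, 13, 14, 15, 16, 17, 18],
    "score": [np.nan, np.nan, np.nan, np.nan, 10, np.nan, np.nan],
})

# Frame whose score column is entirely NaN, to exercise the guard
all_nan = pd.DataFrame({"number": [1, 2], "score": [np.nan, np.nan]})

trimmed = trim_to_valid(df, "score")
empty = trim_to_valid(all_nan, "score")

print(trimmed)
print(len(empty))
```

Without the guard, df.loc[None:None] would be equivalent to df.loc[:] and silently keep every row.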