I have a pandas DataFrame df as follows:
Date Price
2020-01-01 0
2020-01-02 0
2020-01-03 0
2020-01-04 5
2020-01-05 5
2020-01-06 0
2020-01-07 5
2020-01-08 5
2020-01-09 0
2020-01-10 0
I want to drop all the zeroes before the first non-zero value and all the zeroes after the last non-zero value.
The resulting df would look like:
Date Price
2020-01-04 5
2020-01-05 5
2020-01-06 0
2020-01-07 5
2020-01-08 5
Please note that the 0 value on 2020-01-06 is preserved, as it has a non-zero value both before and after it.
How can I do this in pandas? Looping seems very inefficient, as I have a large df.
CodePudding user response:
You could use ne and cummax twice: once to cut off the 0s at the top, then again on the reversed Price column to cut off the 0s at the bottom:
out = df[df['Price'].ne(0).cummax() & df['Price'][::-1].ne(0).cummax()]
Output:
Date Price
3 2020-01-04 5
4 2020-01-05 5
5 2020-01-06 0
6 2020-01-07 5
7 2020-01-08 5
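
To make the one-liner easier to follow, here is a minimal, self-contained sketch that rebuilds the sample data from the question and splits the mask into named steps. The intermediate names (nonzero, after_first, before_last) are only illustrative and are not part of the original answer. The reversed mask is flipped back to the original order before combining, so the two masks line up without relying on index alignment:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Date": pd.date_range("2020-01-01", periods=10, freq="D"),
    "Price": [0, 0, 0, 5, 5, 0, 5, 5, 0, 0],
})

# True wherever Price is non-zero
nonzero = df["Price"].ne(0)

# Scanning top-down: becomes True at the first non-zero row and stays True
after_first = nonzero.cummax()

# Scanning bottom-up: becomes True at the last non-zero row and stays True
# for every row above it; [::-1] at the end restores the original row order
before_last = nonzero[::-1].cummax()[::-1]

# Keep rows that lie between the first and last non-zero values (inclusive)
out = df[after_first & before_last]
print(out)
```

If the Price column contains no non-zero values at all, both masks are False everywhere, so this sketch returns an empty DataFrame.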