Picking only the first value equal to 1 in each time period

I have a DataFrame whose index contains dates and whose values are 0 or 1. I need to pick the first 1 in each of several time periods. For example:

2019-11-27    0
2019-11-29    0
2019-12-02    0
2019-12-03    1
2019-12-04    1
2019-12-05    1
2020-06-01    0
2020-06-02    0
2020-06-03    1
2020-06-04    1
2020-06-05    1

From the beginning to 2020-06-03 I want to get:

2019-12-03    1

Then, in the new period that follows the previous one, I need the first row with a value of 1:

2020-06-04    1

I also have a column with the values 0 or 1, where 1 marks the beginning of a period.

CodePudding user response：

You could try something that counts the 1s, restarts the count whenever it hits a 0, and saves the index of the next 1.
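
A rough sketch of that counting idea, using the values from the question as a plain Series (the names `s`, `run_length` and `first_ones` are made up for illustration):

```python
import pandas as pd

# Values from the question, indexed by date.
s = pd.Series(
    [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1],
    index=pd.to_datetime([
        "2019-11-27", "2019-11-29", "2019-12-02", "2019-12-03",
        "2019-12-04", "2019-12-05", "2020-06-01", "2020-06-02",
        "2020-06-03", "2020-06-04", "2020-06-05",
    ]),
)

first_ones = []   # index labels of the first 1 in each run of 1s
run_length = 0    # number of consecutive 1s seen so far

for date, value in s.items():
    if value == 1:
        run_length += 1
        if run_length == 1:
            # First 1 after a 0 (or at the very start of the data).
            first_ones.append(date)
    else:
        # A 0 restarts the count.
        run_length = 0

print(s.loc[first_ones])
```

This loops in pure Python, so it will be slow on large frames, and it splits on runs of 1s rather than on the period column mentioned in the question.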

CodePudding user response：

If I understand you correctly, you want to get the first "1" of each group of ones.

For example, with a dataframe like:

       dates  val
0 2019-11-29    0
1 2019-12-02    0
2 2019-12-03    1
3 2019-12-04    1
4 2019-12-05    1
5 2020-06-01    0
6 2020-06-02    0
7 2020-06-03    1
8 2020-06-04    1
9 2020-06-05    1


# True where val is 1 and differs from the previous row's val
mask = df["val"].ne(df["val"].shift(1)) & df["val"].eq(1)
df.loc[mask]

Gets you:

       dates  val
2 2019-12-03    1
7 2020-06-03    1

We get this by shifting the column one step forward and checking for differences. Any row whose value differs from the previous row's value is matched: the cases where 1 is the value and 0 is the previous value, or where 0 is the value and 1 is the previous value. The first row is also matched, because its shifted value is NaN.

Then, we add the other condition that the current row must be equal to 1, which keeps only the rows where a 1 first appears after a 0 (or at the very start of the data).
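
Note that this mask looks only at runs of 1s, so for the second group it returns 2020-06-03 rather than the 2020-06-04 the question asks for. If the periods are really defined by the second column mentioned in the question, something along these lines might work. This is only a sketch: the column name `period_start` is made up, and I am assuming a new period starts on the first row and again on 2020-06-04, which is what the expected output suggests.

```python
import pandas as pd

# Data from the question. `period_start` is a hypothetical name for the
# asker's second column (1 marks the first row of a new period); the
# positions of its 1s are assumed from the expected output.
df = pd.DataFrame(
    {
        "val":          [0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1],
        "period_start": [1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    },
    index=pd.to_datetime([
        "2019-11-27", "2019-11-29", "2019-12-02", "2019-12-03",
        "2019-12-04", "2019-12-05", "2020-06-01", "2020-06-02",
        "2020-06-03", "2020-06-04", "2020-06-05",
    ]),
)

# A running count of period starts gives every row a period id.
period_id = df["period_start"].cumsum()

# Keep only the rows where val is 1, then take the first one per period.
ones = df[df["val"].eq(1)]
first_ones = ones.groupby(period_id.loc[ones.index]).head(1)

print(first_ones)
```

With the assumed period boundaries this picks 2019-12-03 for the first period and 2020-06-04 for the second. A period that contains no 1 at all simply produces no row.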