Is there a way to get the data after a specific condition in Pandas?

I want to know if there is a way to take data from a dataframe starting at a specific condition, and keep taking it until another condition is met. I have the following dataframe:

    column_1 column_2
0          1        a
1          1        a
2          1        b
3          4        b
4          4        c
5          4        c
6          0        d
7          0        d
8          0        e
9          4        e
10         4        f
11         4        f
12         1        g
13         1        g

I want to select only the rows where column_1 changes from 1 to 4, and keep selecting rows while it stays 4, until it changes to another value, as follows:

    column_1 column_2
3          4        b
4          4        c
5          4        c

Is there a way to do this in Pandas without converting the columns to lists?

CodePudding user response:

You can first create a helper column that labels each run of consecutive equal values. Then find the rows where the previous value is 1 and the current value is 4, and take their group labels. Finally, keep all rows whose label is among them, which selects every row of each run of 4s that directly follows a 1:

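# Label each run of consecutive equal values with its own group id
df['new'] = df['column_1'].ne(df['column_1'].shift()).cumsum()
# Group ids of the runs that start with a 1 -> 4 change
s = df.loc[df['column_1'].shift().eq(1) & df['column_1'].eq(4), 'new']

# Keep every row that belongs to one of those runs
df = df[df['new'].isin(s)]
print(df)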
   column_1 column_2  new
3         4        b    2
4         4        c    2
5         4        c    2
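
For anyone who wants to try this without adding a helper column to the original dataframe, here is a minimal self-contained sketch of the same idea. The function name rows_after_change and its parameters are my own, purely for illustration; the sample data is rebuilt from the question.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "column_1": [1, 1, 1, 4, 4, 4, 0, 0, 0, 4, 4, 4, 1, 1],
    "column_2": list("aabbccddeeffgg"),
})


def rows_after_change(frame, col, before, after):
    """Hypothetical helper: return the runs of `after` that directly follow `before`."""
    values = frame[col]
    # A new run starts wherever the value differs from the previous row
    run_id = values.ne(values.shift()).cumsum()
    # Rows where the value switches from `before` to `after`
    starts = values.shift().eq(before) & values.eq(after)
    # Keep every row whose run id matches a run that begins at such a switch
    return frame[run_id.isin(run_id[starts])]


result = rows_after_change(df, "column_1", 1, 4)
print(result)
```

On the sample data this should return rows 3 to 5 only. The second run of 4s (rows 9 to 11) is skipped because it follows a 0, not a 1.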

CodePudding user response:

Another option is to flag the change points using shift and eq, then use groupby.cummax to spread that flag across each run of equal values, which gives a boolean filter:

df[(df['column_1'].shift().eq(1) & df['column_1'].eq(4)).groupby(df['column_1'].diff().ne(0).cumsum()).cummax()]

Output:

   column_1 column_2
3         4        b
4         4        c
5         4        c
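
The one-liner is easier to follow when split into steps. Below is a rough sketch of the same approach with intermediate variables; the names is_start, run_id and mask are mine, and the sample data is rebuilt from the question.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "column_1": [1, 1, 1, 4, 4, 4, 0, 0, 0, 4, 4, 4, 1, 1],
    "column_2": list("aabbccddeeffgg"),
})

col = df["column_1"]

# True only on the row where the value switches from 1 to 4
is_start = col.shift().eq(1) & col.eq(4)

# Run id: increases by one each time the value changes
# (the first row has a NaN diff, which also counts as "not equal to 0")
run_id = col.diff().ne(0).cumsum()

# Within each run, carry the True flag forward from the first row to the rest;
# astype(bool) is only a defensive cast in case the grouped result is not boolean
mask = is_start.groupby(run_id).cummax().astype(bool)

selected = df[mask]
print(selected)
```

Note that diff assumes a numeric column. If column_1 held strings, the ne(shift()) comparison from the other answer would be the safer way to build the run ids.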