I want to get all the users from a dataframe where a specific column goes from 1 to 0.
For example, with the following dataframe I want to keep only users 1 and 2, as their values go from 1 to 0.
Relevant rows
- Row 6 to 7 for user 1
- Row 9 to 10 for user 2
    user  value
0      0      0
1      0      0
2      0      1
3      0      1
4      1      0
5      1      1
6      1      1
7      1      0
8      2      1
9      2      1
10     2      0
11     2      0
Desired Result
    user  value
4      1      0
5      1      1
6      1      1
7      1      0
8      2      1
9      2      1
10     2      0
11     2      0
I have tried window functions and conditions but for some reason I cannot get the desired result.
CodePudding user response:
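Let us try cummax. Within each user, a row whose value differs from the running maximum must come after a 1, so any user with such a row went from 1 to 0: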
df.loc[df.user.isin(df.loc[df.value != df.groupby('user')['value'].cummax(),'user'])]
Out[769]:
    user  value
4      1      0
5      1      1
6      1      1
7      1      0
8      2      1
9      2      1
10     2      0
11     2      0
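To make the one-liner easier to follow, here is a step-by-step sketch of the same idea. The intermediate names (`running_max`, `dropped`, `users_with_drop`) are only illustrative, and the DataFrame is rebuilt from the question's sample data.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "user":  [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
    "value": [0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0],
})

# Running maximum of `value` within each user.
running_max = df.groupby("user")["value"].cummax()

# A row that differs from its running maximum sits below it,
# which for 0/1 data means a 1 appeared earlier for that user.
dropped = df["value"] != running_max

# Users with at least one such row.
users_with_drop = df.loc[dropped, "user"].unique()

# Keep every row belonging to those users.
result = df[df["user"].isin(users_with_drop)]
print(result)
```

With 0/1 data, a 1 followed later by a 0 always implies a direct 1 to 0 step somewhere in between, so this selects the same users as checking consecutive rows.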
CodePudding user response:
You can use GroupBy.filter. If any diff (difference of successive values) is equal to -1 (0-1), keep the group.
df.groupby('user').filter(lambda g: g['value'].diff().eq(-1).any())
NB. This assumes the column contains only 0s and 1s. If other values can occur, a diff of -1 is no longer enough and you need to test both conditions explicitly: (g['value'].eq(1) & g['value'].shift(-1).eq(0)).any()
output:
    user  value
4      1      0
5      1      1
6      1      1
7      1      0
8      2      1
9      2      1
10     2      0
11     2      0
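As a rough illustration of the note about non-binary values, the sketch below applies both predicates to a small made-up frame (`other`, with hypothetical users 7 and 8). The helper name `has_one_to_zero` is an assumption for this example, not part of pandas.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "user":  [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
    "value": [0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 0],
})

def has_one_to_zero(g):
    """True if some row is 1 and the next row in the same group is 0."""
    v = g["value"]
    return (v.eq(1) & v.shift(-1).eq(0)).any()

# On the 0/1 data this keeps users 1 and 2, like the diff-based filter.
result = df.groupby("user").filter(has_one_to_zero)

# Hypothetical non-binary data: user 7 goes 3 -> 2, user 8 goes 1 -> 0.
other = pd.DataFrame({"user": [7, 7, 8, 8], "value": [3, 2, 1, 0]})

# diff() == -1 matches both users, because 3 -> 2 is also a step of -1.
by_diff = other.groupby("user").filter(lambda g: g["value"].diff().eq(-1).any())

# The explicit pair check matches only the real 1 -> 0 transition.
by_pair = other.groupby("user").filter(has_one_to_zero)

print(by_diff["user"].unique())
print(by_pair["user"].unique())
```

Because the function is applied to one group at a time, `shift(-1)` never looks into the next user's rows.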
CodePudding user response:
You can use shift to check if the previous value is 1 (df.groupby('user')['value'].shift(1).eq(1)), and combine that with a mask checking if the current value is 0 (df.value.eq(0)). Shifting within each user keeps the first row of one user from being compared with the last row of the previous user. Then, group by 'user' and transform('any') to create the appropriate mask:
filtered = df[(df.value.eq(0) & df.groupby('user')['value'].shift(1).eq(1)).groupby(df.user).transform('any')]
Output:
>>> filtered
    user  value
4      1      0
5      1      1
6      1      1
7      1      0
8      2      1
9      2      1
10     2      0
11     2      0
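One thing to watch with shift: applied to the whole column, it pairs the first row of each user with the last row of the user before it. The sketch below compares a whole-column shift with a per-user shift on a small made-up frame (`df2` is hypothetical, chosen so that neither user actually goes from 1 to 0).

```python
import pandas as pd

# Hypothetical data: user 0 ends on 1, user 1 starts on 0 and stays at 0.
df2 = pd.DataFrame({"user": [0, 0, 1, 1], "value": [0, 1, 0, 0]})

is_zero = df2["value"].eq(0)

# Whole-column shift: row 2 (user 1) sees user 0's last value as "previous".
prev_global = df2["value"].shift(1)
mask_global = (is_zero & prev_global.eq(1)).groupby(df2["user"]).transform("any")

# Per-user shift: the first row of each user has no previous value (NaN).
prev_by_user = df2.groupby("user")["value"].shift(1)
mask_by_user = (is_zero & prev_by_user.eq(1)).groupby(df2["user"]).transform("any")

kept_global = df2[mask_global]    # wrongly keeps user 1
kept_by_user = df2[mask_by_user]  # keeps nothing

print(kept_global)
print(kept_by_user)
```

On the question's sample data both variants give the same result, since user 1 has a genuine 1 to 0 transition anyway. The difference only shows up when a user's rows start with 0 right after another user's rows end with 1.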