How can i remove consecutive pairs of equal numbers with opposite signs from a Pandas dataframe?
Assuming i have this input dataframe
incremental_changes = [2, -2, 2, 1, 4, 5, -5, 7, -6, 6]
df = pd.DataFrame({
'idx': range(len(incremental_changes)),
'incremental_changes': incremental_changes
})
idx incremental_changes
0 0 2
1 1 -2
2 2 2
3 3 1
4 4 4
5 5 5
6 6 -5
7 7 7
8 8 -6
9 9 6
I would like to get the following
idx incremental_changes
0 0 2
3 3 1
4 4 4
7 7 7
Note that the first 2
could either be idx 0 or 2, it doesn't really matter.
Thanks
CodePudding user response:
Can groupby consecutive equal numbers and transform
import itertools
def remove_duplicates(s):
''' Generates booleans that indicate when a pair of ints with
opposite signs are found.
'''
iter_ = iter(s)
for (a,b) in itertools.zip_longest(iter_, iter_):
if b is None:
yield False
else:
yield a b == 0
yield a b == 0
>>> mask = df.groupby(df['incremental_changes'].abs().diff().ne(0).cumsum()) \
['incremental_changes'] \
.transform(remove_duplicates)
Then
>>> df[~mask]
idx incremental_changes
2 2 2
3 3 1
4 4 4
7 7 7
CodePudding user response:
Just do rolling
, then we filter the multiple combine
s = df.incremental_changes.rolling(2).sum()
s = s.mask(s[s==0].groupby(s.ne(0).cumsum()).cumcount()==1)==0
df[~(s | s.shift(-1))]
Out[640]:
idx incremental_changes
2 2 2
3 3 1
4 4 4
7 7 7