How can I remove duplicates only when they repeat themselves in the next row, using pandas?

Time:10-03

My question is a little confusing, so it's easier to show what my input and desired output look like. I've been working on it for a while, but I keep reaching a dead end.

Input:

A B
1 a
2 a
3 b
4 b
5 c
6 c
7 a
8 a
9 b
10 c

Output:

A B
1 a
3 b
5 c
7 a
9 b
10 c

Answer:

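You need to group consecutive equal values, the way itertools.groupby does. To do that in pandas, check whether each element differs from the previous one, then take a running sum of those flags so that every run of equal values gets its own group id. The building blocks are pd.Series.shift, pd.Series.ne and pd.Series.cumsum.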

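# Give each run of consecutive equal values in B its own group id.
grps = df['B'].ne(df['B'].shift()).cumsum()
# Keep the first row of every run.
df.groupby(grps).first()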

    A  B
B       
1   1  a
2   3  b
3   5  c
4   7  a
5   9  b
6  10  c
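
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the input in the question, and the names starts, first_per_run and deduped are just illustrative. It also shows a variant that is not part of the answer above: filtering with the boolean mask directly, which keeps A and B as ordinary columns instead of moving the group id into the index.

```python
import pandas as pd

# Input from the question, rebuilt by hand.
df = pd.DataFrame({
    "A": range(1, 11),
    "B": list("aabbccaabc"),
})

# True wherever B differs from the previous row. The first row is always
# True because shift() leaves a missing value there.
starts = df["B"].ne(df["B"].shift())

# Running sum of the flags: one id per run of equal values.
grps = starts.cumsum()

# Approach from the answer: first row of each run, indexed by group id.
first_per_run = df.groupby(grps).first()

# Variant (an assumption about what layout you want): keep only the rows
# that start a run, so A and B stay as regular columns.
deduped = df[starts].reset_index(drop=True)

print(first_per_run)
print(deduped)
```

Both approaches pick the same rows here. The groupby version is the more flexible one if you later want something other than the first row of each run (for example last() or an aggregate), while the mask version is shorter when the first row is all you need.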