Pandas: Keep only first occurrence of value in group of consecutive values

Time:11-26

I have a dataframe that looks like the following (actually, this is the abstracted result of a calculation):

import pandas as pd


data = {"A":[i for i in range(10)]}
index = [1, 3, 4, 5, 9, 10, 12, 13, 15, 20]
df = pd.DataFrame(index=index, data=data)
print(df)

yields:

    A
1   0
3   1
4   2
5   3
9   4
10  5
12  6
13  7
15  8
20  9

Now I want to filter the rows so that only the first index value of each run of consecutive index values is kept, e.g. the following result:

    A
1   0
3   1
9   4
12  6
15  8
20  9

Any hints on how to achieve this efficiently?

CodePudding user response:

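Use Series.diff. It was not implemented for Index at the time of writing, so convert the index to a Series with to_series(), take the difference to the previous value, and keep the rows where that difference is not equal to 1: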

df = df[df.index.to_series().diff().ne(1)]
print(df)
    A
1   0
3   1
9   4
12  6
15  8
20  9
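
To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps. The intermediate names (steps, starts_run, first_of_run) are only illustrative and do not appear in the answer, and the sketch assumes an integer index sorted in increasing order, as in the question.

```python
import pandas as pd

df = pd.DataFrame({"A": list(range(10))},
                  index=[1, 3, 4, 5, 9, 10, 12, 13, 15, 20])

# Difference between each index label and the previous one.
# The first entry has no predecessor, so it is NaN.
steps = df.index.to_series().diff()

# True where a new run starts, i.e. the gap to the previous label is not 1.
# NaN is also "not equal to 1", so the very first row is kept.
starts_run = steps.ne(1)

# Boolean indexing keeps only the rows that start a run.
first_of_run = df[starts_run]
print(first_of_run.index.tolist())  # [1, 3, 9, 12, 15, 20]
```

Assigning the result to a new name instead of overwriting df keeps the original frame available for comparison.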

CodePudding user response:

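Try this one. Subtracting each row's position from its index value gives a number that stays constant within a run of consecutive values, so np.unique with return_index=True returns the position of the first row of each run:

import numpy as np

df.iloc[np.unique(df.index.to_numpy() - np.arange(len(df)), return_index=True)[1]]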
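
The NumPy approach is easier to check when the intermediate array is written out. The following is a sketch under the assumption that the index holds strictly increasing integers; the names labels, run_key and first_pos are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": list(range(10))},
                  index=[1, 3, 4, 5, 9, 10, 12, 13, 15, 20])

labels = df.index.to_numpy()

# Within a run of consecutive integers, label minus position is constant:
# [1, 2, 2, 2, 5, 5, 6, 6, 7, 11] for the index above.
run_key = labels - np.arange(len(labels))

# np.unique returns the sorted unique keys; with return_index=True it also
# returns the position of the first occurrence of each key.
_, first_pos = np.unique(run_key, return_index=True)

result = df.iloc[first_pos]
print(result.index.tolist())  # [1, 3, 9, 12, 15, 20]
```

If the index were not strictly increasing, two separate runs could end up with the same key and be merged, so this variant relies on that ordering.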

CodePudding user response:

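Try this: group the rows by a run label (a cumulative count of the positions where the index does not follow its predecessor by exactly 1) and keep the first row of each group. Grouping by 'A' would group by the column values, not by runs of the index: df.groupby(df.index.to_series().diff().ne(1).cumsum()).head(1)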
