I have a dataframe that looks like the following (actually, this is the abstracted result of a calculation):
import pandas as pd
data = {"A":[i for i in range(10)]}
index = [1, 3, 4, 5, 9, 10, 12, 13, 15, 20]
df = pd.DataFrame(index=index, data=data)
print(df)
yields:
A
1 0
3 1
4 2
5 3
9 4
10 5
12 6
13 7
15 8
20 9
Now I want to filter the index so that only the first value of each group of consecutive values is kept, e.g. the following result:
A
1 0
3 1
9 4
12 6
15 8
20 9
Any hints on how to achieve this efficiently?
CodePudding user response:
Use Series.diff (not implemented for Index in older pandas versions, so convert the index to a Series first) and compare the result for not equal to 1:
df = df[df.index.to_series().diff().ne(1)]
print(df)
A
1 0
3 1
9 4
12 6
15 8
20 9
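To see why this works, here is a minimal sketch with the intermediate steps split out. The helper name first_of_runs is my own for illustration and is not part of pandas; it assumes an integer index sorted in increasing order.

```python
import pandas as pd

def first_of_runs(frame: pd.DataFrame) -> pd.DataFrame:
    """Keep rows whose index label does not directly follow the previous label."""
    # Gap to the previous label; NaN for the very first row.
    steps = frame.index.to_series().diff()
    # True where a new run starts (gap is not exactly 1; NaN also counts as "not 1").
    mask = steps.ne(1)
    return frame[mask]

df = pd.DataFrame({"A": range(10)}, index=[1, 3, 4, 5, 9, 10, 12, 13, 15, 20])
result = first_of_runs(df)
print(result.index.tolist())
```

Because the first row's diff is NaN and NaN is never equal to 1, the first row is always kept without any special handling.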
CodePudding user response:
Try this one:
import numpy as np
df.iloc[np.unique(np.array(index)-np.arange(len(index)), return_index=True)[1]]
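The idea behind the one-liner: within a run of consecutive integers, subtracting the row position (0, 1, 2, ...) gives the same constant for every member of the run, so np.unique with return_index=True returns the position of the first row of each run. A sketch of the same approach that reads the labels from df.index instead of the separate index list, assuming a strictly increasing integer index (the variable names are mine):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": range(10)}, index=[1, 3, 4, 5, 9, 10, 12, 13, 15, 20])

labels = df.index.to_numpy()
# Label minus position is constant inside a run: [1 2 2 2 5 5 6 6 7 11]
offsets = labels - np.arange(len(labels))
# return_index=True gives the position of the first occurrence of each distinct offset.
_, first_pos = np.unique(offsets, return_index=True)
result = df.iloc[first_pos]
print(result.index.tolist())
```

If the index is not sorted, two unrelated runs could produce the same offset, so sort the frame first in that case.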
CodePudding user response:
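Try this (label each run of consecutive index values with a group id, then take the first index value of each group):
s = df.index.to_series()
df.loc[s.groupby(s.diff().ne(1).cumsum()).first().to_numpy()]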