I have a DataFrame like this:
Index | Time | Id |
---|---|---|
0 | 10:10:00 | 11 |
1 | 10:10:01 | 12 |
2 | 10:10:02 | 12 |
3 | 10:10:04 | 12 |
4 | 10:10:06 | 13 |
5 | 10:10:07 | 13 |
6 | 10:10:08 | 11 |
7 | 10:10:10 | 11 |
8 | 10:10:12 | 11 |
9 | 10:10:14 | 13 |
I want to compare the Id column within each pair of consecutive rows: rows 0 and 1, rows 2 and 3, etc.
In other words, I want to compare each even row with the odd row that follows it and keep only the pairs whose two rows have the same Id.
My ideal output would be:
Index | Time | Id |
---|---|---|
2 | 10:10:02 | 12 |
3 | 10:10:04 | 12 |
4 | 10:10:06 | 13 |
5 | 10:10:07 | 13 |
6 | 10:10:08 | 11 |
7 | 10:10:10 | 11 |
I tried this, but it did not work:
df = df[
df[::2]["id"] ==df[1::2]["id"]
]
CodePudding user response:
You can use a GroupBy.transform approach:
import numpy as np

# label consecutive pairs of rows (0, 0, 1, 1, ...); for each pair, is there only one kind of Id?
out = df[df.groupby(np.arange(len(df)) // 2)['Id'].transform('nunique').eq(1)]
Or, more efficiently, using the underlying NumPy array (this assumes the DataFrame has an even number of rows, so that every row belongs to a pair):
# convert to numpy
a = df['Id'].to_numpy()
# are the odds equal to evens?
out = df[np.repeat((a[::2]==a[1::2]), 2)]
Output:
Index Time Id
2 2 10:10:02 12
3 3 10:10:04 12
4 4 10:10:06 13
5 5 10:10:07 13
6 6 10:10:08 11
7 7 10:10:10 11
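
For anyone who wants to try both approaches end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question (without the separate Index column, relying on the default integer index instead) and adds a small guard for a trailing unpaired row. That guard, along with the names `pair`, `n`, `mask`, `out_groupby` and `out_numpy`, is my own assumption and not part of the snippets above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (default RangeIndex 0..9)
df = pd.DataFrame({
    "Time": ["10:10:00", "10:10:01", "10:10:02", "10:10:04", "10:10:06",
             "10:10:07", "10:10:08", "10:10:10", "10:10:12", "10:10:14"],
    "Id": [11, 12, 12, 12, 13, 13, 11, 11, 11, 13],
})

# Pair labels by position: rows 0-1 -> 0, rows 2-3 -> 1, and so on
pair = np.arange(len(df)) // 2

# Approach 1: keep the pairs that contain exactly one distinct Id
out_groupby = df[df.groupby(pair)["Id"].transform("nunique").eq(1)]

# Approach 2: compare even-positioned and odd-positioned values directly
a = df["Id"].to_numpy()
n = len(a) - len(a) % 2              # assumed guard: ignore a trailing unpaired row
mask = np.zeros(len(a), dtype=bool)  # rows outside a full pair stay False
mask[:n] = np.repeat(a[:n:2] == a[1:n:2], 2)
out_numpy = df[mask]

print(out_numpy)
```

Both results keep rows 2 through 7 for this data. Note that the two approaches differ when the number of rows is odd: the groupby version treats the last row as a group of one, which has a single unique Id and is therefore kept, whereas the guarded NumPy version drops it.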