I have a data frame like the one below, with a score for every index i.
i score
5 3.0
6 3.0
7 3.0
8 11.0
9 11.0
15 10.0
30 1.0
31 1.0
32 1.0
10 8.0
11 8.0
20 1.0
21 1.0
22 1.0
I would like to collapse consecutive rows that share an identical score into a single row giving the first and last index of the run. The expected result would look like this:
start end score
5 7 3.0
8 9 11.0
15 15 10.0
30 32 1.0
10 11 8.0
20 22 1.0
CodePudding user response:
You can group by consecutive values and aggregate to find the endpoints. This also works for single-row groups, where the start and end are the same.
df.groupby(df["score"].ne(df["score"].shift()).cumsum()).agg(
start=("i", "first"), end=("i", "last"), score=("score", "first")
)
       start  end  score
score
1          5    7    3.0
2          8    9   11.0
3         15   15   10.0
4         30   32    1.0
5         10   11    8.0
6         20   22    1.0
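To make the one-liner above easier to follow, here is the same idea split into steps, as a self-contained sketch. The frame is rebuilt from the data in the question; the names is_new_run, run_id and result are only illustrative, and the reset_index(drop=True) call is an optional extra (not part of the answer above) that drops the group label from the index.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "i": [5, 6, 7, 8, 9, 15, 30, 31, 32, 10, 11, 20, 21, 22],
    "score": [3.0, 3.0, 3.0, 11.0, 11.0, 10.0, 1.0, 1.0, 1.0,
              8.0, 8.0, 1.0, 1.0, 1.0],
})

# True wherever the score differs from the previous row. The first row is
# always True because shift() leaves NaN in the first position.
is_new_run = df["score"].ne(df["score"].shift())

# Running count of run starts: all rows in one consecutive run share a label.
run_id = is_new_run.cumsum()

# Take the first and last index of each run, plus the run's score.
result = (
    df.groupby(run_id)
    .agg(start=("i", "first"), end=("i", "last"), score=("score", "first"))
    .reset_index(drop=True)  # optional: discard the run label
)
print(result)
```

Because the label comes from comparing each row with its neighbour, the two separate runs of 1.0 stay separate instead of being merged, which is what the question asks for.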
CodePudding user response:
You don't need a magic trick here. A plain for loop will do, with the fewest headaches, I would say.
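This answer gives no code, so the following is only one possible sketch of such a loop, assuming the two columns are available as plain lists (for example via df["i"].tolist() and df["score"].tolist()). The helper name collapse_runs is hypothetical.

```python
def collapse_runs(indices, scores):
    """Collapse consecutive equal scores into (start, end, score) tuples."""
    runs = []
    for i, score in zip(indices, scores):
        if runs and runs[-1][2] == score:
            # Same score as the previous row: move the current run's end.
            start, _, _ = runs[-1]
            runs[-1] = (start, i, score)
        else:
            # Score changed (or first row): open a new run of length one.
            runs.append((i, i, score))
    return runs


# Sample data copied from the question.
indices = [5, 6, 7, 8, 9, 15, 30, 31, 32, 10, 11, 20, 21, 22]
scores = [3.0, 3.0, 3.0, 11.0, 11.0, 10.0, 1.0, 1.0, 1.0,
          8.0, 8.0, 1.0, 1.0, 1.0]

runs = collapse_runs(indices, scores)
for start, end, score in runs:
    print(start, end, score)
```

If a data frame is needed afterwards, the list of tuples can be passed to pd.DataFrame with columns=["start", "end", "score"].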