convert rows of consecutive numbers in a column into two columns with start and end information (pan

Time:06-21

I have a data frame like the one below, with a score for every index i.

i   score
5   3.0
6   3.0
7   3.0
8   11.0
9   11.0
15  10.0
30  1.0
31  1.0
32  1.0
10  8.0
11  8.0
20  1.0
21  1.0
22  1.0

I would like to collapse each run of consecutive rows with an identical score into a single row that records the first and last i of the run. The expected result would look like this:

start   end   score
5       7     3.0
8       9     11.0
15      15    10.0
30      32    1.0
10      11    8.0
20      22    1.0

CodePudding user response:

You can group by runs of consecutive equal scores and aggregate to find the endpoints. This also works for single-row groups, where the start and end are the same.

df.groupby(df["score"].ne(df["score"].shift()).cumsum()).agg(
    start=("i", "first"), end=("i", "last"), score=("score", "first")
)


       start  end  score
score
1          5    7    3.0
2          8    9   11.0
3         15   15   10.0
4         30   32    1.0
5         10   11    8.0
6         20   22    1.0
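
To see where the group labels come from, here is a small self-contained sketch that rebuilds the frame from the question and constructs the key step by step. The names changed, gap and run_id are only illustrative. The gap condition is an extra assumption that is not part of the one-liner above: it also starts a new run when i jumps, which would matter if two adjacent runs happened to share a score. On the sample data it makes no difference.

```python
import pandas as pd

# The frame from the question.
df = pd.DataFrame({
    "i": [5, 6, 7, 8, 9, 15, 30, 31, 32, 10, 11, 20, 21, 22],
    "score": [3.0, 3.0, 3.0, 11.0, 11.0, 10.0, 1.0, 1.0, 1.0,
              8.0, 8.0, 1.0, 1.0, 1.0],
})

# True wherever the score differs from the previous row. The first row is
# always True because shift() leaves NaN there.
changed = df["score"].ne(df["score"].shift())

# Assumption, not in the answer above: also break a run when i is not
# exactly one more than the previous i.
gap = df["i"].diff().ne(1)

# A running count of the break points gives every run its own label.
run_id = (changed | gap).cumsum()

result = (
    df.groupby(run_id)
      .agg(start=("i", "first"), end=("i", "last"), score=("score", "first"))
      .reset_index(drop=True)  # drop the run labels to match the requested layout
)
print(result)
```

Dropping the gap term (run_id = changed.cumsum()) gives back the grouping used in the answer.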

CodePudding user response:

You don't need a magic trick here. A plain for loop will do the job with the least headaches, I would say.
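
A minimal sketch of what such a loop could look like, assuming the column names i and score from the question. The names rows and out are made up for illustration. It walks the rows once and either extends the current run or opens a new one.

```python
import pandas as pd

# The frame from the question.
df = pd.DataFrame({
    "i": [5, 6, 7, 8, 9, 15, 30, 31, 32, 10, 11, 20, 21, 22],
    "score": [3.0, 3.0, 3.0, 11.0, 11.0, 10.0, 1.0, 1.0, 1.0,
              8.0, 8.0, 1.0, 1.0, 1.0],
})

rows = []  # each entry is [start, end, score]
for i, score in zip(df["i"], df["score"]):
    if rows and rows[-1][2] == score:
        # Same score as the current run: move its end forward.
        rows[-1][1] = i
    else:
        # Score changed (or this is the first row): open a new run.
        rows.append([i, i, score])

out = pd.DataFrame(rows, columns=["start", "end", "score"])
print(out)
```

This is easy to read and to adapt, but on large frames the groupby version is likely to be faster.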
