Pandas, create 'order' column based on other column [duplicate]

Time:09-22

I have a dataframe with two columns: cluster titles and the chapter each cluster belongs to. I would like to create a third column containing the 'order', or position, of that cluster within its chapter.

So, I would like to turn the following dataframe:

cluster_title, chapter
"rabbits",   1
"horses",    1
"cows",      1
"trains",    2
"airplanes", 2
"ships",     2
"carrot",    3
"potato",    3
"tomato",    3

Into something like this:

cluster_title, chapter, position_in_chapter
"rabbits",   1, 1
"horses",    1, 2
"cows",      1, 3
"trains",    2, 1
"airplanes", 2, 2
"ships",     2, 3
"carrot",    3, 1
"potato",    3, 2
"tomato",    3, 3

I tried approaching it with groupby and using the index somehow, but either I am missing something obvious (quite likely) or it is the wrong approach, as the resulting object requires extra steps that seem to take me in the wrong direction.

Could someone point me in the right direction?

CodePudding user response:

Try with groupby and cumcount:

df["position_in_chapter"] = df.groupby("chapter").cumcount() + 1

>>> df
  cluster_title  chapter  position_in_chapter
0       rabbits        1                    1
1        horses        1                    2
2          cows        1                    3
3        trains        2                    1
4     airplanes        2                    2
5         ships        2                    3
6        carrot        3                    1
7        potato        3                    2
8        tomato        3                    3
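
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The dataframe is rebuilt from the sample data in the question; the `mixed` frame is a hypothetical variation (not from the question) added only to show that cumcount numbers rows by their order within each group, so the chapters do not have to be contiguous.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "cluster_title": ["rabbits", "horses", "cows", "trains", "airplanes",
                          "ships", "carrot", "potato", "tomato"],
        "chapter": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    }
)

# Hypothetical variation: the same kind of rows with chapters interleaved.
mixed = df.iloc[[0, 3, 1, 4, 2]].reset_index(drop=True)

# cumcount() numbers the rows of each group 0, 1, 2, ... in their current
# row order, so adding 1 gives a 1-based position within the chapter.
df["position_in_chapter"] = df.groupby("chapter").cumcount() + 1
mixed["position_in_chapter"] = mixed.groupby("chapter").cumcount() + 1

print(df)
print(mixed)
```

If the position should follow some other ordering (alphabetical, say), sort the dataframe first, since cumcount only reflects the row order it is given.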