Python: How can I add sequence numbers to groups?

Time:03-24

How can I add sequence numbers to grouped numbers in a dataframe? Like so:

[Image: sequence numbers]

I tried df.groupby().cumcount(), but that didn't work.

For example:

import pandas as pd

tmp = pd.DataFrame({'group Nr':[50,50,50,53,53,53,53,56,56,59,59,59]})
tmp['sequential Nr'] = tmp.groupby('group Nr').cumcount()
tmp = tmp.sort_values('group Nr')

print(tmp)

will give me:

    group Nr  sequential Nr
0         50           0
1         50           1
2         50           2
3         53           0
4         53           1
5         53           2
6         53           3
7         56           0
8         56           1
9         59           0
10        59           1
11        59           2

That is not exactly what I was looking for, as you can see.

CodePudding user response:

As Shiping mentioned, it would be helpful to have a little more context.

Assuming you just want to add to the "group Nr" column an integer that increments by 1 for each new group, you could do something like the below. It's a bit hacky, but I think it works for your use case. Assuming your "group Nr" is increasing, you might want to sort first just in case.

df.sort_values("group Nr", inplace=True)
df["new group Nr"] = df["group Nr"] + df.groupby("group Nr").ngroup() + 1

ngroup simply numbers each of the groups starting from 0, so I added 1, then added that result to your "group Nr" variable.
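To make the difference between cumcount and ngroup concrete, here is a minimal sketch on the sample data from the question. The names within_group and group_index are only illustrative, and whether "new group Nr" is the result the question is after is an assumption, since the original image is not available.

```python
import pandas as pd

tmp = pd.DataFrame({"group Nr": [50, 50, 50, 53, 53, 53, 53, 56, 56, 59, 59, 59]})
grouped = tmp.groupby("group Nr")

# cumcount numbers the rows inside each group, restarting at 0 per group.
within_group = grouped.cumcount()

# ngroup numbers the groups themselves, so every row of a group gets the same value.
group_index = grouped.ngroup()

# Assumption: the wanted value is the group number plus a 1-based group counter.
tmp["new group Nr"] = tmp["group Nr"] + group_index + 1

print(tmp)
```

With this data, group 50 becomes 51, 53 becomes 55, 56 becomes 59 and 59 becomes 63.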

If you just want to create a sequence number column:

df.sort_values("group Nr", inplace=True)
df["sequence Nr"] = df.groupby("group Nr").ngroup() + 1
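One thing worth knowing: groupby sorts the group keys by default, so ngroup numbers the groups in sorted key order even if the rows themselves are not sorted. If you would rather number the groups in the order they first appear, passing sort=False to groupby should do it. A small sketch, using made-up unsorted data and a hypothetical "appearance Nr" column:

```python
import pandas as pd

# Hypothetical unsorted data, only for illustration.
df = pd.DataFrame({"group Nr": [53, 50, 53, 59, 50]})

# Default sort=True: groups are numbered by sorted key (50 -> 1, 53 -> 2, 59 -> 3).
df["sequence Nr"] = df.groupby("group Nr").ngroup() + 1

# sort=False: groups are numbered by first appearance (53 -> 1, 50 -> 2, 59 -> 3).
df["appearance Nr"] = df.groupby("group Nr", sort=False).ngroup() + 1

print(df)
```

Which of the two you need depends on what the sequence number is supposed to mean in your data.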

CodePudding user response:

You can get the desired result with the following code.

import pandas as pd

tmp = pd.DataFrame({'group Nr':[50,50,50,53,53,53,53,56,56,59,59,59]})
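# Take the first row of each group; copy() avoids a SettingWithCopyWarning below.
s_df = tmp.groupby('group Nr').head(1).copy()
# Number the groups 1, 2, 3, ... in order of appearance.
s_df['sequential Nr'] = range(1, len(s_df) + 1)
# Bring the per-group number back onto every row.
tmp = tmp.merge(s_df, on='group Nr', how='left')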

print(tmp)