How can I add sequence numbers to grouped numbers in a dataframe? Like so:
sequence numbers
I tried it with df.groupby().cumcount(), but that didn't work.
For example:
import pandas as pd
tmp = pd.DataFrame({'group Nr':[50,50,50,53,53,53,53,56,56,59,59,59]})
tmp['sequential Nr'] = tmp.groupby('group Nr').cumcount()
tmp = tmp.sort_values('group Nr')  # sort_values returns a new DataFrame, so assign it back
print(tmp)
will give me:
    group Nr  sequential Nr
0         50              0
1         50              1
2         50              2
3         53              0
4         53              1
5         53              2
6         53              3
7         56              0
8         56              1
9         59              0
10        59              1
11        59              2
As you can see, that is not what I was looking for: the count restarts at 0 within each group.
CodePudding user response:
As Shiping mentioned, it would be helpful to have a little more context.
Assuming you're just trying to take the "group Nr" column and add an integer that itself increments by 1 for each new group, you could do something like the below. It's a bit hacky, but I think it at least works for your use case. This relies on "group Nr" being in increasing order, so you might want to sort first just in case.
df.sort_values("group Nr", inplace=True)
df["new group Nr"] = df["group Nr"] + df.groupby("group Nr").ngroup() + 1
ngroup simply numbers each of the groups starting from 0, so I added 1, then added that result to your "group Nr" variable.
If you just want to create a sequence number column:
df.sort_values("group Nr", inplace=True)
df["sequence Nr"] = df.groupby("group Nr").ngroup() + 1
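To make the difference concrete, here is a minimal self-contained sketch on the sample data from the question (called tmp there, df here). The "within group" column name is just an illustrative choice; it shows what cumcount() does next to what ngroup() does.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"group Nr": [50, 50, 50, 53, 53, 53, 53, 56, 56, 59, 59, 59]})
df = df.sort_values("group Nr")

grouped = df.groupby("group Nr")

# cumcount() numbers the rows *inside* each group and restarts at 0 per group.
df["within group"] = grouped.cumcount()

# ngroup() numbers the groups themselves, starting at 0, hence the + 1.
df["sequence Nr"] = grouped.ngroup() + 1

# The "hacky" variant: shift each group number by its sequence number.
df["new group Nr"] = df["group Nr"] + df["sequence Nr"]

print(df)
```

With the default sort=True, ngroup() numbers the groups in sorted key order, so the smallest "group Nr" gets sequence number 1.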
CodePudding user response:
You can get the desired result with the following code.
import pandas as pd
tmp = pd.DataFrame({'group Nr':[50,50,50,53,53,53,53,56,56,59,59,59]})
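# Take the first row of each group; copy() avoids a SettingWithCopyWarning below
s_df = tmp.groupby('group Nr').head(1).copy()
# Number the groups 1..n in order of first appearance
s_df['sequential Nr'] = range(1, len(s_df) + 1)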
tmp = tmp.merge(s_df, on='group Nr', how='left')
print(tmp)
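As a rough cross-check, the sketch below rebuilds the merge-based numbering and compares it with ngroup() on the same data. Passing sort=False to groupby is my assumption about the intent here: it makes ngroup() number the groups in order of first appearance, which is what head(1) plus range does.

```python
import pandas as pd

tmp = pd.DataFrame({'group Nr': [50, 50, 50, 53, 53, 53, 53, 56, 56, 59, 59, 59]})

# Merge-based approach: one row per group, numbered 1..n, joined back on the key.
s_df = tmp.groupby('group Nr').head(1).copy()
s_df['sequential Nr'] = range(1, len(s_df) + 1)
merged = tmp.merge(s_df, on='group Nr', how='left')

# ngroup-based approach, numbering groups in order of first appearance.
via_ngroup = tmp.groupby('group Nr', sort=False).ngroup() + 1

# Both approaches should assign the same number to every row.
same = merged['sequential Nr'].tolist() == via_ngroup.tolist()
print(same)
```

If the two ever disagree on your real data, check whether the left merge duplicated rows, which can happen when the key column is not unique in s_df.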