I am trying to add a new column ("occurence") to a DataFrame, as shown below, that records how many times a particular id has been seen so far, including the current row. I understand that Counter (applied to a list of the ids) or value_counts() will give the total number of occurrences per id, but I want a running count instead. I am trying to structure my DataFrame as follows:
id occurence
123456 1
987641 1
123456 2
987641 2
123456 3
123456 4
212212 1
In plain English, the column is saying, row by row: "this is the first time we've seen '123456'", "this is the first time we've seen '987641'", "this is the second time we've seen '123456'", and so on. I appreciate any help!
CodePudding user response:
A possible solution:
df['occurence'] = df.groupby('id').cumcount() + 1
Output:
id occurence
0 123456 1
1 987641 1
2 123456 2
3 987641 2
4 123456 3
5 123456 4
6 212212 1
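For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the DataFrame has a single column named id holding the values from the question; the construction of df is my assumption, since the question does not show how the data is loaded.

```python
import pandas as pd

# Sample ids copied from the question (how the real data is loaded is assumed)
df = pd.DataFrame({"id": [123456, 987641, 123456, 987641, 123456, 123456, 212212]})

# cumcount() numbers the rows inside each id group starting at 0,
# so adding 1 turns it into "how many times has this id appeared so far"
df["occurence"] = df.groupby("id").cumcount() + 1

print(df)
```

The original row order is preserved, because cumcount() returns a Series aligned to the DataFrame's index rather than a grouped result.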
CodePudding user response:
So you always want to add a new entry for each occurrence rather than update a single count per id? That's going to be a huge list.