Count the group occurrences

I have a dataframe:

      id webpage
       1  google
       2    bing
       3  google
       4  google
       5   yahoo
       6   yahoo
       7  google
       8  google

I would like to number the groups of consecutive identical values, so that the count goes up each time webpage changes:

      id webpage  count
       1  google      1
       2    bing      2
       3  google      3
       4  google      3
       5   yahoo      4
       6   yahoo      4
       7  google      5
       8  google      5

I have tried cumcount and ngroup with groupby, but groupby puts every occurrence of a value into the same group instead of treating each consecutive run as its own group.

CodePudding user response：

I believe you need to cumsum() over the state transitions. Every time webpage differs from the previous row you increase your count.

df["count"] = (df.webpage != df.webpage.shift()).cumsum()
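To make the one-liner above easier to follow, here is a minimal, self-contained sketch that rebuilds the dataframe from the question and splits the expression into two steps. The intermediate name is_new_run is only for illustration and is not part of the original answer.

```python
import pandas as pd

# Rebuild the dataframe from the question.
df = pd.DataFrame({
    "id": range(1, 9),
    "webpage": ["google", "bing", "google", "google",
                "yahoo", "yahoo", "google", "google"],
})

# True wherever the value differs from the previous row. The first row is
# compared against NaN (nothing precedes it), so it is always True.
is_new_run = df["webpage"] != df["webpage"].shift()

# A cumulative sum over the booleans numbers the runs 1, 2, 3, ...
df["count"] = is_new_run.cumsum()

print(df)
```

Because each True marks the start of a new run, rows within the same run share a number, and a value that reappears later (google here) gets a fresh number rather than its earlier one.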

CodePudding user response：

I'm not very familiar with pandas, but I built a quick dataframe and wrote a short program that produces the expected result. A variable count is incremented whenever the current value differs from the previous one (_data), and the current count is appended to a list of counts for every row. After the loop, that list is added to the dataframe as the count column.

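import pandas as pd

webpage_list = ['google', 'bing', 'google', 'google', 'yahoo', 'yahoo', 'google', 'google']
df = pd.DataFrame(webpage_list, columns=['webpage'])

count = 0    # number of the current run
counts = []  # run number for every row
_data = ''   # value seen on the previous row
for data in df['webpage']:
    # a value different from the previous row starts a new run
    if data != _data:
        count += 1
    _data = data
    counts.append(count)

df['count'] = counts
print(df)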