I have a dataframe:
id webpage
1 google
2 bing
3 google
4 google
5 yahoo
6 yahoo
7 google
8 google
I would like to number each run of consecutive identical values, like this:
id webpage count
1 google 1
2 bing 2
3 google 3
4 google 3
5 yahoo 4
6 yahoo 4
7 google 5
8 google 5
I have tried using cumcount and ngroup with groupby, but groupby puts all occurrences of a value into the same group instead of treating each consecutive run separately.
CodePudding user response:
I believe you need a cumsum() over the state transitions: every time webpage differs from the previous row, the count increases by one.
df["count"] = (df.webpage != df.webpage.shift()).cumsum()
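To make the intermediate steps visible, here is a minimal sketch that rebuilds the dataframe from the question and splits the one-liner in two. The variable name `changed` is only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": range(1, 9),
    "webpage": ["google", "bing", "google", "google",
                "yahoo", "yahoo", "google", "google"],
})

# True wherever the value differs from the previous row. The first row is
# compared against NaN, so it is always True and starts run number 1.
changed = df["webpage"] != df["webpage"].shift()

# A cumulative sum over the booleans numbers each run of consecutive values.
df["count"] = changed.cumsum()

print(df)
```

Because rows in the same run share one number, the new column can also serve as a groupby key if per-run aggregates are needed later.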
CodePudding user response:
I'm not very used to pandas, but I put together a quick dataframe and a small program that produces the expected result. A variable count is incremented whenever the current value differs from the previous one (_data), and the current count is appended to a list for every row. Once all rows have been processed, the list is added to the dataframe as the count column.
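import pandas as pd

webpage_list = ['google', 'bing', 'google', 'google', 'yahoo', 'yahoo', 'google', 'google']
df = pd.DataFrame(webpage_list, columns=['webpage'])

count = 0     # number of the current run
counts = []   # run number for every row
_data = ''    # value seen in the previous row

for data in df['webpage']:
    # A new run starts whenever the value changes.
    if data != _data:
        count += 1
        _data = data
    counts.append(count)

df['count'] = counts
print(df)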
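The same loop can probably be written without tracking the previous value by hand. The sketch below swaps in `itertools.groupby` from the standard library, which groups consecutive equal items. This is an alternative technique, not what the answer above uses, and `run_number` is an illustrative name.

```python
from itertools import groupby

import pandas as pd

webpage_list = ['google', 'bing', 'google', 'google',
                'yahoo', 'yahoo', 'google', 'google']
df = pd.DataFrame(webpage_list, columns=['webpage'])

counts = []
# itertools.groupby yields one (key, items) pair per run of consecutive
# equal values; enumerate numbers those runs starting from 1.
for run_number, (_, run) in enumerate(groupby(df['webpage']), start=1):
    # Repeat the run number once for every row in the run.
    counts.extend(run_number for _ in run)

df['count'] = counts
print(df)
```

For large dataframes the vectorized shift-and-cumsum approach is likely to be faster, since this version still iterates row by row in Python.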