I have the following dataframe:
OUTLET_UNQ_CODE Category_Code month
0 2018020000065 SSSI January 21
1 2018020000066 SSSI January 21
2 2018020000067 SSSI January 21
...
512762 2021031641195 CH March 21
512763 2021031642445 CH March 21
512764 2021031643357 GM March 21
512765 2021031643863 GM March 21
A few outlets (OUTLET_UNQ_CODE) have changed their Category_Code within a month, and in the following months as well. I need to count the number of hops each outlet has made. For example: if 2021031643863 had Category_Code GM in Jan 21 and then CH in Jan 21, CH in Feb, and Kirana in March, this counts as 2 hops.
This is what I have tried:
s=pd.to_numeric(new_df.Category_Code,errors='coerce')
df=new_df.assign(New=s.bfill())[s.isnull()].groupby('OUTLET_UNQ_CODE').agg({'Category_Code':list})
df.reset_index(inplace=True)
The output is:
OUTLET_UNQ_CODE Category_Code
0 2021031643863 [GM,CH,CH,Kirana]
CodePudding user response:
Regardless of whether there is a better way to do this from the beginning, here is a piece of code that works from your output and counts the number of changes in the list:
cat_lst = ['GM','CH','CH','Kirana']
a = sum(1 for i, x in enumerate(cat_lst[:-1]) if x != cat_lst[i + 1])
# in this case the result of a is 2
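The same count can probably be computed for every outlet at once, without building the intermediate lists. The following is only a sketch, not a tested solution for the real data: it assumes the rows of each outlet are already in chronological order, and the sample rows are made up to mirror the example in the question. It compares each row's Category_Code with the previous row of the same outlet and counts the rows where they differ.

```python
import pandas as pd

# Hypothetical sample shaped like the question's dataframe (not the real data)
new_df = pd.DataFrame({
    "OUTLET_UNQ_CODE": [2021031643863] * 4 + [2018020000065] * 2,
    "Category_Code": ["GM", "CH", "CH", "Kirana", "SSSI", "SSSI"],
    "month": ["January 21", "January 21", "February 21", "March 21",
              "January 21", "February 21"],
})

# Assumption: rows are already sorted chronologically within each outlet.
# Previous category of the same outlet (missing for an outlet's first row)
prev = new_df.groupby("OUTLET_UNQ_CODE")["Category_Code"].shift()

# A hop is a row whose category differs from the previous row of that outlet
is_hop = prev.notna() & new_df["Category_Code"].ne(prev)

# Number of hops per outlet
hops = is_hop.groupby(new_df["OUTLET_UNQ_CODE"]).sum()
print(hops)
```

For these sample rows, outlet 2021031643863 gets 2 hops and outlet 2018020000065 gets 0. If the real dataframe is not in chronological order, it would need to be sorted first, for example by parsing the month column into dates.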