How to count changes in a categorical column in pandas

Time:03-07

I have the dataframe below:

        OUTLET_UNQ_CODE  Category_Code  month
0       2018020000065    SSSI           January 21
1       2018020000066    SSSI           January 21
2       2018020000067    SSSI           January 21
...
512762  2021031641195    CH             March 21
512763  2021031642445    CH             March 21
512764  2021031643357    GM             March 21
512765  2021031643863    GM             March 21

A few OUTLET_UNQ_CODE values have changed their Category_Code within a month, and again in the following months. I need to count the number of hops each outlet has made. For example, if 2021031643863 had Category_Code GM in January 21, then CH later in January 21, CH again in February and Kirana in March, this counts as 2 hops.

This is what I have tried:

s = pd.to_numeric(new_df.Category_Code, errors='coerce')
df = new_df.assign(New=s.bfill())[s.isnull()].groupby('OUTLET_UNQ_CODE').agg({'Category_Code': list})
df.reset_index(inplace=True)

The output is:

   OUTLET_UNQ_CODE  Category_Code
0  2021031643863    [GM, CH, CH, Kirana]

CodePudding user response:

There may be a better way to approach this from the start, but working from your output, here is a piece of code that counts the number of changes in the list:

cat_lst = ['GM', 'CH', 'CH', 'Kirana']
# compare each element with its successor and count the mismatches
a = sum(1 for i, x in enumerate(cat_lst[:-1]) if x != cat_lst[i + 1])

# in this case the result of a is 2
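
If you would rather skip the intermediate list column, the same count can probably be taken straight from the original dataframe. The following is only a sketch under a few assumptions: the toy data is made up to mirror the example in the question, the helper name count_hops and the month_dt column are hypothetical, and the month strings are assumed to parse with the format "%B %y". Rows within the same month are assumed to already be in the order the changes happened, so a stable sort is used to preserve that order.

```python
import pandas as pd

# Toy data mirroring the question's example; the values are illustrative only.
new_df = pd.DataFrame({
    "OUTLET_UNQ_CODE": [2021031643863] * 4 + [2018020000065] * 2,
    "Category_Code": ["GM", "CH", "CH", "Kirana", "SSSI", "SSSI"],
    "month": ["January 21", "January 21", "February 21", "March 21",
              "January 21", "February 21"],
})

# Assumption: month strings look like "January 21" (full month name, 2-digit year).
new_df["month_dt"] = pd.to_datetime(new_df["month"], format="%B %y")

# mergesort is stable, so rows within the same month keep their original order.
ordered = new_df.sort_values("month_dt", kind="mergesort")


def count_hops(codes: pd.Series) -> int:
    """Count how often a value differs from the one directly before it."""
    changed = codes != codes.shift()
    # The first row has no predecessor, so it is never counted as a hop.
    return int(changed.iloc[1:].sum())


hops = (
    ordered.groupby("OUTLET_UNQ_CODE", sort=False)["Category_Code"]
    .agg(count_hops)
    .rename("hops")
    .reset_index()
)
print(hops)
```

For the example outlet 2021031643863 this gives 2 hops, the same as the list-based count, and an outlet that never changes category gets 0.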