Home > Enterprise >  How to add counter columns to get the frequency of items in dataframe
How to add counter columns to get the frequency of items in dataframe

Time:08-12

i have such dataframe:

product1 product2 product3 product4 product5 product6
straws orange melon chair bread milk
melon milk book coffee cake tea
bread bananas juice chair book straws

I wish to add 6 counting items columns along the index and counts per rows the number of times they appear in the dataframe.

CountProduct1 CountProduct2 CountProduct3 Countproduct4 Countproduct5
1 1 1 1 1
2 2 1 1 1
2 1 1 2 1

Thanks for your help.

CodePudding user response:

Given the modified explanation, here is a way to do it.

1. cumulative counts

cnt = df.apply(lambda g: g.value_counts(), axis=1).fillna(0).astype(int).cumsum()
>>> cnt
   bananas  book  bread  cake  chair  coffee  juice  melon  milk  orange  straws  tea
0  0        0     1      0     1      0       0      1      1     1       1       0  
1  0        1     1      1     1      1       0      2      2     1       1       1  
2  1        2     2      1     2      1       1      2      2     1       2       1  

2. replace each item by the corresponding count, row by row

out = pd.DataFrame([
    r.map(c) for (_, r), (_, c) in zip(df.T.items(), cnt.T.items())
]).set_axis([f'Count{k}' for k in df.columns], axis=1)

>>> out
   Countproduct1  Countproduct2  Countproduct3  Countproduct4  Countproduct5  Countproduct6
0  1              1              1              1              1              1            
1  2              2              1              1              1              1            
2  2              1              1              2              2              2            
  • Related