Home > Mobile >  Paython (Pandas) : count frequency of each cell in for two column
Paython (Pandas) : count frequency of each cell in for two column

Time:04-04

I have a DataFrame like this in python

 name    id     Background Complexion      Ear Ear Accessories
Li1     3800          White      Clean  Ape Ear            None
Lio2    5700  Purple Galaxy      Clean  Ape Ear  Silver Earring
Lio4    8400          Green      Clean  Orc Ear   Spiky Earring
Noam    3000    Noir Galaxy      Clean  Ape Ear  Silver Earring
AVERSE  6100    Grey Galaxy      Clean  Ape Ear  Gender Earring   

I want to count every cell except two first colmn 'name' and 'id' and create a table like this finally

name    id     Background Complexion     Ear    Ear Accessories
Li1     3800          1      5           4            1
Lio2    5700          1      5           4            2
Lio4    8400          1      5           1            1
Noam    3000          1      5           4            2
AVERSE  6100          1      5           4            1     

and calculate a formula on this numbers (for example 3) then we have

name    id     Background Complexion     Ear    Ear Accessories
Li1     3800          4      8           7            4
Lio2    5700          4      8           7            5
Lio4    8400          4      8           4            4
Noam    3000          4      8           7            5
AVERSE  6100          4      8           7            4

and sum numbers for each row

name    id     Background Complexion     Ear    Ear Accessories  sum
Li1     3800          4      8           7            4           23
Lio2    5700          4      8           7            5           24 
Lio4    8400          4      8           4            4           20
Noam    3000          4      8           7            5           24
AVERSE  6100          4      8           7            4           23

how can do this with pandas in python thaks

CodePudding user response:

IIUC, you need to loop to perform a count per column. You can use groupy.transform('count'). The rest are simple vectorial operations (add/sum):

cols = ['name', 'id']
df2 = (df[cols]
       .join(pd.DataFrame({c: df.groupby(c)[c].transform('count')
                           for c in df.drop(columns=cols).columns})
               .add(3)
               .assign(sum=lambda d: d.sum(1))
            )
      )

output:

     name    id  Background  Complexion  Ear  Ear Accessories  sum
0     Li1  3800           4           8    7                4   23
1    Lio2  5700           4           8    7                5   24
2    Lio4  8400           4           8    4                4   20
3    Noam  3000           4           8    7                5   24
4  AVERSE  6100           4           8    7                4   23
  • Related