Home > Net >  percentage of cells to row total python
percentage of cells to row total python

Time:09-07

hi I have a dataset that looks much like this data frame below:

#Table1 :
print("Table1: Current Table")
data = [['ALFA', 35, 47, 67, 44, 193],  
        ['Bravo', 51, 52, 16, 8, 127], 
        ['Charlie', 59, 75, 2, 14, 150],  
        ['Delta', 59, 75, 2, 34, 170],
        ['Echo', 59, 75, 2, 14, 150],
        ['Foxtrot', 40, 43, 26, 27, 136], 
        ['Golf', 35, 31, 22, 13, 101], 
        ['Hotel', 89, 58, 24, 34, 205]]

df = pd.DataFrame(data, columns= ['Objects', 'Column1', 'Column2', 'Column3', 'Column4', 'Total'])
#df.loc[:,'Total'] = df.sum(axis=1)
print(df)

i would want to get the percentage of all cells against their row totals (calculated in column 'Total') such that it looks this:

#Table2 :
print('Table2: Expected Outcome')
data2 = [['ALFA',18.1, 24.4, 34.7, 22.8, 193], 
        ['Bravo',40.2, 40.9, 12.6, 6.3, 127], 
        ['Charlie',39.3, 50.0, 1.3, 9.3, 150], 
        ['Delta',34.7, 44.1, 1.2, 20.0, 170],
        ['Echo',39.3, 50.0, 1.3, 9.3, 150],
        ['Foxtrot',29.4, 31.6, 19.1, 19.9, 136],
        ['Hotel',34.7, 30.7, 21.8, 12.9, 101], 
        ['Golf',43.4, 28.3, 11.7, 16.6, 205]]
df2 = pd.DataFrame(data2, columns= ['Objects', 'Column1', 'Column2', 'Column3', 'Column4', 'Total']) #.round(decimals=1)
#df.loc[:,'Total'] = df.sum(axis=1)
print(df2)

I am not really interested if the total column does change, is recalculated or have to be dropped in the process; but for completeness sake it would be good to have a 'Total' column along with the cells' percentages

CodePudding user response:

you can try using apply :

df[['Column1', 'Column2', 'Column3', 'Column4']] = df[['Column1', 'Column2', 'Column3', 'Column4']].apply(lambda x: x/x.sum(), axis=1)

Output :

Table1: Current Table
   Objects   Column1   Column2   Column3   Column4  Total
0     ALFA  0.181347  0.243523  0.347150  0.227979    193
1    Bravo  0.401575  0.409449  0.125984  0.062992    127
2  Charlie  0.393333  0.500000  0.013333  0.093333    150
3    Delta  0.347059  0.441176  0.011765  0.200000    170
4     Echo  0.393333  0.500000  0.013333  0.093333    150
5  Foxtrot  0.294118  0.316176  0.191176  0.198529    136
6     Golf  0.346535  0.306931  0.217822  0.128713    101
7    Hotel  0.434146  0.282927  0.117073  0.165854    205

CodePudding user response:

Use fast vecorized division all columns without Objects/Total by DataFrame.div:

c = df.columns.difference(['Objects','Total'])
df[c] = df[c].div(df['Total'], axis=0).mul(100)
print(df)
   Objects    Column1    Column2    Column3    Column4  Total
0     ALFA  18.134715  24.352332  34.715026  22.797927    193
1    Bravo  40.157480  40.944882  12.598425   6.299213    127
2  Charlie  39.333333  50.000000   1.333333   9.333333    150
3    Delta  34.705882  44.117647   1.176471  20.000000    170
4     Echo  39.333333  50.000000   1.333333   9.333333    150
5  Foxtrot  29.411765  31.617647  19.117647  19.852941    136
6     Golf  34.653465  30.693069  21.782178  12.871287    101
7    Hotel  43.414634  28.292683  11.707317  16.585366    205

CodePudding user response:

df1=pd.DataFrame(data, columns= ['Objects', 'Column1', 'Column2', 'Column3', 'Column4', 'Total'])

for col in df.columns[1:-1]: df1[col]=(df[col]*100/df.Total)

df1

  • Related