Calculate cell percentage of row total and sort largest to smallest with some description using colu-CodePudding

I have a df like the one below:

data1 = [['A', 10, 30, 15, 45], 
        ['B', 22, 35, 25, 18], 
        ['C', 40, 27, 28, 5]]
       
df1 = pd.DataFrame(data1, columns= ['Units', 'Normal', 'Bad', 'Good', 'Neutral'])
df1

I want to get an additional column which calculates and shows each cell's percentage of the row total. But here is the trick, I want these percentages sorted from highest to lowest but also containing a string description of these percentages using the column name. What I want to achieve is shown in the df2 below under Proportion by largest to smallest column. My dataset is quite large and I would like to do this automatically.

data2 = [['A', 10, 30, 15, 45, 'Neutral  = 0.45%, Bad = 0.3%, Good = 0.15%, Normal = 0.10%'], 
            ['B', 22, 35, 25, 18, 'Bad = 0.35%, Good = 0.25%, Normal = 0.22%, Neutral = 0.18%'], 
            ['C', 40, 27, 28, 5, 'Normal = 0.4%, Good = 0.28%, Bad = 0.27%, Neutral = 0.05']]
           
df2 = pd.DataFrame(data2, columns= ['Units', 'Normal', 'Bad', 'Good', 'Neutral', 'Proportion by largest to smallest'])
df2

CodePudding user response：

One rare case where I would recommend a custom function and apply:

cols = df1.columns.difference(['Units'])

def cust_fmt(s):
    order = s.sort_values(ascending=False).index
    return ', '.join(
           (s.index
              ' = '   s.div(100).round(2).astype(str)   '%'
           ).loc[order]
           )

df1['Proportion'] = df1[cols].apply(cust_fmt, axis=1)

output:

  Units  Normal  Bad  Good  Neutral                                                  Proportion
0     A      10   30    15       45    Neutral = 0.45%, Bad = 0.3%, Good = 0.15%, Normal = 0.1%
1     B      22   35    25       18  Bad = 0.35%, Good = 0.25%, Normal = 0.22%, Neutral = 0.18%
2     C      40   27    28        5   Normal = 0.4%, Good = 0.28%, Bad = 0.27%, Neutral = 0.05%

alternative with normalization:

cols = df1.columns.difference(['Units'])

def cust_fmt(s):
    order = s.sort_values(ascending=False).index
    return ', '.join(
           (s.index#.to_series()
              ' = '   s.astype(str)   '%'
           ).loc[order]
           )

df1['Proportion'] = df1[cols].div(df1[cols].sum(axis=1), axis=0).round(2).apply(cust_fmt, axis=1)