Home > other >  Adding additional value column in df - Python
Adding additional value column in df - Python

Time:01-03

I am attempting to add a new variable value_% to the df which is a calculation using values from existing variables such that:

    value_% = (value/TOTAL (right))*100

current:

     Date         Band       value         
     03:Q1        <620       72.12               
     03:Q2        <620       71.41         
     03:Q3        <620       78.15        
     03:Q4        <620       90.06 

desired:

    Date      Band   value   value_%                
    03:Q1    <620    72.12     7.34                              
    03:Q2    <620    71.41     7.04                
    03:Q3    <620    78.15     7.35                  
    03:Q4    <620    90.06     8.76         

And is there a way i can repeat this for all 'Band' types in the df. Note the data is initially in wide form but i have converted it to long format.

    df = pd.read_csv('https://raw.githubusercontent.com/GeorgeRobbin/GeorgeRobbin.github.io/main/Credit_Score.csv')
    df = pd.melt(df3, id_vars=["Date"], var_name=("Band"))

CodePudding user response:

It might be best to compute the percentage fro df3 before the melt:

out = (df3
  .set_index('Date')
  .pipe(lambda d: d.div(d['TOTAL '], axis=0).mul(100))
  .reset_index()
 #.assign(**{'TOTAL ': df3['TOTAL ']}) # uncomment to keep original TOTAL
  .melt(id_vars=["Date"], var_name=("Band"))
)

Output:

      Date    Band       value
0    03:Q1    <620    7.449948
1    03:Q2    <620    7.038242
2    03:Q3    <620    7.346996
3    03:Q4    <620    8.759848
4    04:Q1    <620   12.072168
..     ...     ...         ...
427  19:Q4  TOTAL   100.000000
428  20:Q1  TOTAL   100.000000
429  20:Q2  TOTAL   100.000000
430  20:Q3  TOTAL   100.000000
431  20:Q4  TOTAL   100.000000

[432 rows x 3 columns]
  • Related