Home > Net >  How to add a new row by summing the two rows with different multiindex in pandas dataframe?
How to add a new row by summing the two rows with different multiindex in pandas dataframe?

Time:10-05

I have a pandas dataframe which looks as follows (Note: df.to_dict() is given at end of the question:

model   AIM/CGE 2.0 AIM/CGE 2.1 IMAGE 3.0.1
scenario    SSP1-19 SSP2-19 TERL_15D_LowCarbonTransportPolicy   TERL_15D_NoTransportPolicy  IMA15-LiStCh
module  variable                    
energy_investment_solar Investment|Energy Supply|Electricity|Solar  NaN NaN 7205.6445   8676.6235   0.0
energy_investment_wind  Investment|Energy Supply|Electricity|Wind   NaN NaN 11054.4915  11674.4485  0.0

There are two indices: module and variable. I want to make a sum of the two rows and append it as a new row. The new row should have index energy_investment_vre.

I get the sum of the rows by

sum = df.loc["energy_investment_solar"].values   df.loc["energy_investment_wind"].values

However, I cannot append it as a new row. I tried:

df.loc["energy_investment_vre"] = df.loc["energy_investment_solar"].values   df.loc["energy_investment_wind"].values

This gave me a

ValueError: cannot set a row with mismatched columns

I also tried:

df.loc["energy_investment_vre","Investment|Energy Supply|Electricity|VRE"] = df.loc["energy_investment_solar"].values   df.loc["energy_investment_wind"].values

This gave me a

IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

I also tried df.append([sum]) This gave me a

ValueError: Must pass 2-d input. shape=(1, 1, 5)

What would be the correct way to get the sum of these two rows as a row? Can I also have a new variable name Investment|Energy Supply|Electricity|VRE for the new row?

df.to_dict() looks as follows:
{('AIM/CGE 2.0',
  'SSP1-19'): {('energy_investment_solar',
   'Investment|Energy Supply|Electricity|Solar'): nan, ('energy_investment_wind',
   'Investment|Energy Supply|Electricity|Wind'): nan},
 ('AIM/CGE 2.0',
  'SSP2-19'): {('energy_investment_solar',
   'Investment|Energy Supply|Electricity|Solar'): nan, ('energy_investment_wind',
   'Investment|Energy Supply|Electricity|Wind'): nan},
 ('AIM/CGE 2.1',
  'TERL_15D_LowCarbonTransportPolicy'): {('energy_investment_solar',
   'Investment|Energy Supply|Electricity|Solar'): 7205.644499999999, ('energy_investment_wind',
   'Investment|Energy Supply|Electricity|Wind'): 11054.4915},
 ('AIM/CGE 2.1',
  'TERL_15D_NoTransportPolicy'): {('energy_investment_solar',
   'Investment|Energy Supply|Electricity|Solar'): 8676.6235, ('energy_investment_wind',
   'Investment|Energy Supply|Electricity|Wind'): 11674.448499999999},
 ('IMAGE 3.0.1',
  'IMA15-LiStCh'): {('energy_investment_solar',
   'Investment|Energy Supply|Electricity|Solar'): 0.0, ('energy_investment_wind',
   'Investment|Energy Supply|Electricity|Wind'): 0.0}}

CodePudding user response:

Specify the tuple for the multi-index as well as all columns (using :) like so:

df.loc[('energy_investment_vre', 'Investment|Energy Supply|Electricity|VRE'),:] = df.sum()
  • Related