I combined two dataframes in the following way:
df1 = pd.DataFrame(
{
"Farm ID": ["1", "2", "2", "3", "3"],
"Crop": ["Type A", "Type A", "Type B", "Type A", "Type B"],
"Area": [8, 4, 2, 3, 5],
"Diesel": [101, 215, 3, 0.6, 42],
"Gas": [15, 35, 61, 95, 13]
}
)
df1 = df1.set_index(['Farm ID', 'Crop'])
df2 = pd.DataFrame(
{
"Name": ["Area", "Diesel", "Gas"],
"GHG": [690, 8.5, 13],
"LU": [2.2, 0.3, 1.1],
}
)
df2 = df2.set_index('Name')
s = df1.stack()
out = df2.reindex(s.index.get_level_values(2)).mul(s.values,axis=0)
out.index = s.index
out = out.rename_axis(['Farm ID','Crop', 'Type'])
So it now looks like this:
GHG LU
Farm ID Crop Type
1 Type A Area 5520.0 17.60
Diesel 858.5 30.30
Gas 195.0 16.50
2 Type A Area 2760.0 8.80
Diesel 1827.5 64.50
Gas 455.0 38.50
Type B Area 1380.0 4.40
Diesel 25.5 0.90
Gas 793.0 67.10
3 Type A Area 2070.0 6.60
Diesel 5.1 0.18
Gas 1235.0 104.50
Type B Area 3450.0 11.00
Diesel 357.0 12.60
Gas 169.0 14.30
I now want to sum in each of the groups Fuel=Diesel Gas. So for each Farm ID and Crop Type I want to have in the end the Types Area and Fuel
In a 1-dim dataframe, I would create the new row as df.loc['Fuel'] = df.loc['Diesel'] df.loc['Gas']
but this doesn't work here. Any suggestions?
CodePudding user response:
IIUC, reset the index of your dataframe then replace Diesel
and Gas
names by Fuel
on Type
column, then groupby
again and sum
:
d = {'Diesel': 'Fuel', 'Gas': 'Fuel'}
out = out.reset_index().replace({'Type': d}).groupby(['Farm ID', 'Crop', 'Type']).sum()
print(out)
# Output:
GHG LU
Farm ID Crop Type
1 Type A Area 5520.0 17.60
Fuel 1053.5 46.80
2 Type A Area 2760.0 8.80
Fuel 2282.5 103.00
Type B Area 1380.0 4.40
Fuel 818.5 68.00
3 Type A Area 2070.0 6.60
Fuel 1240.1 104.68
Type B Area 3450.0 11.00
Fuel 526.0 26.90