Dataframe 1 looks like this:
df1 = pd.DataFrame(
{
"Farm ID": ["1", "2", "2", "3", "3"],
"Crop": ["Type A", "Type A", "Type B", "Type B", "Type B"],
"Area": [8, 4, 2, 3, 5],
"Diesel": [101, 215, 3, 0.6, 42],
}
)
df1 = df1.set_index(['Farm ID', 'Crop'])
df1
Dataframe 2 looks like this:
df2 = pd.DataFrame(
{
"Name": ["Area", "Diesel"],
"GHG": [690, 8.5],
"LU": [2.2, 0.3],
}
)
df2 = df2.set_index('Name')
df2
I now need to combine both such that I receive the following information:
GHG LU
Farm ID Crop Name
1 Type A Area 8*690 8*2.2
Diesel 101*690 101*2.2
2 Type A Area 4*690 4*2.2
Diesel 215*690 215*2.2
Type B Area ....
Any suggestions welcome as I am completely clueless. I also take ideas if there are better ways to structure this. I will have to do further analysis (e.g. aggregation by crop type or name, and similar) on the resulting dataframe and might think too complicated... Thanks a lot!
CodePudding user response:
We can do stack
s = df1.stack()
out = df2.reindex(s.index.get_level_values(2)).mul(s.values,axis=0)
out.index = s.index
out
GHG LU
Farm ID Crop
1 Type A Area 5520.0 17.60
Diesel 858.5 30.30
2 Type A Area 2760.0 8.80
Diesel 1827.5 64.50
Type B Area 1380.0 4.40
Diesel 25.5 0.90
3 Type B Area 2070.0 6.60
Diesel 5.1 0.18
Area 3450.0 11.00
Diesel 357.0 12.60
CodePudding user response:
You can stack
the dataframe and let pandas broadcast on the common index:
df1.rename_axis('Name', axis=1).stack().mul(df2.T).T
Output:
GHG LU
Farm ID Crop Name
1 Type A Area 5520.0 17.60
Diesel 858.5 30.30
2 Type A Area 2760.0 8.80
Diesel 1827.5 64.50
Type B Area 1380.0 4.40
Diesel 25.5 0.90
3 Type B Area 2070.0 6.60
Diesel 5.1 0.18
Area 3450.0 11.00
Diesel 357.0 12.60