say i have 2 dfs:
df1 = pd.DataFrame( {'spot': [1,2,3], 'ID1': [0,1,0], 'ID2': [1, 1, 2], 'ID3': [0,0,1]})
df2 = pd.DataFrame( {'ID': ['ID1','ID2','ID3'], 'prop1': [0.01,0.8,0.0], 'prop2': [0.0, 0.04, 0.5]})
df1
spot ID1 ID2 ID3
0 1 0 1 0
1 2 1 1 0
2 3 0 2 1
df2
ID prop1 prop2
0 ID1 0.01 0.00
1 ID2 0.80 0.04
2 ID3 0.00 0.50
i'm struggling to figure out how to multiply the values from each colID in df1
with the matching rowID from 'df2'.
As an intermediary, the idea is that i would end up with something like: (it won't actually be this, but this is the best way to display what i'm aiming for)
spot ID1 ID2 ID3
0 1 0 [1*0.8, 1*0.04] 0
1 2 [1*0.01, 1*0.0] [1*0.8, 1*0.04] 0
2 3 0 [2*0.8, 2*0.04] [1*0.0, 1*0.5]
And I ultimately want to end up with a (spot, prop1, prop2) dataframe where all the (newly) multiplied values of prop1
and prop2
are added in each row, as in:
spot prop1 prop2
0 1 (1*0.8) (1*0.04)
1 2 (1*0.01 1*0.8) (1*0.04)
2 3 (2*0.8 0) (2*0.04 1*0.5)
I'm already stuck on the first multiplication bit... i tried:
arr = df1.iloc[:, 1:].to_numpy()
arr2 = df2.iloc[:, 1:].to_numpy()
for i in range(len(arr)):
for j in range(len(arr)):
print(arr[i, j] * arr2[i])
but it's not returning what i'm expecting...
CodePudding user response:
I believe you can do:
df1.set_index('spot') @ df2.set_index('ID')
Output:
prop1 prop2
spot
1 0.80 0.04
2 0.81 0.04
3 1.60 0.58
Reset index if you like.