Home > database >  multiply row from one dataframe with matching column in another dataframe and add
multiply row from one dataframe with matching column in another dataframe and add

Time:08-13

say i have 2 dfs:

df1 = pd.DataFrame( {'spot': [1,2,3], 'ID1': [0,1,0], 'ID2': [1, 1, 2], 'ID3': [0,0,1]})
df2 = pd.DataFrame( {'ID': ['ID1','ID2','ID3'], 'prop1': [0.01,0.8,0.0], 'prop2': [0.0, 0.04, 0.5]})
df1
   spot  ID1  ID2  ID3
0     1    0    1    0
1     2    1    1    0
2     3    0    2    1
df2
    ID  prop1  prop2
0  ID1   0.01   0.00
1  ID2   0.80   0.04
2  ID3   0.00   0.50

i'm struggling to figure out how to multiply the values from each colID in df1 with the matching rowID from 'df2'.

As an intermediary, the idea is that i would end up with something like: (it won't actually be this, but this is the best way to display what i'm aiming for)

   spot        ID1                ID2             ID3
0     1         0           [1*0.8, 1*0.04]        0
1     2   [1*0.01, 1*0.0]   [1*0.8, 1*0.04]        0
2     3         0           [2*0.8, 2*0.04]  [1*0.0, 1*0.5]

And I ultimately want to end up with a (spot, prop1, prop2) dataframe where all the (newly) multiplied values of prop1 and prop2 are added in each row, as in:

   spot  prop1            prop2 
0     1 (1*0.8)          (1*0.04)    
1     2 (1*0.01   1*0.8) (1*0.04)    
2     3 (2*0.8   0)      (2*0.04   1*0.5)

I'm already stuck on the first multiplication bit... i tried:

arr = df1.iloc[:, 1:].to_numpy()
arr2 = df2.iloc[:, 1:].to_numpy()

for i in range(len(arr)):
    for j in range(len(arr)):
        print(arr[i, j] * arr2[i])

but it's not returning what i'm expecting...

CodePudding user response:

I believe you can do:

df1.set_index('spot') @ df2.set_index('ID')

Output:

      prop1  prop2
spot              
1      0.80   0.04
2      0.81   0.04
3      1.60   0.58

Reset index if you like.

  • Related