I would like to multiply each value in one dataframe (df_a) by all the values in another dataframe (df_b), take the sum of these products, and collect the sums for all values in df_a. E.g. df_a:
| col_x |
| --- |
| 10 |
| 20 |
and df_b:
| col_y |
| --- |
| 5 |
| 6 |
Would result in: [(10 x 5) + (10 x 6), (20 x 5) + (20 x 6)] or [110, 220]
I think this can be done in a for loop:
for x, y in zip(df_a, df_b):
    i = sum(x * y)
    a.append(i)
But this throws an error saying that a float object is not iterable.
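For reference, here is a minimal sketch of a loop that produces the intended result. The DataFrame construction and the name `result` are assumptions for illustration, not part of the original code. Iterating over a DataFrame directly yields its column labels rather than its values, so the sketch loops over the values of `col_x` instead:

```python
import pandas as pd

# Example data from the question (construction is an assumption).
df_a = pd.DataFrame({"col_x": [10, 20]})
df_b = pd.DataFrame({"col_y": [5, 6]})

result = []  # hypothetical name for the output list
for x in df_a["col_x"]:
    # Multiply one value of df_a by every value of df_b, then sum the products.
    result.append(int((x * df_b["col_y"]).sum()))

print(result)  # [110, 220]
```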
CodePudding user response:
Perfect job for numpy broadcasting:
x = df_a["col_x"].to_numpy()
y = df_b["col_y"].to_numpy()[:, None]
(x * y).sum(axis=0)
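To make the shapes explicit, here is a self-contained sketch of the same idea on the question's example data. The DataFrame construction and the names `products` and `totals` are assumptions added for illustration:

```python
import pandas as pd

# Example data from the question (construction is an assumption).
df_a = pd.DataFrame({"col_x": [10, 20]})
df_b = pd.DataFrame({"col_y": [5, 6]})

x = df_a["col_x"].to_numpy()           # shape (2,)
y = df_b["col_y"].to_numpy()[:, None]  # shape (2, 1), a column vector

# Broadcasting expands both operands to shape (2, 2):
# products[i, j] == y[i] * x[j]
products = x * y

# Summing over axis 0 collapses the df_b axis, leaving one total per value of df_a.
totals = products.sum(axis=0)

print(products.shape)  # (2, 2)
print(totals)          # [110 220]
```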
CodePudding user response:
No need for complicated loops or broadcasting. (10 x 5) + (10 x 6) is equal to 10 * (5 + 6). So first sum the second Series, then multiply the first one by this scalar.
out = df_a['col_x']*df_b['col_y'].sum()
Output:
0    110
1    220
Name: col_x, dtype: int64
As array:
out = df_a['col_x'].to_numpy()*df_b['col_y'].sum()
Output: array([110, 220])
Alternative with an outer product:
import numpy as np
np.outer(df_a['col_x'], df_b['col_y']).sum(axis=1)
Output: array([110, 220])
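As a quick sanity check, this sketch compares the scalar shortcut with the outer product on slightly larger inputs. The values and the names `via_scalar` and `via_outer` are made up for illustration:

```python
import numpy as np
import pandas as pd

# Made-up data with different lengths, to show the two columns need not match in size.
df_a = pd.DataFrame({"col_x": [10, 20, 30]})
df_b = pd.DataFrame({"col_y": [5, 6, 7, 8]})

# Shortcut: x * (y1 + y2 + ...) for each x.
via_scalar = df_a["col_x"].to_numpy() * df_b["col_y"].sum()

# Explicit: build every pairwise product, then sum each row.
via_outer = np.outer(df_a["col_x"], df_b["col_y"]).sum(axis=1)

print(np.array_equal(via_scalar, via_outer))  # True
print(via_scalar)                             # [260 520 780]
```

The shortcut never materializes the full table of products, which can matter when both columns are long.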
CodePudding user response:
You can use broadcasting.
>>> (df_a.values * df_b.values.reshape(1, -1)).sum(axis=1)
array([110, 220])
Explanation:
>>> df_a.values
array([[10],
       [20]])
>>> df_b.values.reshape(1, -1)
array([[5, 6]])
>>> df_a.values * df_b.values.reshape(1, -1)
array([[ 50,  60],
       [100, 120]])
CodePudding user response:
Why are you using dataframes with one column when you are actually using them as arrays? Just use numpy:
import numpy as np
a = np.array([10, 20])
b = np.array([5, 6])
np.matmul(a, b) + np.matmul(a, b[::-1])
# 330
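Note that 330 is the grand total over all pairs, not one value per entry of a. A small sketch of how it relates to the per-value result [110, 220] from the question (the names `per_value` and `total` are assumptions for illustration):

```python
import numpy as np

a = np.array([10, 20])
b = np.array([5, 6])

# One sum per entry of a: each a[i] times the sum of b.
per_value = a * b.sum()

# Grand total over all pairs, as computed with the two matmul calls.
total = np.matmul(a, b) + np.matmul(a, b[::-1])

print(per_value)                   # [110 220]
print(per_value.sum() == total)    # True
```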