How to multiply the value in df_a by the values in df_b, take the sum of these values, and append th

Time:11-11

I would like to multiply each value in one dataframe (df_a) by all the values in another dataframe (df_b), sum those products, and collect the sums for every value in df_a. E.g. df_a:

col_x
10
20

and df_b:

col_y
5
6

Would result in: [(10 x 5) + (10 x 6), (20 x 5) + (20 x 6)] or [110, 220]

I think this can be done in a for loop:

for x, y in zip(df_a, df_b):
    i = sum(x * y)
    a.append(i)

But this throws an error for float object not being iterable.
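
A likely cause, assuming the real loop iterated over the column values rather than the frames themselves: zip pairs the values one by one, so x * y is a single number and sum() has nothing to iterate over. A minimal loop sketch that gives the intended result, with df_a and df_b rebuilt from the example above:

```python
import pandas as pd

df_a = pd.DataFrame({"col_x": [10, 20]})
df_b = pd.DataFrame({"col_y": [5, 6]})

a = []
for x in df_a["col_x"]:
    # multiply one value from df_a by every value in df_b, then sum the products
    a.append(int((x * df_b["col_y"]).sum()))

print(a)  # [110, 220]
```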

CodePudding user response:

Perfect job for numpy broadcasting:

x = df_a["col_x"].to_numpy()
y = df_b["col_y"].to_numpy()[:, None]

(x * y).sum(axis=0)
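
A self-contained sketch of the same idea with the shapes spelled out, assuming the two frames from the question. The names products and result are only for illustration:

```python
import pandas as pd

df_a = pd.DataFrame({"col_x": [10, 20]})
df_b = pd.DataFrame({"col_y": [5, 6]})

x = df_a["col_x"].to_numpy()            # shape (2,)
y = df_b["col_y"].to_numpy()[:, None]   # shape (2, 1), a column vector

# Broadcasting stretches both operands to shape (2, 2):
# products[i, j] == y[i] * x[j]
products = x * y

# Summing down the rows (axis=0) adds up the df_b values for each df_a value
result = products.sum(axis=0)
print(result)  # [110 220]
```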

CodePudding user response:

No need for complicated loops or broadcasting. (10 x 5) + (10 x 6) is equal to 10 * (5 + 6). So first sum the second Series, then multiply the first one by this scalar.

out = df_a['col_x']*df_b['col_y'].sum()

Output:

0    110
1    220
Name: col_x, dtype: int64

As array:

out = df_a['col_x'].to_numpy()*df_b['col_y'].sum()

Output: array([110, 220])

Alternative with an outer product:

import numpy as np
np.outer(df_a['col_x'], df_b['col_y']).sum(axis=1)

Output: array([110, 220])
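
Neither variant needs the two frames to have the same length. A small sketch to check that, assuming a hypothetical third row in df_a that is not in the question:

```python
import numpy as np
import pandas as pd

df_a = pd.DataFrame({"col_x": [10, 20, 30]})  # third row is a made-up addition
df_b = pd.DataFrame({"col_y": [5, 6]})

# Scalar approach: sum df_b once, then scale every value in df_a
scaled = df_a["col_x"] * df_b["col_y"].sum()

# Outer-product approach: build all pairwise products, then sum each row
outer = np.outer(df_a["col_x"], df_b["col_y"]).sum(axis=1)

print(scaled.to_numpy())  # [110 220 330]
print(outer)              # [110 220 330]
```

The scalar version avoids building the full table of pairwise products, which may matter if both frames are large.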

CodePudding user response:

You can use broadcasting.

>>> (df_a.values * df_b.values.reshape(1, -1)).sum(axis=1)
array([110, 220])

Explanation:

>>> df_a.values
array([[10],
       [20]])


>>> df_b.values.reshape(1, -1)
array([[5, 6]])


>>> df_a.values * df_b.values.reshape(1, -1)
array([[ 50,  60],
       [100, 120]])


>>> (df_a.values * df_b.values.reshape(1, -1)).sum(axis=1)
array([110, 220])

CodePudding user response:

Why are you using dataframes with one column when you are actually using them as arrays? Just use numpy:

import numpy as np

a = np.array([10, 20])
b = np.array([5, 6])
np.matmul(a, b) + np.matmul(a, b[::-1])
# 330
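
Note that 330 is the grand total of all four products, not the per-value sums [110, 220] from the question. A sketch with plain arrays that returns both, using the illustrative names per_value and total:

```python
import numpy as np

a = np.array([10, 20])
b = np.array([5, 6])

# One sum per value in a: a[i] * (b[0] + b[1])
per_value = a * b.sum()
print(per_value)  # [110 220]

# Grand total of every pairwise product
total = a.sum() * b.sum()
print(total)  # 330
```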