I have two dataframes, both with 6 rows. I want to multiply the values in two selected columns, one from each dataframe, and sum the products:
result = sum(a * b for a, b in zip(list(df1['col1']), list(df2['col3'])))
I do not seem to get what I want. I did the calculation manually in Excel (for one date in my time series), which gave me the expected result. So my question is: did I do something wrong?
CodePudding user response:
If both columns have the same number of rows and the same indices, simply multiply and then use sum:
result = (df1['col1'] * df2['col3']).sum()
If the indices may differ but the lengths are the same:
result = (df1['col1'] * df2['col3'].to_numpy()).sum()
If both the lengths and the indices may differ:
result = (df1['col1'].reset_index(drop=True)
          .mul(df2['col3'].reset_index(drop=True), fill_value=1)
          .sum())
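To illustrate why the index matters in the cases above, here is a minimal sketch. The data and the shifted index on df2 are made up for the example and are not taken from the question. It shows that multiplying two Series pairs rows by index label, while converting one side with to_numpy() pairs them by position:

```python
import pandas as pd

# Hypothetical data: df2 has the same length as df1 but a shifted index.
df1 = pd.DataFrame({'col1': [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({'col3': [10, 20, 30]}, index=[1, 2, 3])

# Index-aligned: only labels 1 and 2 exist on both sides.
# Labels 0 and 3 produce NaN, which sum() skips by default.
aligned = (df1['col1'] * df2['col3']).sum()

# Positional: to_numpy() drops df2's index, so rows are paired in order.
positional = (df1['col1'] * df2['col3'].to_numpy()).sum()

print(aligned, positional)
```

If the pandas result differs from a manual calculation, comparing df1.index and df2.index is a reasonable first check.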
CodePudding user response:
You can do it like this:
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'col1':[0, 1, 2, 3, 4, 5]})
df2 = pd.DataFrame({'col1':[0, 1, 2, 3, 4, 5]})
np.matmul(df1.col1, df2.col1)
This also sums the products.
Your formulation works too, with or without the []:
result = sum([a * b for a, b in zip(list(df1['col1']), list(df2['col1']))])
This gives the same result.
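As a quick sanity check, the sketch below compares three ways of computing the sum of products on the same example data. It assumes both frames share the default index, so positional and index-aligned pairing agree; np.dot on the underlying arrays is used here as one common option:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'col1': [0, 1, 2, 3, 4, 5]})
df2 = pd.DataFrame({'col1': [0, 1, 2, 3, 4, 5]})

# Plain NumPy arrays, so no index alignment is involved.
a = df1['col1'].to_numpy()
b = df2['col1'].to_numpy()

via_numpy = np.dot(a, b)                        # dot product of two 1-D arrays
via_python = sum(x * y for x, y in zip(a, b))   # generator, no [] needed
via_pandas = (df1['col1'] * df2['col1']).sum()  # index-aligned multiply, then sum

print(via_numpy, via_python, via_pandas)
```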
CodePudding user response:
You can just use pandas abstractions for this.
result = df1['col1'] * df2['col3']
If you then want the sum of those values, you can just do:
sum(result)