I would like to multiply every combination of columns taken from two sets of columns.
Let's say we have the dataframe below:
import pandas as pd
df = {'A':[1,2,3], 'B':[4,5,6], 'C':[7,8,9], 'D':[0,1,2]}
df = pd.DataFrame(df)
Now I want to compute the products AC, AD, BC, and BD, that is, multiply every combination of a column from [A, B] with a column from [C, D].
I tried to use itertools but couldn't figure it out.
The desired output looks like this:
output = {'AC':[7,16,27], 'AD':[0,2,6], 'BC':[28,40,54], 'BD':[0,5,12]}
output = pd.DataFrame(output)
CodePudding user response:
IIUC, you can try
import itertools
cols1 = ['A', 'B']
cols2 = ['C', 'D']
for col1, col2 in itertools.product(cols1, cols2):
    # name the new column by concatenating the two source names, e.g. 'A' + 'C' -> 'AC'
    df[col1 + col2] = df[col1] * df[col2]
print(df)
   A  B  C  D  AC  AD  BC  BD
0  1  4  7  0   7   0  28   0
1  2  5  8  1  16   2  40   5
2  3  6  9  2  27   6  54  12
Or, to build a new dataframe instead of adding columns to df:
out = pd.concat([df[col1].mul(df[col2]).to_frame(col1 + col2)
                 for col1, col2 in itertools.product(cols1, cols2)], axis=1)
print(out)
   AC  AD  BC  BD
0   7   0  28   0
1  16   2  40   5
2  27   6  54  12
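If you need this more than once, the same itertools.product idea can be wrapped in a small helper. This is only a sketch: the function name multiply_column_pairs and the sep parameter are made up for illustration and are not part of pandas.

```python
import itertools
import pandas as pd

def multiply_column_pairs(df, cols1, cols2, sep=""):
    """Return a new DataFrame with one product column per (col1, col2) pair."""
    products = {
        # column name is the two source names joined by `sep`, e.g. 'AC' or 'A_C'
        f"{col1}{sep}{col2}": df[col1] * df[col2]
        for col1, col2 in itertools.product(cols1, cols2)
    }
    return pd.DataFrame(products, index=df.index)

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9], 'D': [0, 1, 2]})
out = multiply_column_pairs(df, ['A', 'B'], ['C', 'D'])
print(out)
```

Passing something like sep="_" keeps the generated names unambiguous when the column names are longer than one character.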
CodePudding user response:
Does this work?
df['AC'] = df['A'] * df['C']
CodePudding user response:
You can multiply several columns at once if you convert them to NumPy arrays first with .to_numpy(). The product is element-wise by position, so this gives A*C and B*D:
>>> df[["A","B"]].to_numpy() * df[["C","D"]].to_numpy()
array([[ 7,  0],
       [16,  5],
       [27, 12]])
>>> pairs = ["AC", "AD", "BC", "BD"] # wanted pairs
>>> c1, c2 = list(zip(*pairs)) # unzip pairs (relies on one-character column names)
>>> result = df[list(c1)].to_numpy() * df[list(c2)].to_numpy()
>>> df2 = pd.DataFrame(result, columns=pairs) # new dataframe
>>> df2
   AC  AD  BC  BD
0   7   0  28   0
1  16   2  40   5
2  27   6  54  12
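Unzipping the pair strings only works when every column name is a single character. As a possible variation (a sketch, not something from the answer above), you can keep the two column lists separate and let NumPy broadcasting form every combination. The names left, right, products and df3 are just illustrative.

```python
import itertools
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9], 'D': [0, 1, 2]})
cols1 = ['A', 'B']
cols2 = ['C', 'D']

left = df[cols1].to_numpy()    # shape (n_rows, len(cols1))
right = df[cols2].to_numpy()   # shape (n_rows, len(cols2))

# Broadcast (n, 2, 1) * (n, 1, 2) -> (n, 2, 2), then flatten the last two axes,
# which yields the column order AC, AD, BC, BD.
products = (left[:, :, None] * right[:, None, :]).reshape(len(df), -1)

# itertools.product iterates in the same order as the flattened array
names = [c1 + c2 for c1, c2 in itertools.product(cols1, cols2)]
df3 = pd.DataFrame(products, columns=names, index=df.index)
print(df3)
```

Because the names are built from the lists rather than parsed out of strings, this also works for columns such as 'price' and 'qty'.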