How to apply multiplication within a pandas DataFrame

Time:06-02

Please advise how to get the following output:

import numpy as np
import pandas as pd

df1 = pd.DataFrame([['1, 2', '2, 2','3, 2','1, 1', '2, 1','3, 1']])
df2 = pd.DataFrame([[1, 2, 100, 'x'], [3, 4, 200, 'y'], [5, 6, 300, 'x']])

# relabel rows and columns so that both start at 1 instead of 0
df22 = df2.rename(index = lambda x: x + 1).set_axis(np.arange(1, len(df2.columns) + 1), axis=1)

f = lambda x: df22.loc[tuple(map(int, x.split(',')))]
df = df1.applymap(f)
print (df)
Output:
   0  1  2  3  4  5
0  2  4  6  1  3  5

df1 holds 'addresses' into df2 in row, col format: '1, 2' is the first row, second column (value 2), '2, 2' is 4, '3, 2' is 6, and so on.

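I need to bring in the values from the 3rd and 4th columns to get something like (2*100x, 4*200y, 6*300x, 1*100x, 3*200y, 5*300x), i.e. each looked-up value multiplied by its row's 3rd column and labeled with its row's 4th column.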

The output should be 5000 (the sum over all x's and y's) and 0.28 (1400/5000, the share of y's).

CodePudding user response:

It's not clear to me why you need df1 and df... Maybe your question is lacking some details?

You can compute your values directly:

df22['val'] = (df22[1] + df22[2])*df22[3]

Output:

   1  2    3  4   val
1  1  2  100  x   300
2  3  4  200  y  1400
3  5  6  300  x  3300

From there it's straightforward to compute the sums (total and grouped by column 4):

total = df22['val'].sum() # 5000
y_sum = df22.groupby(4).sum().loc['y', 'val'] # 1400
print(y_sum/total) # 0.28
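
For reference, here is the direct computation as a self-contained sketch that rebuilds df22 from the question's data. Selecting the 'val' column before summing is my own choice (it keeps the other columns out of the aggregation), and the names by_label and y_share are just for illustration.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame([[1, 2, 100, 'x'], [3, 4, 200, 'y'], [5, 6, 300, 'x']])

# Relabel rows and columns to start at 1, as in the question.
df22 = df2.copy()
df22.index = np.arange(1, len(df2) + 1)
df22.columns = np.arange(1, len(df2.columns) + 1)

# (col 1 + col 2) * col 3, row by row
df22['val'] = (df22[1] + df22[2]) * df22[3]

total = df22['val'].sum()                # 5000
by_label = df22.groupby(4)['val'].sum()  # one sum per label in column 4
y_share = by_label.loc['y'] / total      # 1400 / 5000
print(total, y_share)
```

by_label holds the sum for every label at once, so the share of 'x' can be read the same way without a second groupby.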

Edit: if df1 doesn't necessarily contain every (row, column) address for columns 1 and 2, you could loop through it instead. It's not clear from your question why df1 is a DataFrame or whether it can have more than one row, so I flatten it first:

df22['val'] = 0

for c in df1.to_numpy().flatten():
    i, j = map(int, c.split(','))
    df22.loc[i, 'val'] += df22.loc[i, j]*df22.loc[i, 3]

This gives you the same output as above for your example but will ignore values that are not in df1.
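
To make the difference visible, here is a rough sketch that runs the same loop on the full df1 and on a partial one. The helper name weighted_share and the partial address list are made up for illustration; they are not from the question.

```python
import numpy as np
import pandas as pd

def weighted_share(df1, df2, label):
    """Return (total, share of `label`) using only the addresses listed in df1."""
    # Relabel rows and columns to start at 1, as in the question.
    df22 = df2.copy()
    df22.index = np.arange(1, len(df2) + 1)
    df22.columns = np.arange(1, len(df2.columns) + 1)
    df22['val'] = 0
    for c in df1.to_numpy().flatten():
        i, j = map(int, c.split(','))
        # addressed cell times the weight in column 3 of the same row
        df22.loc[i, 'val'] += df22.loc[i, j] * df22.loc[i, 3]
    total = df22['val'].sum()
    return total, df22.loc[df22[4] == label, 'val'].sum() / total

df2 = pd.DataFrame([[1, 2, 100, 'x'], [3, 4, 200, 'y'], [5, 6, 300, 'x']])
full = pd.DataFrame([['1, 2', '2, 2', '3, 2', '1, 1', '2, 1', '3, 1']])
partial = pd.DataFrame([['1, 2', '2, 2', '3, 1']])  # hypothetical subset

full_total, full_share = weighted_share(full, df2, 'y')
part_total, part_share = weighted_share(partial, df2, 'y')
print(full_total, full_share)  # same totals as the direct computation
print(part_total, part_share)  # only the three listed cells contribute
```

With the partial list only 2*100, 4*200 and 5*300 are counted, so the total drops to 2500 and the share of y's becomes 800/2500.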
