Home > database >  Multiply pandas dataframe with a differently shaped dataframe based on condition
Multiply pandas dataframe with a differently shaped dataframe based on condition

Time:06-10

I have a pandas DataFrame (df_A) with this basic form:

|id| alt|  a  |  b  |  c  | d | e |
|--|----|-----|-----|-----|---|---|
| 0| ICV| 0.2 | 1.0 | 0.2 | 0 | 1 |
| 1| ICV| 1.0 | 1.0 | 0.2 | 0 | 0 |
| 2| BEV| 3.2 | 1.0 | 0.2 | 1 | 0 |
| 3| ICV| 2.0 | 1.0 | 0.2 | 0 | 0 |
| 4| BEV| 2.0 | 1.0 | 0.2 | 1 | 1 |

Furthermore I have another DataFrame (df_B):

|id| alt|  a  |  b  |  c  |
|--|----|-----|-----|-----|
| 0| ICV| 0.1 | 0.3 | 0.5 |
| 1| BEV| 0.2 | 0.4 | 0.6 |

What I want to do is multiply the values of the second DataFrame with the values of the first, where the alt value is the same. I also do not want the d or e columns to be involved in the multiplication. So I want a DataFrame (df_C) like this:

|id| alt|    a    |    b    |    c    | d | e |
|--|----|---------|---------|---------|---|---|
| 0| ICV| 0.2*0.1 | 1.0*0.3 | 0.2*0.5 | 0 | 1 |
| 1| ICV| 1.0*0.1 | 1.0*0.3 | 0.2*0.5 | 0 | 0 |
| 2| BEV| 3.2*0.2 | 1.0*0.4 | 0.2*0.6 | 1 | 0 |
| 3| ICV| 2.0*0.1 | 1.0*0.3 | 0.2*0.5 | 0 | 0 |
| 4| BEV| 2.0*0.2 | 1.0*0.4 | 0.2*0.6 | 1 | 1 |

I have tried a few thing unsuccessfully:

list = ["a","b","c"]
df_C = df_A.copy()
for i in ["BEV","ICV"]:
        df_C[list].loc[df_C["alt"]==i] = df_A[list ["alt"]].loc[df_A["alt"]==i].drop(["alt"],axis=1).mul(df_B[df_B["alt"]==i].drop(["alt"],axis=1).to_numpy(),axis=1)

This just returns the original DataFrame (df_A). I know my approach is deeply flawed but I cannot think of any way to do this.

CodePudding user response:

You can use a merge and in place multiplication:

cols = ['a', 'b', 'c']
df_A[cols] *= df_A[['alt']].merge(df_B, how='left')[cols]

output:

   id  alt     a    b     c  d  e
0   0  ICV  0.02  0.3  0.10  0  1
1   1  ICV  0.10  0.3  0.10  0  0
2   2  BEV  0.64  0.4  0.12  1  0
3   3  ICV  0.20  0.3  0.10  0  0
4   4  BEV  0.40  0.4  0.12  1  1

intermediate:

df_A[['alt']].merge(df_B, how='left')[cols]

     a    b    c
0  0.1  0.3  0.5
1  0.1  0.3  0.5
2  0.2  0.4  0.6
3  0.1  0.3  0.5
4  0.2  0.4  0.6
  • Related