I have a pandas DataFrame (df_A) with this basic form:
|id| alt| a | b | c | d | e |
|--|----|-----|-----|-----|---|---|
| 0| ICV| 0.2 | 1.0 | 0.2 | 0 | 1 |
| 1| ICV| 1.0 | 1.0 | 0.2 | 0 | 0 |
| 2| BEV| 3.2 | 1.0 | 0.2 | 1 | 0 |
| 3| ICV| 2.0 | 1.0 | 0.2 | 0 | 0 |
| 4| BEV| 2.0 | 1.0 | 0.2 | 1 | 1 |
Furthermore I have another DataFrame (df_B):
|id| alt| a | b | c |
|--|----|-----|-----|-----|
| 0| ICV| 0.1 | 0.3 | 0.5 |
| 1| BEV| 0.2 | 0.4 | 0.6 |
What I want to do is multiply the values of the second DataFrame with the values of the first, where the alt value is the same. I also do not want the d or e columns to be involved in the multiplication. So I want a DataFrame (df_C) like this:
|id| alt| a | b | c | d | e |
|--|----|---------|---------|---------|---|---|
| 0| ICV| 0.2*0.1 | 1.0*0.3 | 0.2*0.5 | 0 | 1 |
| 1| ICV| 1.0*0.1 | 1.0*0.3 | 0.2*0.5 | 0 | 0 |
| 2| BEV| 3.2*0.2 | 1.0*0.4 | 0.2*0.6 | 1 | 0 |
| 3| ICV| 2.0*0.1 | 1.0*0.3 | 0.2*0.5 | 0 | 0 |
| 4| BEV| 2.0*0.2 | 1.0*0.4 | 0.2*0.6 | 1 | 1 |
I have tried a few thing unsuccessfully:
list = ["a","b","c"]
df_C = df_A.copy()
for i in ["BEV","ICV"]:
df_C[list].loc[df_C["alt"]==i] = df_A[list ["alt"]].loc[df_A["alt"]==i].drop(["alt"],axis=1).mul(df_B[df_B["alt"]==i].drop(["alt"],axis=1).to_numpy(),axis=1)
This just returns the original DataFrame (df_A). I know my approach is deeply flawed but I cannot think of any way to do this.
CodePudding user response:
You can use a merge
and in place multiplication:
cols = ['a', 'b', 'c']
df_A[cols] *= df_A[['alt']].merge(df_B, how='left')[cols]
output:
id alt a b c d e
0 0 ICV 0.02 0.3 0.10 0 1
1 1 ICV 0.10 0.3 0.10 0 0
2 2 BEV 0.64 0.4 0.12 1 0
3 3 ICV 0.20 0.3 0.10 0 0
4 4 BEV 0.40 0.4 0.12 1 1
intermediate:
df_A[['alt']].merge(df_B, how='left')[cols]
a b c
0 0.1 0.3 0.5
1 0.1 0.3 0.5
2 0.2 0.4 0.6
3 0.1 0.3 0.5
4 0.2 0.4 0.6