I have the following dataframe which list certain transactions
df1
ProductId Code Cost
1 101 £1000
1 101 £2000
2 101 £1500
2 101 £1000
However I need to change the data to a lower level of data. And the information of this is found in df2
df2
ProductId Code2 Percent
1 201 0.25
1 301 0.75
2 201 0.60
2 301 0.40
Therefore, the resulting dataframe should look like this:
df1
ProductId Code Cost
1 201 £250
1 301 £750
1 201 £500
1 301 £1500
2 201 £900
2 301 £600
2 201 £600
2 301 £400
The resulting dataframe will need have the rows replicate to accomodate the lower level code. (I.e every ProductId = 1 will have code 101 broken into 201 and 301 with the costs being adjust per the percentage change).
I have started off by replicating the rows using pandas concat:
df_new = pd.concat([df1]*2, ignore_index=True)
However, I am struggling to figure a way to effectively change the data to:
- Change the code (to 201 and 301 for each ProductId)
- Change the cost (as per the percentage)
Any guidance would be greatly appreciated. Thanks
CodePudding user response:
You can do merge
out = df1.merge(df2,on = 'ProductId').assign(Cost = lambda x : x['Cost']*x['Percent'])
Out[509]:
ProductId Code Cost Code2 Percent
0 1 101 250.0 201 0.25
1 1 101 750.0 301 0.75
2 1 101 500.0 201 0.25
3 1 101 1500.0 301 0.75
4 2 101 900.0 201 0.60
5 2 101 600.0 301 0.40
6 2 101 600.0 201 0.60
7 2 101 400.0 301 0.40