I have two dataframes:
data1 = {'Item': ['A', 'B', 'C', 'N'], 'Price': [1, 2, 3, 10], 'Category': ['X', 'Y', 'X', 'Z'], 'County': ['K', 'L', 'L', 'K']}
df1 = pd.DataFrame(data1)
df1
data2 = {'Category': ['X', 'Y', 'Z'], 'Value retained': [0.1, 0.2, 0.8]}
df2 = pd.DataFrame(data2)
df2
How do I multiply 'Value retained' by 'Price' following their respective Category and add the result as a new column in df1?
I've searched a lot for a solution and tried several different things, among them:
df3 = df1
for cat, VR in df2['Category', 'Value retained']:
if cat in df1.columns:
df3[cat] = df1['Price'] * VR
and
df3 = df1['Price'] * df2.set_index('Category')['Value retained']
df3
In my real dataframe I have 250k items and 32 categories with different values of 'value retained'.
I really appreciate any help for a newbie in Python coding.
CodePudding user response:
You can use this,
import pandas as pd
data1 = {'Item': ['A', 'B', 'C', 'N'], 'Price': [1, 2, 3, 10], 'Category': ['X', 'Y', 'X', 'Z'], 'County': ['K', 'L', 'L', 'K']}
df1 = pd.DataFrame(data1)
data2 = {'Category': ['X', 'Y', 'Z'], 'Value_retained': [0.1, 0.2, 0.8]}
df2 = pd.DataFrame(data2)
df = df1.merge(df2, how='left')
df['Values'] = df.Price * df.Value_retained
print(df)
The output is,
Item Price Category County Value_retained Values
0 A 1 X K 0.1 0.1
1 B 2 Y L 0.2 0.4
2 C 3 X L 0.1 0.3
3 N 10 Z K 0.8 8.0
CodePudding user response:
You're 2nd approach would work if both dataframes have Category
as index, but since you can't set_index
on Category
in df1
(because you have duplicated entries) you need to do a left merge on the two df based on the column Category
and then multiply.
df3 = df1.merge(df2, on='Category', how='left')
df3['result'] = df3['Price'] * df3['Value retained']
print(df3)
Item Price Category County Value retained result
0 A 1 X K 0.1 0.1
1 B 2 Y L 0.2 0.4
2 C 3 X L 0.1 0.3
3 N 10 Z K 0.8 8.0