Problem with pandas
I have a following dataframe:
product code trends_vs trends_100_0
1 aaaaaaa it 150 100
6 aaaaaaa pl <NA> 90
11 aaaaaaa ro <NA> 82
15 aaaaaaa sk <NA> 13
21 bbbbb ro 260 100
25 bbbbb cz <NA> 62
29 bbbbb sk <NA> 42
I want to calculate values from trends_vs column the following way:
each product in a ‘product’ column have ONLY one trends_vs value which corresponds with trends_100_0 - 100 value.
And now to get trends_vs value for second row we have to do 150 * (90/100)
The output and logic should look like this:
product code trends_vs trends_100_0
1 aaaaaaa it 150 100
6 aaaaaaa pl 135 90 ---> 150 * (90/100)
11 aaaaaaa ro 123 82 ---> 150 * (82/100)
15 aaaaaaa sk 19.5 13 ---> 150 * (13/100)
21 bbbbb ro 260 100
25 bbbbb cz 161.2 62 ---> 260 * (62/100) # 260 instead of 150 because its a different product 'bbbbb'
29 bbbbb sk 109.2 42 ---> 260 * (42/100)
I tried with groupby or map but something not work as expected. I have a solution when each column should corresponds with each other in a row but in this scenario where we should calculate based on one main value for each product?
CodePudding user response:
Another method without groupby
:
df['trends_vs'] = (df.sort_values(['product', 'trends_100_0'])['trends_vs'].bfill()
* df['trends_100_0'] / 100)
print(df)
# Output
product code trends_vs trends_100_0
1 aaaaaaa it 150.0 100
6 aaaaaaa pl 135.0 90
11 aaaaaaa ro 123.0 82
15 aaaaaaa sk 19.5 13
21 bbbbb ro 260.0 100
25 bbbbb cz 161.2 62
29 bbbbb sk 109.2 42
CodePudding user response:
use .groupby.transform
and then a simple .mul
with .div
we can use .fillna
to target your rows.
df["trends_vs"] = df["trends_vs"].fillna(
df.groupby("product")["trends_vs"]
.transform("min")
.mul((df["trends_100_0"].div(100)))
)
product code trends_vs trends_100_0
1 aaaaaaa it 150.0 100
6 aaaaaaa pl 135.0 90
11 aaaaaaa ro 123.0 82
15 aaaaaaa sk 19.5 13
21 bbbbb ro 260.0 100
25 bbbbb cz 161.2 62
29 bbbbb sk 109.2 4
CodePudding user response:
This will allow you to find the trends_vs based on product and the trends_100_0 based on the results
df['trends_vs'] = df.groupby('product')['trends_vs'].transform(max)
df['trends_vs'] = df['trends_vs'] * (df['trends_100_0']/100)
df