Essentially, I am working with a DataFrame and trying to multiply a column by one of two factors depending on a condition. If the value in Order description == Internet Port Charge, the amount column needs to be multiplied by .33, and if not, by 1.9. I keep getting a ValueError. Thank you!
for x in max_sales:
    if max_sales['Order description'] == 'Internet Port Charge':
        max_sales['amount'] * .33
    else:
        max_sales['amount'] * 111.9
      1 for x in max_sales:
----> 2     if max_sales['Order description'] == 'Internet Port Charge':
      3         max_sales['amount'] * .33
      4     else:
      5         max_sales['amount'] * 111.9

~\Anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1535     @final
   1536     def __nonzero__(self):
-> 1537         raise ValueError(
   1538             f"The truth value of a {type(self).__name__} is ambiguous. "
   1539             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
CodePudding user response:
If these are the only conditions, use .loc with a boolean mask and the column label in a single indexing call, then assign the scaled values back. Chaining .loc[...]['amount'] assigns to a temporary copy and leaves max_sales unchanged:
# Boolean mask: True for the rows that are Internet Port Charges
mask = max_sales['Order description'] == 'Internet Port Charge'
max_sales.loc[mask, 'amount'] = max_sales.loc[mask, 'amount'] * 0.33
max_sales.loc[~mask, 'amount'] = max_sales.loc[~mask, 'amount'] * 1.9
I don't see what the for x in max_sales is supposed to do, seeing as x isn't used again later.
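As for where the ValueError comes from, here is a minimal sketch. The two-row DataFrame is made up for illustration and is not your real data. Comparing a whole column to a string gives one True/False per row, and `if` cannot reduce that to a single answer. Iterating over a DataFrame directly also only yields the column labels, not the rows.

```python
import pandas as pd

# Made-up stand-in for max_sales (assumption, not the asker's data)
max_sales = pd.DataFrame({
    "Order description": ["Internet Port Charge", "Foobar"],
    "amount": [20.0, 20.0],
})

# Iterating over a DataFrame yields its column labels
labels = list(max_sales)

# The comparison is evaluated for every row at once
mask = max_sales["Order description"] == "Internet Port Charge"

# Using the whole Series as an `if` condition raises the ValueError
try:
    if mask:
        pass
except ValueError as exc:
    message = str(exc)

print(labels)
print(mask.tolist())
print(message)
```

So the fix is to use the boolean Series as a mask for selecting rows, rather than as the condition of an `if`.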
CodePudding user response:
You could use NumPy's .where():
import numpy as np
max_sales['amount'] = np.where(
    max_sales['Order description'] == 'Internet Port Charge',
    max_sales['amount'] * .33,
    max_sales['amount'] * 111.9
)
This looks for rows where the condition is met and multiplies those values by 0.33. Where the condition is False, it multiplies by 111.9. It's also significantly faster (and cleaner) than iterating over the DataFrame.
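A variation on the same idea, sketched with made-up data (the `sales` frame and the `factor` name are only for illustration): let np.where pick the multiplier for each row first, then multiply the column once.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only
sales = pd.DataFrame({
    "Order description": ["Internet Port Charge", "Foobar"],
    "amount": [100.0, 100.0],
})

# One multiplier per row: 0.33 where the condition holds, 111.9 elsewhere
factor = np.where(
    sales["Order description"] == "Internet Port Charge",
    0.33,
    111.9,
)

# Single vectorized multiplication over the whole column
sales["amount"] = sales["amount"] * factor
print(sales)
```

This keeps the two factors visible in one place, which may be easier to adjust later.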
CodePudding user response:
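for index, row in max_sales.iterrows():
    # row is a copy, so write back through .at instead of assigning to row
    if row['Order description'] == 'Internet Port Charge':
        max_sales.at[index, 'amount'] = row['amount'] * 0.33
    else:
        max_sales.at[index, 'amount'] = row['amount'] * 111.9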
You can loop through the DataFrame using .iterrows() to access each row individually.
.iterrows() is very resource intensive though, so a vectorized approach is preferable for large DataFrames.
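One thing to watch with .iterrows(): the row it yields is generally a copy, so assigning to it does not change the DataFrame. The sketch below uses made-up data and mixed column types (text plus floats) to show the difference between writing to the row and writing through .at.

```python
import pandas as pd

# Made-up data for illustration only
df = pd.DataFrame({
    "Order description": ["Internet Port Charge", "Foobar"],
    "amount": [20.0, 20.0],
})

# Assigning to the yielded row only changes a temporary Series
for index, row in df.iterrows():
    row["amount"] = row["amount"] * 0.33
unchanged = df["amount"].tolist()

# Writing through df.at updates the DataFrame itself
for index, row in df.iterrows():
    factor = 0.33 if row["Order description"] == "Internet Port Charge" else 111.9
    df.at[index, "amount"] = row["amount"] * factor

print(unchanged)
print(df)
```

If the amount column holds integers, it is probably safest to convert it to float first (for example with astype(float)) before writing fractional results into it.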
CodePudding user response:
You can use apply and lambda:
import pandas as pd
# Set up dummy data
df = [
    ["Internet Port Charge", 20],
    ["Foobar", 20]
]
df = pd.DataFrame(df, columns=["Order description", "amount"])
#       Order description  amount
# 0  Internet Port Charge      20
# 1                Foobar      20

# Use apply and lambda
df["amount"] = df.apply(
    lambda x: x["amount"]*0.33 if x["Order description"] == "Internet Port Charge" \
        else x["amount"]*111.9,
    axis=1)
#       Order description  amount
# 0  Internet Port Charge     6.6
# 1                Foobar  2238.0