Create new column based on values in each row in another column

I was wondering if someone would be able to help me with the following. I have this dataframe:

import pandas as pd

data = {'Price': [60.50, 5.20, 7, 20.16, 73.50, 12.55, 8.70, 6, 54.10, 89.40, 12, 55.50, 6, 120, 13.20],
        'Discount': [1, 1, 1, 0.5, 0.4, 1, 0.3, 0.2, 1, 1, 1, 1, 1, 0.1, 0.9]}
df = pd.DataFrame(data=data)

What I am trying to do is iterate through each row in the df and, if the Discount is 1, populate Amount off for that row with 0. Otherwise, Amount off should be the amount taken off according to the discount (Price * Discount). I have the following code to do this:

for index, row in df.iterrows():
    if row['Discount'] == 1:
        df['Amount off'] = 0
    else:
        df['Amount off'] = df['Price']*df['Discount']

However, once I run it, it does not produce the correct amount off when the discount is 1. Is anyone able to give me some hints and directions on where I am going wrong please?

Thank you in advance!

CodePudding user response：

Your mistake is that you are manipulating the full dataframe when calling, e.g.:

df['Amount off'] = 0

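Thus, after executing this line, the whole column is 0. Each pass through the loop overwrites every row, so depending on the last row you will end up with either only 0 or df['Price']*df['Discount'] in every row. In your data the last row has a Discount of 0.9, so the else branch runs last and the rows with Discount 1 end up holding Price * 1 instead of 0.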
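To make the overwrite visible, here is a minimal sketch that reruns the loop from the question on a shortened sample. The four rows are picked from the original data purely for illustration, and the last one deliberately has a Discount other than 1.

```python
import pandas as pd

# Shortened, illustrative sample (not the full data from the question)
df = pd.DataFrame({
    'Price': [60.50, 5.20, 7, 20.16],
    'Discount': [1, 1, 0.5, 0.9],
})

for index, row in df.iterrows():
    if row['Discount'] == 1:
        # Assigns 0 to the whole column, not just to this row
        df['Amount off'] = 0
    else:
        # Also recomputes the whole column
        df['Amount off'] = df['Price'] * df['Discount']

# The last row has Discount 0.9, so the else branch ran last:
# rows with Discount == 1 now hold Price * 1 rather than 0.
print(df)
```

Only the branch taken for the final row determines what the column contains at the end.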

You can use:

df['Amount off'] = df['Price'] * df['Discount']
df.loc[df['Discount'] == 1, 'Amount off'] = 0
df.head()

First, I calculate the full column, then change only the values where df['Discount'] == 1.
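The same two steps can probably also be written as one expression with numpy.where, which is a different technique from the df.loc assignment above. This is a sketch on a small made-up sample, assuming NumPy is available alongside pandas.

```python
import numpy as np
import pandas as pd

# Illustrative sample, shortened from the data in the question
df = pd.DataFrame({
    'Price': [60.50, 5.20, 20.16, 73.50],
    'Discount': [1, 1, 0.5, 0.4],
})

# np.where(condition, value_if_true, value_if_false) works element-wise:
# 0 where Discount is 1, Price * Discount everywhere else
df['Amount off'] = np.where(df['Discount'] == 1, 0, df['Price'] * df['Discount'])

print(df)
```

Both versions avoid the Python-level loop. Which one reads better is largely a matter of taste.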

CodePudding user response：

Hi, as Jacob put it well, you need to rewrite your script as shown below:

# No loop is needed: both statements act on the whole column at once
df['Amount off'] = df['Price'] * df['Discount']
df.loc[df['Discount'] == 1, 'Amount off'] = 0
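If a genuine row-by-row loop is still wanted, one option is to write to a single cell per iteration with df.at instead of assigning to the whole column. The following is a sketch on a made-up three-row sample. It will generally be slower than the column-wise version on large frames.

```python
import pandas as pd

# Illustrative sample only
df = pd.DataFrame({'Price': [60.50, 20.16, 12.0], 'Discount': [1, 0.5, 1]})

# Create the column up front so every row starts at 0
df['Amount off'] = 0.0

for index, row in df.iterrows():
    if row['Discount'] != 1:
        # df.at sets exactly one cell: this row, this column
        df.at[index, 'Amount off'] = row['Price'] * row['Discount']

print(df)
```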