Need help adding a new column to the existing DF by summing up two columns.
Here is the DF that I have:
How do I sum salay_1 and salay_2, and include the summed value in a new column if one of the row value is 0.00. I would like to throw an error if both columns (salay_1 and salay_2) have values greater than 0.00
e.g
Scenarios that could encounter
Hen's Salary_1 =0 , and Salary_2 =0 -> New col = 0.00
Hen's Salary_1 =0 , and Salary_2 =10 -> New col = 10.00
Hen's Salary_1 =20 , and Salary_2 =0 -> New col = 20.00
Hen's Salary_1 =20 , and Salary_2 =5 -> throw an error
CodePudding user response:
You can use a test to check whether there is more than one non-zero value.
Best is not to trigger an error but to use a replacement value in case of error (here -1):
m = df.filter(like='salary').ne(0).sum(1).le(1)
df['new_col'] = df.filter(like='salary').sum(1).where(m, -1)
output:
salary1 salary2 new_col
0 0 0 0
1 10 0 10
2 0 100 100
3 10 10 -1
If really you want an error, use:
m = df.filter(like='salary').ne(0).sum(1).le(1)
assert not m.any()
CodePudding user response:
You can apply
a function that checks the conditions (both equal to zero, both larger than zero, and only one larger than zero).
df = pd.DataFrame(data=[['Hen', 0, 0], ['Billy', 10, 0], ['Don', 0, 100], ['Mason', 0, 200], ['Bob', 20, 0]], columns=['name', 'salary_1', 'salary_2'])
def my_func(s):
if s['salary_1'] == 0 and s['salary_2'] == 0:
return 0
elif s['salary_1'] > 0 and s['salary_2'] > 0:
return "Error"
else:
return s['salary_1'] s['salary_2']
df['new_col'] = df.apply(my_func, axis=1)
output:
name salary_1 salary_2 new_col
0 Hen 0 0 0
1 Billy 10 0 10
2 Don 0 100 100
3 Mason 0 200 200
4 Bob 20 0 20
CodePudding user response:
for idx, rows in df.iterrows():
if df.sal_1[idx] > 0 and df.sal_2[idx] > 0:
raise Exception("Sorry, no numbers below zero")
else:
df["new_col"][idx] = df.sal_1[idx] df.sal_2[idx]
I think this works. Throwing an error should may not be the best case. Therefore, you may prefer to fill it by -1 instead.
...
if df.sal_1[idx] > 0 and df.sal_2[idx] > 0:
df["new_col"][idx] = -1
...
CodePudding user response:
this works too :
sum_column = df["salary_1"] df["salary_2"] df["new_col"] = sum_column
but I am not sure how I am going to thrown an error if both slary1 and salary_2 have values grater than 0.00