Home > Net >  Panda DF - Sum and combine two column into a new column
Panda DF - Sum and combine two column into a new column

Time:09-01

Need help adding a new column to the existing DF by summing up two columns.

Here is the DF that I have:

enter image description here

How do I sum salay_1 and salay_2, and include the summed value in a new column if one of the row value is 0.00. I would like to throw an error if both columns (salay_1 and salay_2) have values greater than 0.00

enter image description here

e.g

Scenarios that could encounter

Hen's Salary_1 =0 , and Salary_2 =0 -> New col = 0.00

Hen's Salary_1 =0 , and Salary_2 =10 -> New col = 10.00

Hen's Salary_1 =20 , and Salary_2 =0 -> New col = 20.00

Hen's Salary_1 =20 , and Salary_2 =5 -> throw an error

CodePudding user response:

You can use a test to check whether there is more than one non-zero value.

Best is not to trigger an error but to use a replacement value in case of error (here -1):

m = df.filter(like='salary').ne(0).sum(1).le(1)
df['new_col'] = df.filter(like='salary').sum(1).where(m, -1)

output:

   salary1  salary2  new_col
0        0        0        0
1       10        0       10
2        0      100      100
3       10       10       -1

If really you want an error, use:

m = df.filter(like='salary').ne(0).sum(1).le(1)
assert not m.any()

CodePudding user response:

You can apply a function that checks the conditions (both equal to zero, both larger than zero, and only one larger than zero).

df = pd.DataFrame(data=[['Hen', 0, 0], ['Billy', 10, 0], ['Don', 0, 100], ['Mason', 0, 200], ['Bob', 20, 0]], columns=['name', 'salary_1', 'salary_2'])

def my_func(s):
    if s['salary_1'] == 0 and s['salary_2'] == 0:
        return 0
    elif s['salary_1'] > 0 and s['salary_2'] > 0:
        return "Error"
    else:
        return s['salary_1']   s['salary_2']

df['new_col'] = df.apply(my_func, axis=1)

output:

        name        salary_1    salary_2    new_col
0       Hen         0           0           0
1       Billy       10          0           10
2       Don         0           100         100
3       Mason       0           200         200
4       Bob         20          0           20

CodePudding user response:

for idx, rows in df.iterrows():
    if df.sal_1[idx] > 0 and df.sal_2[idx] > 0:
        raise Exception("Sorry, no numbers below zero")
    
    else:
        df["new_col"][idx] = df.sal_1[idx]   df.sal_2[idx]

I think this works. Throwing an error should may not be the best case. Therefore, you may prefer to fill it by -1 instead.

...
    if df.sal_1[idx] > 0 and df.sal_2[idx] > 0:
            df["new_col"][idx] = -1
...

CodePudding user response:

this works too :

sum_column = df["salary_1"] df["salary_2"] df["new_col"] = sum_column

but I am not sure how I am going to thrown an error if both slary1 and salary_2 have values grater than 0.00

  • Related