Numpy "where" with multiple conditions add apply pandas dataframe-CodePudding

I try to add a new column "Result" to a dataframe "merged" which it contains the integers merged['Rate1'] and merged['Rate'] if the "Quantity Total" value between 0 and 500 then merged['Rate'] assign for merged['Rate1']. if the "Quantity Total" value is between 500 and 1000,then calculate merged['Rate1']*0.1 merged['Rate']*0.9 for "result" . if the "Quantity Total" value is between 1000 and 1500, then calculate merged['Rate1']*0.2 merged['Rate']*0.8 for "result" . if the "Quantity Total" value is between 1500 and 2000, then calculate merged['Rate1']*0.5 merged['Rate']*0.5 for "result". if the "Quantity Total" value is between 2000 and 2500, then calculate merged['Rate1']*0.25 merged['Rate']*0.75 for "result". if the "Quantity Total" value is between 2500 and 999999999, then it will be used just merged['Rate1']. I mean it will be the same rate for just last case.

I try to use np.where from numpy. Or trying to use dataframe.loc.

Any idea to help me please?

Thank you in advance

My input (1);

merged >>my dataframe is 2 rows. The most important column that I will use in the calculation is Quantity Total. I mean merged['Quantity Total Price']. this column has the number 1000.

Quantity Total Price
1000

My input (2); merged >> Same as the dataframe in item 1 above, but another column. I mean merged['Rate1']. this column has the number 14.5.

Rate1
14.5

My input (3); merged >> merged['Rate']. this column has the number 15.

Rate
15

My expected output;

if merged['Quantity Total'] >= 0 & merged['Quantity Total'] <= 500 then, I want to use merged['Rate'] directly instead of merged['Rate1'] by creating a new column.

For example, Quantity Total First: 300  and according my if statement 300 is between 0 and 500 then, create new column and write "15" I mean merged['Rate']



 merged["Quantity Total"] >= 500) & (merged["Quantity Total"] <= 1000) then,
calculate >> merged['Rate']*0.9   merged['Rate1']*0.1 and write result columns.
 
other cases are listed above.

CodePudding user response：

It is a bit difficult for me to understand exactly what you want to do (this could be due to my non-native english). I would suggest including a mock example of your input and a mock example of the output you want to get for making your question clearer.

From what I can understand however, using apply might be a good option for you, since you have a lot of custom conditions among different columns.

def test_function(row):
    a= getattr(row,"A")
    b= getattr(row,"B")
    
    #test_output = "unforeseen"
    
    if b == "b" and a != "a": 
        test_output = "wrong a"
    if b != "b" and a == "a": 
        test_output = "wrong b"
    if b != "b" and a != "a": 
        test_output = "wrong a & b"
    if b == "b" and a == "a": 
        test_output = "correct a & b"
    return(test_output)
        
test_df = pd.DataFrame(list(zip(["a","a","c","c"],
                                ["b","c","b","c"])),
                       columns = ["A","B"])

test_df.apply(lambda x: test_function(x), axis =1)

CodePudding user response：

share_dict = {0: 0, 500: 0.1, 1000: 0.2, 1500: 0.5, 2000: 0.75, 2500: 1}  # map for Rate1


def rates(key_list, total):  # recursive function to return Rate1 and Rate shares
    key_list = sorted(key_list)
    if total >= key_list[-1]:
        return share_dict[key_list[-1]], 1 - share_dict[key_list[-1]]
    return rates(key_list[:-1], total)


# creating new calculated column with lambda
merged['calculated'] = merged.apply(lambda x: x['Rate1'] * rates(share_dict.keys(), x['Quantity Total'])[0]  
                                              x['Rate'] * rates(share_dict.keys(), x['Quantity Total'])[1], axis=1)

output:

Rate  Rate1  Quantity Total  calculated
2.5    1.5            3527        1.50
10.0   13.0               0       10.00
12.5    7.5            2190        8.75
11.5    0.5             967       10.40
4.5    0.5             516        4.10
9.0    5.5              20        9.00