Home > Back-end >  Creating a new column based on if-elif-else condition with np.where in python
Creating a new column based on if-elif-else condition with np.where in python

Time:06-25

I have a DataFrame df: "merged"

Quantity Total Price Rate Rate1
2000000 15 14.5
I want to create a new column based on the following criteria:

1-) if 0 > row A(Quantity Total Price) <= 50000000 :row C(Rate1) will be same with row B(Rate) with another result column


2-) if 50000000 > row A(Quantity Total Price) <= 500000000 :row C(Rate1) will be calculated like another result column >>>> merged['Rate']*0.9   merged['Rate1']*0.1

3-) if 500000000 > row A(Quantity Total Price) <= 2000000000 :row C(Rate1) will be calculated like another result column >>>> merged['Rate']*0.8   merged['Rate1']*0.2

4-) if 2000000000 > row A(Quantity Total Price) <= 4000000000 :row C(Rate1) will be calculated like another result column >>>> merged['Rate']*0.5   merged['Rate1']*0.5

5-) if 4000000000 > row A(Quantity Total Price) <= 6000000000 :row C(Rate1) will be calculated like another result column >>>> merged['Rate']*0.25   merged['Rate1']*0.75


6-)if 6000000000 > row A(Quantity Total Price) <= 99999999999999 :row C(Rate1) will be stay same at another result column.

  

My expected output (example for all condition)

İf Quantity Total First suppose first result: 20.000.000

İf Quantity Total First suppose second result: 100.000.000

İf Quantity Total First suppose third result: 700.000.000

İf Quantity Total First suppose fourth result: 3.000.000.000

İf Quantity Total First suppose fifth result: 5.000.000.000

İf Quantity Total First suppose sixth result: 7.000.000.000

Result
15
14.95
14.9
14.75
14.625
14.5

For typical if else cases I do np.where but I take a error like ValueError: Length of values (5) does not match length of index (1)

My code; 

merged['Rate1'] = np.where(
    [merged['Quantity Total First'] <= 500000000,
     (merged["Quantity Total First"] >= 50000000) & (merged["Quantity Total First"] <= 500000000),
     (merged["Quantity Total First"] >= 500000000) & (merged["Quantity Total First"] <= 2000000000),
     (merged["Quantity Total First"] >= 2000000000) & (merged["Quantity Total First"] <= 4000000000),
     (merged["Quantity Total First"] >= 4000000000) & (merged["Quantity Total First"] <= 6000000000),
     ],
    [merged['Rate'],
     merged['Rate']*0.9   merged['Rate1']*0.1,
     merged['Rate']*0.8   merged['Rate1']*0.2,
     merged['Rate']*0.5   merged['Rate1']*0.5,
     merged['Rate']*0.25   merged['Rate1']*0.75
     ],
    data_state2['Rate1']
)
  

Can you pls help me? You can coding from the beginning. Thnx

CodePudding user response:

You can use pandas.cut to map your values.

NB. for clarity, I divided all values by 1 million.

# list of bins used to categorize the values
bins = [0, 50, 500, 2000, 4000, 6000, float('inf')]
# matching factors to map ]0-50] -> 1 ; ]50-500] -> 0.9, etc.
factors = [1, 0.9, 0.8, 0.5, 0.25, 0]

# get the factors and convert to float
f = pd.cut(df['Quantity Total Price'], bins=bins, labels=factors).astype(float)

# use the factors in numerical operation
df['Result'] = df['Rate']*f df['Rate1']*(1-f)

output:

   Quantity Total Price  Rate  Rate1  Result
0                    20    15   14.5  15.000
1                   100    15   14.5  14.950
2                   700    15   14.5  14.900
3                  3000    15   14.5  14.750
4                  5000    15   14.5  14.625
5                  7000    15   14.5  14.500

used input:

df = pd.DataFrame({'Quantity Total Price': [20, 100, 700, 3000, 5000, 7000],
                   'Rate': [15]*6, 'Rate1': [14.5]*6})
  • Related