I have a DataFrame like the one below:
df:
PAN_NO COST_VALUE
AAA -0.001
BBB 2938080
CCC 49224091
DDD 100
EEE 50236272.32
I am trying to create a new column based on the conditions below:
If df['cost_value'] >= -0.001 and df['cost_value'] <= 299985.0 then cost_value_group should be 1
If df['cost_value'] > 299985.0 and df['cost_value'] <= 2938082.40 then cost_value_group should be 2
If df['cost_value'] > 2938082.40 and df['cost_value'] <= 17399130.0 then cost_value_group should be 3
If df['cost_value'] > 17399130.0 and df['cost_value'] <= 49224091.375 then cost_value_group should be 4
If df['cost_value'] > 49224091.375 then cost_value_group should be 5
Else it should be 6
EXPECTED OUTPUT:
PAN_NO COST_VALUE COST_VALUE_Group
AAA -0.001 1
BBB 2938080 2
CCC 49224091 4
DDD 100 1
EEE 50236272.32 5
I tried:
def cost_value(x):
    if df['cost_value'] >= -0.001 and df['cost_value'] <= 299985.0:
        return 1
    elif df['cost_value'] > 299985.0 and df['cost_value'] <= 2938082.40:
        return 2
    elif df['cost_value'] > 2938082.40 and df['cost_value'] <= 17399130.0:
        return 3
    elif df['cost_value'] > 17399130.0 and df['cost_value'] <= 49224091.375:
        return 4
    elif df['cost_value'] > 49224091.375:
        return 5
    else:
        return 6
df['cost_value_group'] = df['cost_value'].apply(cost_value)
I am getting a ValueError saying that the truth value of a Series is ambiguous.
Can someone please help me with this?
CodePudding user response:
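You are on the right path, but the function never uses its argument x. Every branch compares the whole column df['cost_value'], so each `if` receives a Series of booleans instead of a single True/False, and that is what raises the ValueError.
Replace df['cost_value'] with x inside cost_value, then apply the function element by element:
df['cost_value_group'] = df['cost_value'].apply(cost_value)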
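For reference, here is a minimal sketch of both the element-wise fix and a vectorized alternative. It assumes the column is named cost_value in lowercase (the sample table shows COST_VALUE) and that the lower bound of group 1 is -0.001, as in the attempted code. The numpy.select variant and the cost_value_group_vec column name are my own additions for illustration, not something from the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the lowercase column name is an assumption.
df = pd.DataFrame({
    "PAN_NO": ["AAA", "BBB", "CCC", "DDD", "EEE"],
    "cost_value": [-0.001, 2938080, 49224091, 100, 50236272.32],
})

def cost_value(x):
    # x is a single number here, so every comparison yields a plain bool.
    if -0.001 <= x <= 299985.0:
        return 1
    elif 299985.0 < x <= 2938082.40:
        return 2
    elif 2938082.40 < x <= 17399130.0:
        return 3
    elif 17399130.0 < x <= 49224091.375:
        return 4
    elif x > 49224091.375:
        return 5
    else:
        return 6

# Element-wise version: apply calls cost_value once per value.
df["cost_value_group"] = df["cost_value"].apply(cost_value)

# Vectorized version: build boolean Series with & (not `and`),
# then let numpy.select pick the first matching choice per row.
s = df["cost_value"]
conditions = [
    (s >= -0.001) & (s <= 299985.0),
    (s > 299985.0) & (s <= 2938082.40),
    (s > 2938082.40) & (s <= 17399130.0),
    (s > 17399130.0) & (s <= 49224091.375),
    s > 49224091.375,
]
df["cost_value_group_vec"] = np.select(conditions, [1, 2, 3, 4, 5], default=6)

print(df)
```

With the conditions exactly as listed, CCC (49224091) lands in group 4 because it is still below 49224091.375, and EEE (50236272.32) lands in group 5. Group 6 is only reached by values below -0.001 or by missing values.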