Hi, I would like to transform my numeric variable so that if a value exceeds 1,000 it becomes null or NA; otherwise the original value should be kept. Below is my code.
df['PREMIUM'] = pd.to_numeric( df["PREMIUM"])
df['PREMIUM_V2'] = np.where(df['PREMIUM']>1000,np.NaN,df['PREMIUM'])
I tried this, but PREMIUM_V2 is no longer numeric; its dtype became object.
CodePudding user response:
Use mask:
df = pd.DataFrame({'PREMIUM': [0,1,100,10000]})
df['PREMIUM2'] = df['PREMIUM'].mask(df['PREMIUM'].gt(1000))
Output:
   PREMIUM  PREMIUM2
0        0       0.0
1        1       1.0
2      100     100.0
3    10000       NaN
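As a quick check on the answer above, here is a minimal sketch (using the same sample data, which is illustrative rather than the asker's real data) showing that the masked column stays numeric, and that `Series.where` with the inverted condition should give the same result. The variable names `masked` and `kept` are made up for this example.

```python
import pandas as pd

# Illustrative sample data, mirroring the example above.
df = pd.DataFrame({'PREMIUM': [0, 1, 100, 10000]})

# mask() replaces values with NaN wherever the condition is True.
masked = df['PREMIUM'].mask(df['PREMIUM'] > 1000)

# where() keeps values wherever the condition is True, so the test is inverted.
kept = df['PREMIUM'].where(df['PREMIUM'] <= 1000)

print(masked.dtype)         # float64: the column is still numeric
print(masked.equals(kept))  # True: both approaches agree
```

The integer column is upcast to float64 because NaN is a floating-point value; the result is still numeric, not object.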
CodePudding user response:
I'm not sure I fully understand your question. If you want to replace the values in the df['PREMIUM'] column that are greater than 1000 with NaN, in place:
df['PREMIUM'] = pd.to_numeric(df['PREMIUM'])
df['PREMIUM'] = np.where(df['PREMIUM'] > 1000, np.nan, df['PREMIUM'])
If you want to create a separate column that keeps values up to 1000 as they are and sets values greater than 1000 to NaN, you can use:
df['PREMIUM'] = pd.to_numeric(df['PREMIUM'])
df['PREMIUM_V2'] = np.where(df['PREMIUM'] > 1000, np.nan, df['PREMIUM'])
Note: numpy.where(condition, x, y) returns x where the condition is True and y where it is False, so the NaN must come second and the original column third.
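To make the argument order concrete, here is a small self-contained sketch. The string-valued sample data is an assumption (the asker's real data is not shown), and `errors='coerce'` is an optional extra that turns unparseable entries into NaN rather than raising.

```python
import numpy as np
import pandas as pd

# Hypothetical input: premiums stored as strings.
df = pd.DataFrame({'PREMIUM': ['0', '1', '100', '10000']})

# Convert to a numeric dtype; errors='coerce' maps bad entries to NaN.
df['PREMIUM'] = pd.to_numeric(df['PREMIUM'], errors='coerce')

# np.where(condition, x, y): NaN where PREMIUM > 1000, original value otherwise.
df['PREMIUM_V2'] = np.where(df['PREMIUM'] > 1000, np.nan, df['PREMIUM'])

print(df['PREMIUM_V2'].dtype)  # float64
```

If the resulting dtype is still object, it may be worth checking `df['PREMIUM'].dtype` right after the `pd.to_numeric` call to confirm the conversion actually took effect.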