I searched around and tried several methods before making this post. I have a dataframe where I want to:
- Replace NaN values in TGT_COLUMN_SCALE with 0 if TGT_COLUMN_DATA_TYPE equals NUMERIC.
Kindly help me out with this issue.
I tried this code but it's not working:
df["TGT_COLUMN_SCALE"] = np.where(df["TGT_COLUMN_DATA_TYPE"] == "NUMERIC", 'NaN', 0)
CodePudding user response:
Sample:
import numpy as np
import pandas as pd
df = pd.DataFrame({
    "TGT_COLUMN_DATA_TYPE": ["DATE", "NUMERIC", "STRING", "NUMERIC"],
    "TGT_COLUMN_SCALE": [np.nan, np.nan, 4.0, 5.0]  # np.NaN was removed in NumPy 2.0
})
Replace
df.loc[(df.TGT_COLUMN_DATA_TYPE == "NUMERIC") & (df.TGT_COLUMN_SCALE.isnull()), "TGT_COLUMN_SCALE"] = 0
Result:
  TGT_COLUMN_DATA_TYPE  TGT_COLUMN_SCALE
0                 DATE               NaN
1              NUMERIC               0.0
2               STRING               4.0
3              NUMERIC               5.0
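For comparison, the same conditional fill can be sketched with Series.mask, which replaces values wherever a boolean condition is true. This is an alternative to the loc assignment above, not part of the original answer, and the variable name needs_zero is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "TGT_COLUMN_DATA_TYPE": ["DATE", "NUMERIC", "STRING", "NUMERIC"],
    "TGT_COLUMN_SCALE": [np.nan, np.nan, 4.0, 5.0],
})

# Rows that are NUMERIC and still have a missing scale (illustrative name).
needs_zero = (df["TGT_COLUMN_DATA_TYPE"] == "NUMERIC") & df["TGT_COLUMN_SCALE"].isna()

# mask() replaces values where the condition is True and keeps the rest.
df["TGT_COLUMN_SCALE"] = df["TGT_COLUMN_SCALE"].mask(needs_zero, 0)

print(df)
```

Unlike the loc version, this builds a new Series and assigns it back, which can be handy when chaining several transformations.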
CodePudding user response:
You just need to use loc to select the rows where the data type is NUMERIC, then use fillna to replace the missing values in the scale column:
df.loc[df.TGT_COLUMN_DATA_TYPE == "NUMERIC",
       "TGT_COLUMN_SCALE"] = df.loc[df.TGT_COLUMN_DATA_TYPE == "NUMERIC", "TGT_COLUMN_SCALE"].fillna(0)
Full code:
TGT_COLUMN_DATA_TYPE = ('DATE', 'TIMESTAMP', 'NUMERIC', 'NUMERIC')
TGT_COLUMN_SCALE = (np.nan, np.nan, np.nan, np.nan)
df = pd.DataFrame(list(zip(TGT_COLUMN_DATA_TYPE, TGT_COLUMN_SCALE)),
                  columns=['TGT_COLUMN_DATA_TYPE', 'TGT_COLUMN_SCALE'])
df.loc[df.TGT_COLUMN_DATA_TYPE == "NUMERIC",
       "TGT_COLUMN_SCALE"] = df.loc[df.TGT_COLUMN_DATA_TYPE == "NUMERIC", "TGT_COLUMN_SCALE"].fillna(0)
CodePudding user response:
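np.where takes the first option as the value where the condition is true, and the second otherwise. You need to swap the order of NaN and 0, check for missing values in the condition, and pass the original column as the fallback so existing values are kept: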
df["TGT_COLUMN_SCALE"] = np.where((df["TGT_COLUMN_DATA_TYPE"] == "NUMERIC") & (df["TGT_COLUMN_SCALE"].isnull()), 0, df["TGT_COLUMN_SCALE"])
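As a rough check of the argument order, the sketch below first runs np.where on a tiny array and then applies the corrected call to a small frame. The sample data is borrowed from the first answer, and the names picked and cond are illustrative.

```python
import numpy as np
import pandas as pd

# np.where(condition, value_if_true, value_if_false)
picked = np.where(np.array([True, False]), 1, 2)
print(picked)  # [1 2]

df = pd.DataFrame({
    "TGT_COLUMN_DATA_TYPE": ["DATE", "NUMERIC", "STRING", "NUMERIC"],
    "TGT_COLUMN_SCALE": [np.nan, np.nan, 4.0, 5.0],
})

# True only for NUMERIC rows whose scale is missing.
cond = (df["TGT_COLUMN_DATA_TYPE"] == "NUMERIC") & df["TGT_COLUMN_SCALE"].isnull()

# 0 where cond holds, otherwise the value already in the column.
df["TGT_COLUMN_SCALE"] = np.where(cond, 0, df["TGT_COLUMN_SCALE"])
print(df["TGT_COLUMN_SCALE"].tolist())  # [nan, 0.0, 4.0, 5.0]
```

Note that the attempt in the question also passes the string 'NaN' rather than a missing value and has no branch that keeps existing values, so every row gets overwritten.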