I am working on this project where I need to fill a NaN
cell from the reason
column, but I need to cross-check with the status
column first and fill in the appropriate values.
The condition is:
- If
STATUS
is "Invoiced" andREASON
isNaN
, then fill with "Not Applicable" - If
STATUS
is "Lost Sale" andREASON
isNaN
, then fill with "Unknown"
I have tried this code:
data2 = np.where((data["SO_STATUS"] == "Invoiced")&(data["SO_LOSREASON"]=='NaN'),'null',"Not Applicable")
data2 = pd.DataFrame(data2, columns = ['SO_STATUS'])
data= data.join(data2)
However, it says:
ValueError: columns overlap but no suffix specified: Index(['SO_STATUS'], dtype='object')
I am not sure if what I did is correct or not since I still need the other data from reason
column that are not NaN
.
Thank you so much for helping!
CodePudding user response:
Try using direct column assignment:
data['SO_STATUS'] = np.where((data["SO_STATUS"] == "Invoiced") & data["SO_LOSREASON"].isna(),, "Not Applicable", "Unknown")
CodePudding user response:
You can use map
and where
, thus it is easy to add any other condition:
data['new_column'] = (data['SO_STATUS'].map({'Invoiced': 'Not Applicable',
'Lost Sale': 'Unknown',
# add other here: 'Other': 'other',
})
.where(data['SO_LOSREASON'].isna())
)
example (setting a new column new_column
for clarity):
SO_STATUS SO_LOSREASON new_column
0 Invoiced NaN Not Applicable
1 Lost Sale NaN Unknown
2 Other NaN NaN
3 Invoiced abc NaN