I have this code that allows me to find all blanks in certain columns of a dataframe. Here is what I have:
req_cols = ['First Name*','Last Name*','Country*','Company*','Email Address*']
bad_nan = df[df[req_cols].isna().any(axis=1)]
I am trying to add the missing cells (NaN values) to an existing list of dictionaries called "errors":
if not bad_nan.empty:
    errors.append({
        "row": [0],
        "column": [1],
        "message": "This is a required field"
    })
This is what the dictionary looks like:
{'row': [0], 'column': [1], 'message': 'This is a required field'}
but I would like it to look like this:
{'row': 2, 'column': 'First Name*', 'message': 'This is a required field'}
I would like this to cover every cell that has a NaN value, not just one.
CodePudding user response:
It seems to me that you forgot something in your if statement.
if not bad_nan.empty:
    errors.append({
        "row": bad_nan[0],
        "column": bad_nan[1],
        "message": "This is a required field"
    })
Could that be the problem?
CodePudding user response:
I would loop through every cell of the filtered dataframe bad_nan
and append the row index and column name to the errors
list. Check how missing values are represented in bad_nan (np.nan, None, or an empty string), since that determines the test to use inside the loop.
import pandas as pd

errors = []
req_cols = ['First Name*','Last Name*','Country*','Company*','Email Address*']
#req_cols = df.columns
# keep only the rows that have at least one missing required value
bad_nan = df.loc[df[req_cols].isna().any(axis=1)]
for col in req_cols:
    # iterate over the index labels: filtering keeps the original row labels,
    # so range(bad_nan.shape[0]) would not match them (assumes a unique index)
    for i in bad_nan.index:
        if pd.isna(bad_nan.loc[i, col]):
            errors.append({ "row": i,
                            "column": col,
                            "message": "This is a required field" })
print(errors)
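If the dataframe is large, the nested loop can probably be avoided. Below is a minimal sketch of one possible alternative, not the only way to do it: it stacks the boolean isna() mask so that every cell is addressed by a (row label, column name) pair, then keeps the pairs where the mask is True. The sample df is made up for illustration, since the original data is not shown, and "row" here is the dataframe's index label, which I am assuming is what you want to report.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real df from the question.
df = pd.DataFrame({
    "First Name*": ["Ann", np.nan, "Cy"],
    "Last Name*": ["Lee", "Ray", np.nan],
    "Country*": ["US", "UK", "FR"],
    "Company*": ["A", "B", "C"],
    "Email Address*": ["a@x.com", np.nan, "c@x.com"],
})
req_cols = ['First Name*', 'Last Name*', 'Country*', 'Company*', 'Email Address*']

# Boolean mask of missing cells, reshaped into a Series indexed by (row, column).
missing = df[req_cols].isna().stack()

# Keep only the True entries and build one dictionary per missing cell.
errors = [
    {"row": row, "column": col, "message": "This is a required field"}
    for row, col in missing[missing].index
]
print(errors)
```

With the sample data this gives one entry per missing cell, in row order: two for row 1 and one for row 2. If the list already exists, errors.extend(...) with the same comprehension would add to it instead of replacing it.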