I have a doubly nested dictionary that I would like to add more information to, but when I try, it overwrites everything that is already there.
import pandas as pd

data = {'First Name': ['Sally', 'Bob', 'Sue', 'Tom', 'Will'],
        'Last Name': ['William', '', 'Wright', 'Smith', 'Thomas'],
        'Email Address': ['[email protected]', '[email protected]', '[email protected]', '[email protected]', ''],
        'Industry': ['Automotive', 'Gas', 'Healthcare', 'Other', 'Biotech / Pharma'],
        'SME Vertical': ['Education', 'hotels', '', 'project management and design', ''],
        'System Type': ['Access', 'Access', 'video Systems', 'Access', 'Access'],
        'Account Type': ['Commercial', '', 'Reseller', '', 'Small']}
df = pd.DataFrame(data)
df1 = df[['Industry', 'System Type', 'Account Type', 'SME Vertical']]
req_cols = ['First Name', 'Last Name', 'Email Address']
errors = {}
errors[filename] = {}
Here is the code for my first loop, which reads an external JSON file and writes all the errors to the dictionary 'errors'. This part works great.
mask = df1.apply(lambda c: c.isin(valid[c.name]))
df1.mask(mask | df1.eq(' ')).stack()
# .items() replaces .iteritems(), which was removed in pandas 2.0
for err_i, (r, v) in enumerate(df1.mask(mask | df1.eq(' ')).stack().items()):
    errors[filename][err_i] = {"row": r[0],
                               "column": r[1],
                               "message": v + " is invalid"}
Its output looks something like this:
key Type Size Value
Data Template dict 6 {'row': 1, 'column': 'Industry', 'message': 'gas is invalid'}
{'row': 1, 'column': 'SME Vertical', 'message': 'hotels is invalid'}
{'row': 2, 'column': 'Industry', 'message': 'healthcare is invalid'}
{'row': 3, 'column': 'Industry', 'message': 'other is invalid'}
{'row': 3, 'column': 'SME Vertical', 'message': 'project management and design is invalid'}
{'row': 4, 'column': 'Account Type', 'message': 'small is invalid'}
I would like to add this piece of code to write more errors into the nested dictionary above. The code finds the NaN and blank values in req_cols:
bad_nan = df.loc[df[req_cols].isna().any(axis=1)]
bad_nan = bad_nan.fillna(value='NaN')
for col in bad_nan.columns:
    for i in bad_nan.index:
        if bad_nan.loc[i, col] == 'NaN':
            errors[filename] = {"row": i,
                                "column": col,
                                "message": "This is a required field"}
Its output overwrites all the existing data in the nested dictionary. How do I add more entries to the nested dictionary instead? I would like to collect all the invalid-format and required-field errors in the same dictionary, so that it looks something like this:
key Type Size Value
Data Template dict 6 {'row': 1, 'column': 'Industry', 'message': 'gas is invalid'}
{'row': 1, 'column': 'SME Vertical', 'message': 'hotels is invalid'}
{'row': 2, 'column': 'Industry', 'message': 'healthcare is invalid'}
{'row': 3, 'column': 'Industry', 'message': 'other is invalid'}
{'row': 3, 'column': 'SME Vertical', 'message': 'project management and design is invalid'}
{'row': 4, 'column': 'Account Type', 'message': 'small is invalid'}
{'row': 2, 'column' : 'Last Name', 'message': 'this is a required field'}
{'row': 5, 'column' : 'Email Address', 'message': 'this is a required field'}
CodePudding user response:
Your printouts only showed the values in your errors dictionary, I think. When I print using
print('\n'.join(map(str, errors[filename].items())))
I see that the dictionary is keyed by error number. For example:
(0, {'row': 0, 'column': 'SME Vertical', 'message': 'Education is invalid'})
(1, {'row': 1, 'column': 'Industry', 'message': 'Gas is invalid'})
(2, {'row': 1, 'column': 'Account Type', 'message': ' is invalid'})
...
This makes sense to me based on the code in your first loop:
errors[filename][err_i] = {"row": r[0],
                           "column": r[1],
                           "message": v + " is invalid"}
Note the use of the key, err_i. I submit you want your second loop to use an error index as well, to assign values to errors[filename][err_i] rather than errors[filename]. Perhaps something like this:
if bad_nan.loc[i, col] == 'NaN':
    # err_i still holds the last index used by the first loop, so advance it first
    err_i += 1
    errors[filename][err_i] = {"row": i,
                               "column": col,
                               "message": "This is a required field"}
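To make the pattern concrete, here is a minimal sketch of both passes writing into the same inner dictionary. It uses plain Python lists instead of the DataFrame so the dictionary behaviour is easy to see. The `invalid` and `missing` lists and the `filename` value are made up for illustration and only loosely follow the question's data. It also starts the second pass from `len(errors[filename])` rather than reusing the first loop's variable, which is one possible way to avoid both reusing a key and depending on the first loop having run at all.

```python
# Illustrative stand-ins for the question's data; none of these come from pandas.
filename = "Data Template"  # assumed key, based on the question's printout
errors = {filename: {}}

# First pass: invalid-value errors, keyed by a running index.
invalid = [(1, "Industry", "Gas"), (1, "SME Vertical", "hotels")]
for err_i, (row, column, value) in enumerate(invalid):
    errors[filename][err_i] = {"row": row,
                               "column": column,
                               "message": value + " is invalid"}

# Second pass: continue numbering from the current size of the inner dict,
# so every new error gets an unused key and nothing is overwritten.
missing = [(1, "Last Name"), (4, "Email Address")]
err_i = len(errors[filename])
for row, column in missing:
    errors[filename][err_i] = {"row": row,
                               "column": column,
                               "message": "This is a required field"}
    err_i += 1

print("\n".join(map(str, errors[filename].items())))
```

If the numeric keys are not needed for anything, a list of error dictionaries with `append` might be simpler still, but that would change the shape of `errors` from what the question shows.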