I am using a for loop in Python, and each iteration creates a dictionary. The loop produces the following dictionaries:

{'name': 'xxxx'}
{'name': 'yyyy','age':'28'}
{'name': 'zzzz','age':'27','sex':'F'}

I need to compare all the dictionaries, find the keys that are missing from each one, add those keys with an empty string as the value, and order every dictionary by key.

Expected output:

{'age':'','name': 'xxxx','sex':''}
{'age':'28','name': 'yyyy','sex':''}
{'age':'27','name': 'zzzz','sex':'F'}

How can I achieve this in Python?

CodePudding user response：

If you want to modify the dicts in-place, dict.setdefault would be easy enough.

my_dicts = [
  {'name': 'xxxx'},
  {'name': 'yyyy','age':'28'},
  {'name': 'zzzz','age':'27','sex':'F'},
]
desired_keys = ['name', 'age', 'sex']
for d in my_dicts:
    for key in desired_keys:
        d.setdefault(key, "")

print(my_dicts)

prints out

[
  {'name': 'xxxx', 'age': '', 'sex': ''}, 
  {'name': 'yyyy', 'age': '28', 'sex': ''}, 
  {'name': 'zzzz', 'age': '27', 'sex': 'F'},
]

If you don't want to hard-code the desired_keys list, you can make it a set and gather it from the dicts before the loop above.

desired_keys = set()
for d in my_dicts:
    desired_keys.update(set(d))  # update with keys from `d`

Another option, if you want new dicts instead of modifying them in place, is

desired_keys = ...  # whichever method you like
empty_dict = dict.fromkeys(desired_keys, "")
new_dicts = [{**empty_dict, **d} for d in my_dicts]
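
The question also asks for the keys to be ordered. A minimal sketch of one way to get that with the same `{**empty_dict, **d}` idea, assuming Python 3.7+ (where dicts keep insertion order): build `empty_dict` from the sorted keys, so every merged dict inherits that order.

```python
my_dicts = [
    {'name': 'xxxx'},
    {'name': 'yyyy', 'age': '28'},
    {'name': 'zzzz', 'age': '27', 'sex': 'F'},
]

# Gather every key that appears in any of the dicts.
desired_keys = set()
for d in my_dicts:
    desired_keys.update(d)

# The template carries the sorted key order; merging `d` on top only
# overwrites values, so the positions of the keys stay as in the template.
empty_dict = dict.fromkeys(sorted(desired_keys), "")
new_dicts = [{**empty_dict, **d} for d in my_dicts]

for d in new_dicts:
    print(d)
```

This only holds because every key of `d` is already in `empty_dict`; a key missing from the template would be appended at the end instead.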

EDIT based on comments:

From the comments: "This doesn't remove keys that are not in desired_keys."

This will leave only the desired keys:

desired_keys = ...  # Must be a set
for d in my_dicts:
    for key in desired_keys:
        d.setdefault(key, "")
    for key in set(d) - desired_keys:
        d.pop(key)

However, at that point it might be easier to just create new dicts:

new_dicts = [
    {key: d.get(key, "") for key in desired_keys}
    for d in my_dicts
]
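
To make the difference between the two "new dicts" approaches concrete, here is a small sketch. The extra key `nickname` is made up for illustration and is not part of the question's data.

```python
desired_keys = {'name', 'age', 'sex'}
my_dicts = [
    {'name': 'xxxx', 'nickname': 'x'},  # 'nickname' is a hypothetical extra key
    {'name': 'yyyy', 'age': '28'},
]

# Merging onto a template fills in missing keys but keeps any extras.
empty_dict = dict.fromkeys(desired_keys, "")
filled = [{**empty_dict, **d} for d in my_dicts]

# Iterating over desired_keys fills in missing keys and drops the extras.
strict = [{key: d.get(key, "") for key in desired_keys} for d in my_dicts]

print('nickname' in filled[0])  # True
print('nickname' in strict[0])  # False
```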

CodePudding user response：

data = [{'name': 'xxxx'},
{'name': 'yyyy','age':'28'},
{'name': 'zzzz','age':'27','sex':'F'}]

First take the largest dictionary to get all the keys (this assumes one dictionary contains every key, as the last one does here). Then use dict.get with an empty string as the default for each key, and sort each dictionary by key. You can combine a list comprehension and a dict comprehension:

allKD = max(data, key=len)
[dict(sorted({k:d.get(k, '') for k in allKD}.items(), key=lambda x:x[0])) for d in data]

OUTPUT:

[{'age': '', 'name': 'xxxx', 'sex': ''},
 {'age': '28', 'name': 'yyyy', 'sex': ''},
 {'age': '27', 'name': 'zzzz', 'sex': 'F'}]
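
When no single dictionary contains every key, `max(data, key=len)` misses some of them. A sketch of a variant that takes the union of all keys instead; the sample data below is made up to show the difference and is not from the question.

```python
data = [{'name': 'xxxx', 'sex': 'M'},
        {'name': 'yyyy', 'age': '28'}]

# Both dicts have two keys, so the "largest" one lacks 'age'.
largest = max(data, key=len)

# set().union(*data) collects the keys of every dict (iterating a dict
# yields its keys); sorting gives the alphabetical key order.
all_keys = sorted(set().union(*data))

result = [{k: d.get(k, '') for k in all_keys} for d in data]

print(all_keys)
print(result)
```

Because the comprehension inserts keys in the order of `all_keys`, no separate sort of each dictionary is needed (assuming Python 3.7+, where dicts keep insertion order).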

CodePudding user response：

One approach:

from operator import or_
from functools import reduce

lst = [{'name': 'xxxx'},
       {'name': 'yyyy', 'age': '28'},
       {'name': 'zzzz', 'age': '27', 'sex': 'F'}]

# find all the keys
keys = reduce(or_, map(dict.keys, lst))

# update each dictionary with the complement of the keys
for d in lst:
    d.update(dict.fromkeys(keys - d.keys(), ""))

print(lst)

Output

[{'name': 'xxxx', 'age': '', 'sex': ''}, {'name': 'yyyy', 'age': '28', 'sex': ''}, {'name': 'zzzz', 'age': '27', 'sex': 'F'}]
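
The output above has all the keys but not in sorted order. If the order matters and the dicts should still be modified in place, one possible follow-up step (a sketch, assuming Python 3.7+ where dicts keep insertion order) is to clear each dict and re-insert its items sorted by key:

```python
lst = [{'name': 'xxxx', 'age': '', 'sex': ''},
       {'name': 'yyyy', 'age': '28', 'sex': ''},
       {'name': 'zzzz', 'age': '27', 'sex': 'F'}]

for d in lst:
    # Keys are unique, so sorting the (key, value) pairs orders them by key.
    ordered = sorted(d.items())
    d.clear()          # same dict object, now empty
    d.update(ordered)  # re-insert in sorted key order

for d in lst:
    print(d)
```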