Is there a Pythonic way to extend one dictionary with another (preserving all values)

Time:11-27

Given the dictionaries:

d1={'a':'a','b':'b','c':'c'}
d2={'b':'a','c':['d','f','g'],'e':'e'}

Can these two dictionaries be combined so that all common keys are merged and all values are preserved? I.e., so that it gives the output:

> print(d1.extend(d2))

{'a':'a','b':['b','a'],'c':['c','d','f','g'],'e':'e'}

I came up with the following, which seems to work, but is very un-pythonic.

def extend(d1, d2):
    return_dict={}
    for key, value in d1.items():
        if key in d2:
            value_d2=d2[key]
            if value == value_d2:
                continue
            if type(value) == list and type(value_d2) == list:
                value.extend(value_d2)
                return_dict[key]=value
            elif type(value) == list and type(value_d2) != list:
                tmp=[value_d2]
                tmp.extend(value)
                return_dict[key]=tmp
            elif type(value) != list and type(value_d2) == list:
                tmp=[value]
                tmp.extend(value_d2)
                return_dict[key]=tmp
            elif type(value) != list and type(value_d2) != list:
                return_dict[key]=[value] + [value_d2]
        else:
            return_dict[key]=value
    for key, value in d2.items():
        if key not in return_dict:
            return_dict[key]=value
    return return_dict

(the last elif should be an else, but I thought it was more readable this way)

Edit:

Instead of preserving all values, is it possible to preserve all keys, but remove duplicate values? I.e.

d1={'a':'a','b':'b','c':'c'}
d2={'b':'b','c':['d','f','g'],'e':'e'}

> print(d1.extend(d2))

{'a':'a','b':'b','c':['c','d','f','g'],'e':'e'}

CodePudding user response:

Use a collections.defaultdict as temporary storage, as shown below:

from collections import defaultdict

d1 = {'a': 'a', 'b': 'b', 'c': 'c'}
d2 = {'b': 'a', 'c': ['d', 'f', 'g'], 'e': 'e'}

tmp = defaultdict(list)

for d in [d1, d2]:
    for k, v in d.items():
        if isinstance(v, list):
            tmp[k].extend(v)
        else:
            tmp[k].append(v)


res = { k : v if len(v) > 1 else v[0] for k, v in tmp.items()}
print(res)

Output

{'a': 'a', 'b': ['b', 'a'], 'c': ['c', 'd', 'f', 'g'], 'e': 'e'}

An alternative, also using defaultdict, is to do:

tmp1 = defaultdict(list)
tmp2 = defaultdict(list)

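# Wrap non-list values first: update() with the raw dicts would make the
# unpacking below split multi-character strings into single characters.
tmp1.update({k: v if isinstance(v, list) else [v] for k, v in d1.items()})
tmp2.update({k: v if isinstance(v, list) else [v] for k, v in d2.items()})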

tmp = {key: [*tmp1[key], *tmp2[key]] for key in tmp1.keys() | tmp2.keys()}
res = {k: v if len(v) > 1 else v[0] for k, v in tmp.items()}
print(res)

Both approaches work on Python 3.7 and later.
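The loop-based approach extends naturally to any number of dictionaries. The following is a minimal sketch under that assumption; the function name merge_preserving and the third dictionary d3 are hypothetical and only serve as illustration.

```python
from collections import defaultdict


def merge_preserving(*dicts):
    """Merge dicts, collecting every value seen for a key (hypothetical helper)."""
    collected = defaultdict(list)
    for d in dicts:
        for key, value in d.items():
            if isinstance(value, list):
                collected[key].extend(value)  # flatten list values
            else:
                collected[key].append(value)  # keep non-list values whole
    # Unwrap keys that ended up with exactly one value; empty lists stay lists.
    return {k: v if len(v) != 1 else v[0] for k, v in collected.items()}


d1 = {'a': 'a', 'b': 'b', 'c': 'c'}
d2 = {'b': 'a', 'c': ['d', 'f', 'g'], 'e': 'e'}
d3 = {'a': 'xyz'}  # hypothetical extra input

print(merge_preserving(d1, d2, d3))
# {'a': ['a', 'xyz'], 'b': ['b', 'a'], 'c': ['c', 'd', 'f', 'g'], 'e': 'e'}
```

Note that a key whose only value is a one-element list is unwrapped too, so the result cannot distinguish 'x' from ['x']; whether that matters depends on the data.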

UPDATE

As mentioned by @ShadowRanger, you could use a set instead of a list to drop duplicate values:

tmp = defaultdict(set)

for d in [d1, d2]:
    for k, v in d.items():
        if isinstance(v, list):
            tmp[k].update(v)
        else:
            tmp[k].add(v)

# Sets are unordered, so sort the values for a stable result.
res = {k: sorted(v) if len(v) > 1 else next(iter(v)) for k, v in tmp.items()}
print(res)
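Sets do not remember insertion order, so if the first-seen order of the values matters, one option is to use a plain dict as an ordered set (dicts keep insertion order on Python 3.7 and later). This is only a sketch: the name merge_unique is hypothetical, and it assumes all values are hashable.

```python
def merge_unique(*dicts):
    """Merge dicts, dropping duplicate values per key in first-seen order.

    Hypothetical helper; assumes every value is hashable.
    """
    collected = {}
    for d in dicts:
        for key, value in d.items():
            values = value if isinstance(value, list) else [value]
            # A dict used as an ordered set: keys are unique and keep insertion order.
            collected.setdefault(key, {}).update(dict.fromkeys(values))
    # Unwrap keys that ended up with exactly one distinct value.
    return {k: list(v) if len(v) != 1 else next(iter(v)) for k, v in collected.items()}


d1 = {'a': 'a', 'b': 'b', 'c': 'c'}
d2 = {'b': 'b', 'c': ['d', 'f', 'g'], 'e': 'e'}

print(merge_unique(d1, d2))
# {'a': 'a', 'b': 'b', 'c': ['c', 'd', 'f', 'g'], 'e': 'e'}
```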

CodePudding user response:

You can use a helper function safe_combine together with the dict union operator |, available in Python 3.9 and later:

from __future__ import annotations


d1 = {'a': 'a', 'b': 'b', 'c': 'c'}
d2 = {'b': 'a', 'c': ['d', 'f', 'g'], 'e': 'e'}


def safe_combine(o1: str | list, o2: str | list):
    return (o1 if isinstance(o1, list) else [o1]) \
           + (o2 if isinstance(o2, list) else [o2])


merged = {k: safe_combine(d1[k], d2[k]) if k in d1 and k in d2 else v
          for k, v in (d1 | d2).items()}

print(merged)

Out:

{'a': 'a', 'b': ['b', 'a'], 'c': ['c', 'd', 'f', 'g'], 'e': 'e'}

NB: For Python versions earlier than 3.9, you can use the {**d1, **d2} syntax instead of (d1 | d2).
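For interpreters that lack the | operator on dicts, the same comprehension can be written with dictionary unpacking. A minimal sketch follows; the helper name as_list is hypothetical and is not part of the answer above.

```python
d1 = {'a': 'a', 'b': 'b', 'c': 'c'}
d2 = {'b': 'a', 'c': ['d', 'f', 'g'], 'e': 'e'}


def as_list(value):
    """Wrap a non-list value in a list (hypothetical helper)."""
    return value if isinstance(value, list) else [value]


# {**d1, **d2} yields the same keys as d1 | d2 without needing the | operator.
merged = {k: as_list(d1[k]) + as_list(d2[k]) if k in d1 and k in d2 else v
          for k, v in {**d1, **d2}.items()}

print(merged)
# {'a': 'a', 'b': ['b', 'a'], 'c': ['c', 'd', 'f', 'g'], 'e': 'e'}
```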
