Home > database >  Bug in recursive dict-merging function
Bug in recursive dict-merging function

Time:07-02

I want to merge inner dictionaries under a specific key (let's say 'x') into the outer dictionary. Lists, dicts and tuples should be traversed recursively.

For example:

>>> key_upmerge({'x': {1: 2, 3: 4}}, 'x')
{1: 2, 3: 4}
>>> key_upmerge({'x': {1: 2, 'x': {3: 4}}}, 'x')
{1: 2, 3: 4}
>>> key_upmerge({'x': {1: 2}, 2: 3}, 'x')
{1: 2, 2: 3}
>>> key_upmerge({'x': {1: 2, 'x': {1: 3}}}, 'x')
[...]
ValueError (clashing keys)
>>> key_upmerge([{'x': {1: 2}}, {'x': {3: 4}}], 'x')
[{1: 2}, {3: 4}]

My function already works for all the test cases above.

def key_upmerge(d, key):
    if isinstance(d, (list, tuple)):
        result = type(d)(key_upmerge(x, key) for x in d)
    elif isinstance(d, dict):
        result = {}

        for k, v in d.items():
            # NOTE: only need to check type of v. k cannot contain dicts
            if isinstance(v, (dict, list, tuple)):
                v = key_upmerge(v, key)
            if k == key and isinstance(v, dict):
                if both := (result.keys() & v.keys()):
                    msg = f'Cannot merge dict into upper dict: clashing keys {both}'
                    raise ValueError(msg)
                result.update(v)
            else:
                result[k] = v
    else:
        raise TypeError('expected dict, list or tuple')

    return result

The problem is that I also expected a ValueError for the following test cases, but the function returns a result.

>>> key_upmerge({'x': {1: 2}, 1: 3}, 'x')
{1: 3} # expected ValueError due to clashing keys
>>> key_upmerge({'x': {1: 2}, 1: {'x': {3: 4}}}, 'x')
{1: {3: 4}} # expected ValueError due to clashing keys

CodePudding user response:

You build the result dict in order. You only check for key clashes when merging in a dict, not when assigning individual values to a key. Since your test cases always have the merged dict appear first ('x': {1, 2} precedes 1: 3), the clashing key doesn't exist in the result yet, and no clash is detected. Your test works only when the clashing individual key precedes the dict to be merged (key_upmerge({1: 3, 'x': {1: 2}}, 'x') dies as you expect).

To catch cases where a clashing individual key-value pair is inserted after a merge, replace:

        else:
            result[k] = v

with:

        elif k in result:
            raise ValueError(f'clashing key {k!r}')
        else:
            result[k] = v

verifying that k is not in result when you insert individual key-value pairs, not just when you perform bulk merges.

  • Related