Merging list of dictionaries

Time:04-12

I have the following list of dictionaries:

"entities": [
        {
            "length": 6,
            "offset": 0,
            "type": "bold"
        },
        {
            "length": 6,
            "offset": 0,
            "type": "italic"
        },
        {
            "length": 4,
            "offset": 7,
            "type": "italic"
        }
    ],

I would like to know how to use this input to derive the following list of dictionaries:

"entities": [
            {
                "length": 6,
                "offset": 0,
                "type": "bold_italic"
            },
            {
                "length": 4,
                "offset": 7,
                "type": "italic"
            }
        ],

CodePudding user response:

Group the entries by their (length, offset) pair in a dictionary, collecting the types seen for each pair in a list. Then read the result back into a list, creating one new dictionary per unique length/offset pair and joining its types with underscores:

from collections import defaultdict

data = [
        {
            "length": 6,
            "offset": 0,
            "type": "bold"
        },
        {
            "length": 6,
            "offset": 0,
            "type": "italic"
        },
        {
            "length": 4,
            "offset": 7,
            "type": "italic"
        }
]

entry_types = defaultdict(list)
for item in data:
    key = item['length'], item['offset']
    entry_types[key].append(item['type'])

result = []
for (length, offset), types in entry_types.items():
    result.append(dict(length=length, offset=offset, type='_'.join(types)))

print(result)

This outputs:

[{'length': 6, 'offset': 0, 'type': 'bold_italic'}, {'length': 4, 'offset': 7, 'type': 'italic'}]
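Since the question shows "entities" as a key inside a larger dictionary, the same grouping can be wrapped in a function and applied to that key. This is only a sketch: the function name merge_entities and the parent dictionary name message are assumptions for illustration, not names from the question.

```python
from collections import defaultdict


def merge_entities(entities):
    # merge_entities is a hypothetical helper name, not from the question.
    entry_types = defaultdict(list)
    for item in entities:
        # Entries with the same (length, offset) land in the same bucket.
        key = item["length"], item["offset"]
        entry_types[key].append(item["type"])
    # Rebuild one dictionary per unique pair, joining its types with "_".
    return [
        dict(length=length, offset=offset, type="_".join(types))
        for (length, offset), types in entry_types.items()
    ]


# `message` is an assumed stand-in for the parent dictionary in the question.
message = {
    "entities": [
        {"length": 6, "offset": 0, "type": "bold"},
        {"length": 6, "offset": 0, "type": "italic"},
        {"length": 4, "offset": 7, "type": "italic"},
    ]
}
message["entities"] = merge_entities(message["entities"])
print(message)
```

Returning a new list rather than mutating the input leaves the original data untouched until the result is assigned back.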

CodePudding user response:

I'm interpreting your question as, "How do I combine the types of entities that share the same length and offset, joining the distinct types with underscores?"

Your question shows entities as a key in a parent dictionary, but for simplicity I'm treating it as a stand-alone variable. Given that, the following will do:

In [1]: entities = [
   ...:         {
   ...:             "length": 6,
   ...:             "offset": 0,
   ...:             "type": "bold"
   ...:         },
   ...:         {
   ...:             "length": 6,
   ...:             "offset": 0,
   ...:             "type": "italic"
   ...:         },
   ...:         {
   ...:             "length": 4,
   ...:             "offset": 7,
   ...:             "type": "italic"
   ...:         }
   ...:     ]

In [2]: from collections import defaultdict
In [3]: merged = defaultdict(str)

A defaultdict behaves like a dict, except that looking up a missing key creates it with a default value instead of raising a KeyError. Here the default factory is str, so the default value is the empty string "".
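To see that behaviour in isolation, here is a small sketch. The key (6, 0) is just an example (length, offset) tuple chosen for illustration.

```python
from collections import defaultdict

merged = defaultdict(str)
key = (6, 0)  # an example (length, offset) tuple

present_before = key in merged  # membership tests do not create the key
value = merged[key]             # the lookup calls str() and stores the result
present_after = key in merged   # the key now exists with the default value

print(present_before, repr(value), present_after)
```

Note that the lookup itself inserts the key, which is why the loop in this answer can test and append to merged[key] without initialising it first.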

We combine entities by creating a tuple of (length, offset) and using it as a key into the temporary structure merged:

In [4]: for e in entities:
   ...:     key = (e["length"], e["offset"])
   ...:     if e["type"] not in merged[key]:
   ...:         if merged[key]: # if the value at merged[key] is not the empty string
   ...:             merged[key] += "_"
   ...:         merged[key] += e["type"]
   ...:

In [5]: merged
Out[5]: defaultdict(str, {(6, 0): 'bold_italic', (4, 7): 'italic'})

Now, clear entities and reconstruct it from merged:

In [6]: entities.clear()
In [7]: for (length, offset), merged_type in merged.items():
   ...:     entities.append({"length": length, "offset": offset, "type": merged_type})
   ...:

In [8]: entities
Out[8]:
[{'length': 6, 'offset': 0, 'type': 'bold_italic'},
 {'length': 4, 'offset': 7, 'type': 'italic'}]

Incidentally, the above is a transcript from the IPython REPL.
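One caveat with accumulating into a string: the `in` test on a string is a substring check, so a type whose name is contained in an already merged type would be skipped. The sketch below illustrates this with two made-up type names, "strikethrough" and "strike" (assumptions, not types from the question), and shows a list-based variant that compares whole type names instead.

```python
from collections import defaultdict

# Hypothetical entities: "strike" is a substring of "strikethrough".
entities = [
    {"length": 5, "offset": 0, "type": "strikethrough"},
    {"length": 5, "offset": 0, "type": "strike"},
]

# String accumulation: `in` is a substring test, so "strike" is dropped.
as_string = defaultdict(str)
for e in entities:
    key = (e["length"], e["offset"])
    if e["type"] not in as_string[key]:
        if as_string[key]:
            as_string[key] += "_"
        as_string[key] += e["type"]

# List accumulation: `in` compares whole elements, so both types are kept.
as_list = defaultdict(list)
for e in entities:
    key = (e["length"], e["offset"])
    if e["type"] not in as_list[key]:
        as_list[key].append(e["type"])

joined = {key: "_".join(types) for key, types in as_list.items()}
print(dict(as_string))
print(joined)
```

For the types in the question ("bold" and "italic") both variants give the same result; the difference only matters when one type name contains another.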
