Combination of nested dictionaries with arbitrary lengths in python

Time:12-01

I am looking for a function that will take a nested dictionary, and produce the combinations / product of the values.

My question is similar to "Cartesian product of nested dictionaries of lists", but I can't seem to adapt the answers there to fit my needs.

I wish to have an input like this:

d = {
  "country": [1, 2],
  "health_state": [
    {"healthy": [1]},
    {"breast cancer": {"stage": [1, 2]}}
  ]
}

and produce an output as follows:

[
{{"country":1},{"health_state":{"healthy":1}}},
{{"country":2},{"health_state":{"healthy":1}}},
{{"country":1},{"health_state":{"breast cancer":{"stage":1}}}},
{{"country":1},{"health_state":{"breast cancer":{"stage":2}}}},
{{"country":2},{"health_state":{"breast cancer":{"stage":1}}}},
{{"country":2},{"health_state":{"breast cancer":{"stage":2}}}}
]

In this example, the output is a list of 'states' that a person can occupy:

  • no two elements of an input list may appear in the same element of the output list, e.g. someone cannot be in country 1 and country 2 simultaneously
  • all keys of an input dictionary must appear together in the same element of the output list, e.g. someone is in country 1 and also in a health_state. If that health state is 'breast cancer', they are also in stage 1 or stage 2
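
One way to sanity-check these two rules is to count the states they imply: list elements are alternatives, so their counts add, while dictionary keys apply together, so their counts multiply. The following is only a minimal sketch under that reading; count_states is a hypothetical helper name, not something from the question.

```python
from math import prod  # available in Python 3.8+


def count_states(node):
    """Count the states implied by the two rules (hypothetical helper)."""
    if isinstance(node, list):
        # List elements are mutually exclusive alternatives: add their counts.
        return sum(count_states(item) for item in node)
    if isinstance(node, dict):
        # All keys of a dictionary apply together: multiply their counts.
        return prod(count_states(value) for value in node.values())
    # Anything else is a single concrete value.
    return 1


d = {
    "country": [1, 2],
    "health_state": [
        {"healthy": [1]},
        {"breast cancer": {"stage": [1, 2]}},
    ],
}

print(count_states(d))  # 6
```

Here country contributes 2 alternatives and health_state contributes 1 + 2 = 3, giving 2 * 3 = 6, which matches the six entries in the desired output.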

I can envision a solution that requires lots of for loops, checking whether elements are dictionaries, lists or neither, but this seems inefficient, especially for deeply nested dictionaries. I suspect there is a more elegant solution using itertools.product and recursion perhaps?

CodePudding user response:

You can use recursion with itertools.product:

import itertools as it

d = {'country': [1, 2], 'health_state': [{'healthy': [1]}, {'breast cancer': {'stage': [1, 2]}}]}

def c_prod(d):
    if isinstance(d, list):
        # A list holds alternatives: yield each one in turn, recursing into nested containers.
        for i in d:
            yield from ([i] if not isinstance(i, (dict, list)) else c_prod(i))
    else:
        # Otherwise d is assumed to be a dict: combine one expansion of every value.
        for i in it.product(*map(c_prod, d.values())):
            yield dict(zip(d.keys(), i))

print(list(c_prod(d)))

Output:

[{'country': 1, 'health_state': {'healthy': 1}}, 
 {'country': 1, 'health_state': {'breast cancer': {'stage': 1}}}, 
 {'country': 1, 'health_state': {'breast cancer': {'stage': 2}}}, 
 {'country': 2, 'health_state': {'healthy': 1}}, 
 {'country': 2, 'health_state': {'breast cancer': {'stage': 1}}}, 
 {'country': 2, 'health_state': {'breast cancer': {'stage': 2}}}]
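
Note that c_prod treats anything that is not a list as a dictionary, so a bare scalar value such as 'country': 1 would fail on .values(). If such inputs can occur (an assumption, since the question only shows scalars inside lists), a variant with an explicit scalar branch might look like the sketch below. The name expand is hypothetical and this is not the answer's original code.

```python
import itertools


def expand(node):
    """Yield every concrete state described by a nested dict/list structure."""
    if isinstance(node, list):
        # Alternatives: expand each element in turn.
        for item in node:
            yield from expand(item)
    elif isinstance(node, dict):
        # Joint keys: pick one expansion per key and combine them.
        keys = list(node)
        for values in itertools.product(*(expand(node[k]) for k in keys)):
            yield dict(zip(keys, values))
    else:
        # Assumed behavior: a bare scalar stands for exactly one option.
        yield node


# 'country' is a bare scalar here rather than a one-element list.
d_scalar = {
    "country": 1,
    "health_state": [{"healthy": [1]}, {"breast cancer": {"stage": [1, 2]}}],
}

states = list(expand(d_scalar))
print(len(states))  # 3
```

For the original input, where every scalar sits inside a list, this variant should yield the same six dictionaries in the same order as c_prod.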

The code above produces a list of dictionaries, but your desired output looks like a list of lists of dictionaries (list[list[dict]]), so a final transformation can be applied:

r = [[{j:k} for j, k in i.items()] for i in c_prod(d)]

Output:

[[{'country': 1}, {'health_state': {'healthy': 1}}],
 [{'country': 1}, {'health_state': {'breast cancer': {'stage': 1}}}],
 [{'country': 1}, {'health_state': {'breast cancer': {'stage': 2}}}],
 [{'country': 2}, {'health_state': {'healthy': 1}}],
 [{'country': 2}, {'health_state': {'breast cancer': {'stage': 1}}}],
 [{'country': 2}, {'health_state': {'breast cancer': {'stage': 2}}}]]