I am looking for a function that will take a nested dictionary, and produce the combinations / product of the values.
My query is similar to the problem specified here, but I can't seem to adapt the answers their to fit my needs: Cartesian product of nested dictionaries of lists
I wish to have an input like this:
d = {
"country": [1, 2],
"health_state": [
{"healthy": [1]},
{"breast cancer": {"stage": [1, 2]}}
]
}
Produce an output as follows:
[
{{"country":1},{"health state":{"healthy":1}}},
{{"country":2},{"health state":{"healthy":1}}},
{{"country":1},{"health state":{"breast cancer":{"stage":1}}}},
{{"country":1},{"health state":{"breast cancer":{"stage":2}}}},
{{"country":2},{"health state":{"breast cancer":{"stage":1}}}},
{{"country":2},{"health state":{"breast cancer":{"stage":2}}}}
]
In this example, the output is a list of 'states' that a person can occupy
- any two elements of a list (input) should not be in the same element of the returned list (output), e.g. someone cannot be in country 1 and country 2 simultaneously
- all keys in a dictionary (input) should be returned in the same element of the list (output), e.g. someone is in country 1 and also in a health_state. If that health state is 'breast cancer' they are also in stage 1 or stage 2
I can envision a solution that requires lots of for loops, checking whether elements are dictionaries, lists or neither, but this seems inefficient, especially for deeply nested dictionaries. I suspect there is a more elegant solution using itertools.product
and recursion perhaps?
CodePudding user response:
You can use recursion with itertools.product
:
import itertools as it
d = {'country': [1, 2], 'health_state': [{'healthy': [1]}, {'breast cancer': {'stage': [1, 2]}}]}
def c_prod(d):
if isinstance(d, list):
for i in d:
yield from ([i] if not isinstance(i, (dict, list)) else c_prod(i))
else:
for i in it.product(*map(c_prod, d.values())):
yield dict(zip(d.keys(), i))
print(list(c_prod(d)))
Output:
[{'country': 1, 'health_state': {'healthy': 1}},
{'country': 1, 'health_state': {'breast cancer': {'stage': 1}}},
{'country': 1, 'health_state': {'breast cancer': {'stage': 2}}},
{'country': 2, 'health_state': {'healthy': 1}},
{'country': 2, 'health_state': {'breast cancer': {'stage': 1}}},
{'country': 2, 'health_state': {'breast cancer': {'stage': 2}}}]
The output from the code above produces a list of dictionaries, but your desired output mirrors a list of list of dictionaries (list[list[dict]]
), thus, a final transformation can be made:
r = [[{j:k} for j, k in i.items()] for i in c_prod(d)]
Output:
[[{'country': 1}, {'health_state': {'healthy': 1}}], [{'country': 1}, {'health_state': {'breast cancer': {'stage': 1}}}], [{'country': 1}, {'health_state': {'breast cancer': {'stage': 2}}}], [{'country': 2}, {'health_state': {'healthy': 1}}], [{'country': 2}, {'health_state': {'breast cancer': {'stage': 1}}}], [{'country': 2}, {'health_state': {'breast cancer': {'stage': 2}}}]]