Get permutations of dict nested with dicts

Really struggling to find a good solution for this problem.

Assume I have the following dict:

items = {
    "item_name": {
        "apple": {"color": ["red", "blue"]},
        "banana": {"color": ["yellow", "red"]}
        },
    "type": ["ripe", "not_ripe"]
}

and I want to generate the following output:

[
    {"item_name": "apple", "colors": "red", "type": "ripe"},
    {"item_name": "apple", "colors": "blue", "type": "ripe"},
    {"item_name": "apple", "colors": "red", "type": "not_ripe"},
    {"item_name": "apple", "colors": "blue", "type": "not_ripe"},
    {"item_name": "banana", "colors": "yellow", "type": "ripe"},
    {"item_name": "banana", "colors": "red", "type": "ripe"},
    {"item_name": "banana", "colors": "yellow", "type": "not_ripe"},
    {"item_name": "banana", "colors": "red", "type": "not_ripe"},
]

So I want the Cartesian product of item_name x color x type, but the possible values for color differ for each item_name (possibly overlapping), so it is not sufficient to just combine all the dict keys with itertools.product, as described e.g. in this question.

This is different from the problems I have found on SO. As far as I can see, there are questions on resolving nested dicts, but only where the sub-elements are lists (not dicts, as in my question), for example here or here.

I would really like to show something that I have tried, but so far I have abandoned each approach for being too complex.

Is there a straightforward way to achieve the desired result? Changing the structure of items would also be an option, as long as the logic is preserved.

CodePudding user response：

Since each item has its own properties (i.e. "color"), looping over the items is the intuitive solution:

import itertools

items = {
    "item_name": {
        "apple": {"color": ["red", "blue"]},
        "banana": {"color": ["yellow", "red"]}
        },
    "type": ["ripe", "not_ripe"]
}

items_product = []
for item, properties in items["item_name"].items():
    items_product.extend(
        [dict(zip(("type", *properties.keys(), "item_name"), (*values, item)))
         for values in itertools.product(items["type"], *properties.values())])

print(items_product)

A horrendous one-liner is also an option:

items_product = [
    comb for item, properties in items["item_name"].items()
    for comb in (
        dict(
            zip(("item_name", "type", *properties.keys()),
                (item, *values))
        )
         for values in itertools.product(items["type"], *properties.values())
    )
]

I hope you understand this is fertile soil for nightmares.

CodePudding user response：

Use this. I defined a recursive function to get all the combinations from a list of buckets:

def bucket(lst, depth=0):
    for item in lst[0]:
        if len(lst) > 1:
            for result in bucket(lst[1:], depth + 1):
                yield [item] + result
        else:
            yield [item]

items = {
    "item_name": {
        "apple": {"color": ["red", "blue"]},
        "banana": {"color": ["yellow", "red"]}
        },
    "type": ["ripe", "not_ripe"]
}

lst = []
for item, colours in items['item_name'].items():
    combinations = list(bucket([colours['color'], items['type']]))
    for colour, ripeness in combinations:
        lst.append({'item_name': item, 'color': colour, 'type': ripeness})

print(lst)

Output:

[{'color': 'red', 'item_name': 'apple', 'type': 'ripe'},
 {'color': 'red', 'item_name': 'apple', 'type': 'not_ripe'},
 {'color': 'blue', 'item_name': 'apple', 'type': 'ripe'},
 {'color': 'blue', 'item_name': 'apple', 'type': 'not_ripe'},
 {'color': 'yellow', 'item_name': 'banana', 'type': 'ripe'},
 {'color': 'yellow', 'item_name': 'banana', 'type': 'not_ripe'},
 {'color': 'red', 'item_name': 'banana', 'type': 'ripe'},
 {'color': 'red', 'item_name': 'banana', 'type': 'not_ripe'}]
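
As a side note, the recursive bucket helper above yields the same combinations, in the same order, as itertools.product from the standard library (which returns tuples instead of lists). A minimal sketch of the same loop built on it, assuming the items structure from the question; the function name expand is only illustrative:

```python
from itertools import product

items = {
    "item_name": {
        "apple": {"color": ["red", "blue"]},
        "banana": {"color": ["yellow", "red"]}
        },
    "type": ["ripe", "not_ripe"]
}


def expand(items):
    """Return one dict per (item_name, color, type) combination."""
    result = []
    for name, props in items["item_name"].items():
        # product() varies its last argument fastest, just like bucket()
        for colour, ripeness in product(props["color"], items["type"]):
            result.append({"item_name": name, "color": colour, "type": ripeness})
    return result


print(expand(items))
```

This still hardcodes the "color" key, so it only fits data shaped like the example.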

CodePudding user response：

This will work.

If you want it to look really bad and unreadable you could do this:

print([{"item_name": name, "colors": color, "type": typ} for name in list(items["item_name"].keys()) for typ in items["type"] for color in items["item_name"][name]["color"]])

Which is essentially the same as this:

def get_combos(items):
    names = list(items["item_name"].keys())
    types = items["type"]
    current = []
    for name in names:
        for typ in types:
            for color in items["item_name"][name]["color"]:
                current.append({"item_name": name, "colors": color, "type": typ})
    return current

print(get_combos(items))

OUTPUT

[
  {'item_name': 'apple', 'colors': 'red', 'type': 'ripe'}, 
  {'item_name': 'apple', 'colors': 'blue', 'type': 'ripe'}, 
  {'item_name': 'apple', 'colors': 'red', 'type': 'not_ripe'}, 
  {'item_name': 'apple', 'colors': 'blue', 'type': 'not_ripe'}, 
  {'item_name': 'banana', 'colors': 'yellow', 'type': 'ripe'}, 
  {'item_name': 'banana', 'colors': 'red', 'type': 'ripe'}, 
  {'item_name': 'banana', 'colors': 'yellow', 'type': 'not_ripe'}, 
  {'item_name': 'banana', 'colors': 'red', 'type': 'not_ripe'}
]

CodePudding user response：

First, let T denote the structure of the dictionary you provided. Each value in T is either a list or a dictionary, and the values of that inner dictionary must themselves be of type T:

T = dict[str, dict[str, 'T'] | list[str]]

With the recursive definition of the type clear, we can easily (maybe not; I'll explain it when I have spare time) use a recursive function to solve it:

from functools import reduce
from operator import ior
from itertools import product


def flat(mapping: T) -> list[dict[str, str]]:
    collection = [[{k: vk} | mp for vk, vv in v.items() for mp in flat(vv)]
                  if isinstance(v, dict)
                  else [{k: elem} for elem in v]
                  for k, v in mapping.items()]
    return [reduce(ior, mappings, {}) for mappings in product(*collection)]

Test:

>>> items = {
...     "item_name": {
...         "apple": {"color": ["red", "blue"]},
...         "banana": {"color": ["yellow", "red"]}
...         },
...     "type": ["ripe", "not_ripe"]
... }
>>> from pprint import pp
>>> pp(flat(items))
[{'item_name': 'apple', 'color': 'red', 'type': 'ripe'},
 {'item_name': 'apple', 'color': 'red', 'type': 'not_ripe'},
 {'item_name': 'apple', 'color': 'blue', 'type': 'ripe'},
 {'item_name': 'apple', 'color': 'blue', 'type': 'not_ripe'},
 {'item_name': 'banana', 'color': 'yellow', 'type': 'ripe'},
 {'item_name': 'banana', 'color': 'yellow', 'type': 'not_ripe'},
 {'item_name': 'banana', 'color': 'red', 'type': 'ripe'},
 {'item_name': 'banana', 'color': 'red', 'type': 'not_ripe'}]
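
For readers who find the comprehension in flat hard to follow, here is a rough loop-based sketch of the same idea. It is meant to behave like flat above, not to replace it; the name flat_verbose and the variable names are made up for illustration:

```python
from itertools import product

items = {
    "item_name": {
        "apple": {"color": ["red", "blue"]},
        "banana": {"color": ["yellow", "red"]}
        },
    "type": ["ripe", "not_ripe"]
}


def flat_verbose(mapping):
    """Expand a nested dict/list structure into a list of flat dicts."""
    per_key = []  # one list of partial dicts for each key of `mapping`
    for key, value in mapping.items():
        partials = []
        if isinstance(value, dict):
            # Each sub-key is one choice for `key`; its value is again a
            # nested mapping that is expanded recursively.
            for choice, nested in value.items():
                for sub in flat_verbose(nested):
                    partials.append({key: choice, **sub})
        else:
            # A plain list: every element is one choice for `key`.
            for elem in value:
                partials.append({key: elem})
        per_key.append(partials)

    # Pick one partial dict per key and merge each selection into one dict.
    combined = []
    for parts in product(*per_key):
        merged = {}
        for part in parts:
            merged.update(part)
        combined.append(merged)
    return combined


print(flat_verbose(items))
```

The recursion handles the dependent values (colors per item), and the final product combines the independent keys (item_name and type).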