I'm really struggling to find a good solution for this problem.
Assume I have the following dict:
items = {
    "item_name": {
        "apple": {"color": ["red", "blue"]},
        "banana": {"color": ["yellow", "red"]}
    },
    "type": ["ripe", "not_ripe"]
}
I want to generate the following output:
[
{"item_name": "apple", "colors": "red", "type": "ripe"},
{"item_name": "apple", "colors": "blue", "type": "ripe"},
{"item_name": "apple", "colors": "red", "type": "not_ripe"},
{"item_name": "apple", "colors": "blue", "type": "not_ripe"},
{"item_name": "banana", "colors": "yellow", "type": "ripe"},
{"item_name": "banana", "colors": "red", "type": "ripe"},
{"item_name": "banana", "colors": "yellow", "type": "not_ripe"},
{"item_name": "banana", "colors": "red", "type": "not_ripe"},
]
So I want the Cartesian product of item_name x color x type, but the possible values for color are different for each item_name (possibly overlapping), so it is not sufficient to just permute all the dict keys with itertools.product, as described e.g. in this question.
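To make the problem concrete, here is a small sketch of the naive approach, assuming the colors of all items are pooled into one list first (the names all_colors, naive and invalid are made up for this illustration). It produces combinations that the dict does not allow, such as a yellow apple:

```python
from itertools import product

items = {
    "item_name": {
        "apple": {"color": ["red", "blue"]},
        "banana": {"color": ["yellow", "red"]},
    },
    "type": ["ripe", "not_ripe"],
}

names = list(items["item_name"])
# Pool every color seen on any item, in first-seen order (dict.fromkeys dedupes).
all_colors = list(dict.fromkeys(
    color for props in items["item_name"].values() for color in props["color"]
))

# Plain product over the three axes, ignoring which color belongs to which item.
naive = [
    {"item_name": name, "colors": color, "type": typ}
    for name, color, typ in product(names, all_colors, items["type"])
]

# Combinations whose color is not actually listed for that item.
invalid = [
    combo for combo in naive
    if combo["colors"] not in items["item_name"][combo["item_name"]]["color"]
]

print(len(naive), len(invalid))  # 12 4
```

Four of the twelve results (yellow apples and blue bananas, each in both types) would have to be filtered out again afterwards.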
This is different from the problems I have found on SO. As far as I can see, there are questions on resolving nested dicts, but only where the sub-elements are lists (not dicts, as is the case in my question), for example here or here.
I would really like to show something that I have tried, but so far I have abandoned each approach for being too complex.
Is there a straightforward way to achieve the desired result?
It would also be an option to change the structure of items in a way that preserves the logic.
CodePudding user response:
Since each item has its own properties (i.e. 'color'), looping over the items is the intuitive solution:
import itertools

items = {
    "item_name": {
        "apple": {"color": ["red", "blue"]},
        "banana": {"color": ["yellow", "red"]}
    },
    "type": ["ripe", "not_ripe"]
}

items_product = []
for item, properties in items["item_name"].items():
    items_product.extend(
        [dict(zip(("type", *properties.keys(), "item_name"), (*values, item)))
         for values in itertools.product(items["type"], *properties.values())])

print(items_product)
A horrendous one-liner is also an option:
items_product = [
    comb for item, properties in items["item_name"].items()
    for comb in (
        dict(
            zip(("item_name", "type", *properties.keys()),
                (item, *values))
        )
        for values in itertools.product(items["type"], *properties.values())
    )
]
I hope you understand this is a fertile soil for having nightmares.
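If neither variant reads well, the same per-item loop can be wrapped in a generator. This is only a sketch under the same assumptions as above (every item maps property names to lists of values); the function name expand is made up here, and the output key is "color" as in the input dict rather than "colors":

```python
from itertools import product


def expand(items):
    """Yield one flat dict per valid (item_name, properties..., type) combination."""
    for name, properties in items["item_name"].items():
        keys = list(properties)  # property names of this item, e.g. ["color"]
        # Product of the shared types with this item's own property values.
        for typ, *values in product(items["type"], *properties.values()):
            yield {"item_name": name, **dict(zip(keys, values)), "type": typ}


items = {
    "item_name": {
        "apple": {"color": ["red", "blue"]},
        "banana": {"color": ["yellow", "red"]},
    },
    "type": ["ripe", "not_ripe"],
}

result = list(expand(items))
print(result[0])  # {'item_name': 'apple', 'color': 'red', 'type': 'ripe'}
```

Because the type is the outermost factor of the product, the results come out in the order asked for in the question (all ripe colors of an item first, then all not_ripe ones).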
CodePudding user response:
Use this. I defined a recursive function to get all the combinations from a list of buckets:
def bucket(lst, depth=0):
    # Yield every combination that takes one element from each bucket in lst.
    for item in lst[0]:
        if len(lst) > 1:
            for result in bucket(lst[1:], depth + 1):
                yield [item] + result
        else:
            yield [item]

items = {
    "item_name": {
        "apple": {"color": ["red", "blue"]},
        "banana": {"color": ["yellow", "red"]}
    },
    "type": ["ripe", "not_ripe"]
}

lst = []
for item, colours in items['item_name'].items():
    combinations = list(bucket([colours['color'], items['type']]))
    for colour, ripeness in combinations:
        lst.append({'item_name': item, 'color': colour, 'type': ripeness})

print(lst)
Output:
[{'color': 'red', 'item_name': 'apple', 'type': 'ripe'},
{'color': 'red', 'item_name': 'apple', 'type': 'not_ripe'},
{'color': 'blue', 'item_name': 'apple', 'type': 'ripe'},
{'color': 'blue', 'item_name': 'apple', 'type': 'not_ripe'},
{'color': 'yellow', 'item_name': 'banana', 'type': 'ripe'},
{'color': 'yellow', 'item_name': 'banana', 'type': 'not_ripe'},
{'color': 'red', 'item_name': 'banana', 'type': 'ripe'},
{'color': 'red', 'item_name': 'banana', 'type': 'not_ripe'}]
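As a side note, the recursive bucket generator appears to yield the same combinations, in the same order, as itertools.product applied to the buckets. A quick sketch to check that, with the unused depth argument dropped and the names buckets, recursive and builtin invented for the comparison:

```python
from itertools import product


def bucket(lst):
    # Recursive combination generator: one element from each bucket in lst.
    for item in lst[0]:
        if len(lst) > 1:
            for result in bucket(lst[1:]):
                yield [item] + result
        else:
            yield [item]


buckets = [["red", "blue"], ["ripe", "not_ripe"]]

recursive = list(bucket(buckets))
# product yields tuples, so convert them to lists before comparing.
builtin = [list(combo) for combo in product(*buckets)]

print(recursive == builtin)  # True
```

So in the loop above, bucket([colours['color'], items['type']]) could probably be swapped for itertools.product(colours['color'], items['type']) without changing the result.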
CodePudding user response:
This will work.
If you want it to look really bad and unreadable, you could do this:
print([{"item_name": name, "colors": color, "type": typ} for name in list(items["item_name"].keys()) for typ in items["type"] for color in items["item_name"][name]["color"]])
Which is essentially the same as this:
def get_combos(items):
    names = list(items["item_name"].keys())
    types = items["type"]
    current = []
    for name in names:
        for typ in types:
            for color in items["item_name"][name]["color"]:
                current.append({"item_name": name, "colors": color, "type": typ})
    return current

print(get_combos(items))
OUTPUT
[
{'item_name': 'apple', 'colors': 'red', 'type': 'ripe'},
{'item_name': 'apple', 'colors': 'blue', 'type': 'ripe'},
{'item_name': 'apple', 'colors': 'red', 'type': 'not_ripe'},
{'item_name': 'apple', 'colors': 'blue', 'type': 'not_ripe'},
{'item_name': 'banana', 'colors': 'yellow', 'type': 'ripe'},
{'item_name': 'banana', 'colors': 'red', 'type': 'ripe'},
{'item_name': 'banana', 'colors': 'yellow', 'type': 'not_ripe'},
{'item_name': 'banana', 'colors': 'red', 'type': 'not_ripe'}
]
CodePudding user response:
First, we define the structure of the dictionary you provided and call it T. We can see that a value in T can be a list or a dictionary, and the values of that inner dictionary must themselves be of type T:
T = dict[str, dict[str, 'T'] | list[str]]
Once the recursive definition of the type is clear, we can easily (maybe not, I'll explain it in my spare time) use a recursive function to solve it:
from functools import reduce
from operator import ior
from itertools import product

def flat(mapping: T) -> list[dict[str, str]]:
    collection = [[{k: vk} | mp for vk, vv in v.items() for mp in flat(vv)]
                  if isinstance(v, dict)
                  else [{k: elem} for elem in v]
                  for k, v in mapping.items()]
    return [reduce(ior, mappings, {}) for mappings in product(*collection)]
Test:
>>> items = {
...     "item_name": {
...         "apple": {"color": ["red", "blue"]},
...         "banana": {"color": ["yellow", "red"]}
...     },
...     "type": ["ripe", "not_ripe"]
... }
>>> from pprint import pp
>>> pp(flat(items))
[{'item_name': 'apple', 'color': 'red', 'type': 'ripe'},
{'item_name': 'apple', 'color': 'red', 'type': 'not_ripe'},
{'item_name': 'apple', 'color': 'blue', 'type': 'ripe'},
{'item_name': 'apple', 'color': 'blue', 'type': 'not_ripe'},
{'item_name': 'banana', 'color': 'yellow', 'type': 'ripe'},
{'item_name': 'banana', 'color': 'yellow', 'type': 'not_ripe'},
{'item_name': 'banana', 'color': 'red', 'type': 'ripe'},
{'item_name': 'banana', 'color': 'red', 'type': 'not_ripe'}]
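Since the explanation of the recursion was left open, here is an unrolled sketch of what flat seems to do, written with explicit loops. It is not the answer's own code: the name flat_verbose and the variable names are invented, and it assumes every value is either a list of strings or a dict of nested mappings.

```python
from itertools import product


def flat_verbose(mapping):
    """Unrolled sketch of the recursive flattening idea (hypothetical helper)."""
    per_key = []  # one list of partial dicts for each key of `mapping`
    for key, value in mapping.items():
        if isinstance(value, dict):
            # Each sub-key is one choice for `key`; recurse into the nested
            # mapping it carries and attach the choice to every nested result.
            partials = [
                {key: choice, **rest}
                for choice, nested in value.items()
                for rest in flat_verbose(nested)
            ]
        else:
            # A plain list: every element is one choice for `key`.
            partials = [{key: elem} for elem in value]
        per_key.append(partials)

    combined = []
    # Pick one partial dict per key and merge them into a single flat dict.
    for parts in product(*per_key):
        merged = {}
        for part in parts:
            merged.update(part)  # same effect as folding with operator.ior
        combined.append(merged)
    return combined


items = {
    "item_name": {
        "apple": {"color": ["red", "blue"]},
        "banana": {"color": ["yellow", "red"]},
    },
    "type": ["ripe", "not_ripe"],
}

result = flat_verbose(items)
print(result[0])  # {'item_name': 'apple', 'color': 'red', 'type': 'ripe'}
```

For the example dict, the partials for "item_name" are the four name/color pairs, the partials for "type" are the two types, and the final product merges one of each, which gives the eight rows shown above.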