Map dicts of different length together for consistent formatting

I have some dictionaries that I want to map to a uniform format:

PARTS = [{"part": 23, "subparts": [{"subpart": 1, "image": "first"},
                                   {"subpart": 3, "image": "third"}, 
                                   {"subpart": 5, "image": "fifth"}]}, 
         {"part": 12, "subparts": [{"subpart": 4, "image": "FOURTH"}, 
                                   {"subpart": 3, "image": "THIRD"}, 
                                   {"subpart": 5, "image": "FIFTH"}]} ..]

I want to reorder the data so that it looks like this:

PARTS_REORDERED = [{"part": 23,
                    "subpart_1": {"image": "first"},
                    "subpart_2": {"image": None},
                    "subpart_3": {"image": "third"},
                    "subpart_4": {"image": None},
                    "subpart_5": {"image": "fifth"},
                   },
                   {
                    "part": 12,
                    "subpart_1": {"image": None},
                    "subpart_2": {"image": None},
                    "subpart_3": {"image": "THIRD"},
                    "subpart_4": {"image": "FOURTH"},
                    "subpart_5": {"image": "FIFTH"},
                   }]

I have a working approach, but I dislike it a lot because I find it hard to read after not looking at the code for a while:

for i in PARTS:
    PART = {}
    PART["part"] = i["part"]
    SUBPARTS = sorted(i["subparts"], key = lambda k: k["subpart"])
    for j in range(1, 6):
        try:
            PART[f"subpart_{j}"] = next({"image": p["image"]} for p in SUBPARTS if p["subpart"] == j)
        except StopIteration:  # next() found no subpart with index j
            PART[f"subpart_{j}"] = {"image": None}

    print(PART)

results in:

{'part': 23, 'subpart_1': {'image': 'first'}, 
             'subpart_2': {'image': None}, 
             'subpart_3': {'image': 'third'}, 
             'subpart_4': {'image': None}, 
             'subpart_5': {'image': 'fifth'}},
{'part': 12, 'subpart_1': {'image': None}, 
             'subpart_2': {'image': None}, 
             'subpart_3': {'image': 'THIRD'}, 
             'subpart_4': {'image': 'FOURTH'},
             'subpart_5': {'image': 'FIFTH'}}

I tried a couple of things, especially to avoid the next(). But maybe there is an altogether easier-to-read approach?

CodePudding user response：

You can split the creation of the output list into two steps:

1.) Create a temporary helper dictionary whose keys are (part, subpart) tuples and whose values are the image values.

2.) Build the output with a list comprehension that uses dict.get, which returns None by default when a key is missing.

For example:

tmp = {
    (p["part"], s["subpart"]): s["image"] for p in PARTS for s in p["subparts"]
}

out = [
    {
        "part": p["part"],
        **{
            f"subpart_{i}": {"image": tmp.get((p["part"], i))}
            for i in range(1, 6)
        },
    }
    for p in PARTS
]
print(out)

Prints:

[
    {
        "part": 23,
        "subpart_1": {"image": "first"},
        "subpart_2": {"image": None},
        "subpart_3": {"image": "third"},
        "subpart_4": {"image": None},
        "subpart_5": {"image": "fifth"},
    },
    {
        "part": 12,
        "subpart_1": {"image": None},
        "subpart_2": {"image": None},
        "subpart_3": {"image": "THIRD"},
        "subpart_4": {"image": "FOURTH"},
        "subpart_5": {"image": "FIFTH"},
    },
]

EDIT: Removing some comprehensions:

tmp = {
    (p["part"], s["subpart"]): s["image"] for p in PARTS for s in p["subparts"]
}

out = []
for p in PARTS:
    out.append({"part": p["part"]})
    for i in range(1, 6):
        out[-1][f"subpart_{i}"] = {"image": tmp.get((p["part"], i))}
print(out)
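
A variant of the same idea, in case part numbers can repeat or the number of subparts should be configurable: build the lookup per part instead of one global tmp dictionary. This is only a sketch; the function name normalize_part and the parameter n_subparts are made up for illustration and are not part of the question.

```python
PARTS = [
    {"part": 23, "subparts": [{"subpart": 1, "image": "first"},
                              {"subpart": 3, "image": "third"},
                              {"subpart": 5, "image": "fifth"}]},
    {"part": 12, "subparts": [{"subpart": 4, "image": "FOURTH"},
                              {"subpart": 3, "image": "THIRD"},
                              {"subpart": 5, "image": "FIFTH"}]},
]


def normalize_part(part, n_subparts=5):
    # Hypothetical helper: map subpart number -> image for this part only.
    images = {s["subpart"]: s["image"] for s in part["subparts"]}

    row = {"part": part["part"]}
    for i in range(1, n_subparts + 1):
        # dict.get returns None for subpart numbers that are absent.
        row[f"subpart_{i}"] = {"image": images.get(i)}
    return row


out = [normalize_part(p) for p in PARTS]
print(out)
```

Because the lookup is local to each part, two entries with the same part number cannot overwrite each other's images, and the input is not modified.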

CodePudding user response：

I propose a solution that handles an arbitrary number of indices and fills in the gaps between them.

Additionally, I propose using a factory to create the values for missing subparts.

Another big plus is that the subparts are never sorted. Note that the code pops the "subpart" key, so it modifies the input dictionaries:

PARTS = [
    {
        "part": 23,
        "subparts": [
            {"subpart": 1, "image": "first"},
            {"subpart": 3, "image": "third"},
            {"subpart": 5, "image": "fifth"}]},
    {
        "part": 12,
        "subparts": [
            {"subpart": 4, "image": "FOURTH"},
            {"subpart": 3, "image": "THIRD"},
            {"subpart": 7, "image": "FIFTH"},
            {"subpart": 6, "image": "FIFTH"},
            {"subpart": 10, "image": "FIFTH"}]}
]


def missing_subparts(last_ind, subpart_ind, factory):
    for ind in range(last_ind, subpart_ind):
        print("missing", ind)
        yield f"subpart_{ind}", factory()  # key, value


def process_part(dictionary, factory):
    result = {"part": dictionary["part"]}

    last_ind = 1  # first index will be 1
    for subpart in dictionary["subparts"]:
        subpart_ind = subpart.pop("subpart")  # remove the index (mutates the input)

        if subpart_ind >= last_ind:  # skip if we already filled this part
            result.update(missing_subparts(last_ind, subpart_ind, factory))
            last_ind = subpart_ind + 1  # subpart_ind will be filled with subpart

        result[f"subpart_{subpart_ind}"] = subpart  # use index-less dict
        print("not missing", subpart_ind)

    return result


for part in PARTS:
    part = process_part(part, lambda: {"image": None})

    print("\n{")
    for name, value in part.items():
        print(f"    {name!r}: {value}")
    print("}\n")

Outputs:

not missing 1
missing 2
not missing 3
missing 4
not missing 5

{
    'part': 23
    'subpart_1': {'image': 'first'}
    'subpart_2': {'image': None}
    'subpart_3': {'image': 'third'}
    'subpart_4': {'image': None}
    'subpart_5': {'image': 'fifth'}
}

missing 1
missing 2
missing 3
not missing 4
not missing 3
missing 5
missing 6
not missing 7
not missing 6
missing 8
missing 9
not missing 10

{
    'part': 12
    'subpart_1': {'image': None}
    'subpart_2': {'image': None}
    'subpart_3': {'image': 'THIRD'}
    'subpart_4': {'image': 'FOURTH'}
    'subpart_5': {'image': None}
    'subpart_6': {'image': 'FIFTH'}
    'subpart_7': {'image': 'FIFTH'}
    'subpart_8': {'image': None}
    'subpart_9': {'image': None}
    'subpart_10': {'image': 'FIFTH'}
}

If we always want at least 5 subparts (or exactly 5, when the original data never has more), we can iterate over the parts and append the missing subparts up to the 5th where needed, or even initialize each part with a 5th subpart already present.
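
The padding step described above could be sketched roughly like this. The helper name pad_subparts and its minimum parameter are hypothetical, and the sketch assumes the input is a result dict whose subpart keys already run contiguously from subpart_1 upward, which is what the function above produces.

```python
def pad_subparts(result, minimum=5, factory=lambda: {"image": None}):
    # Hypothetical helper: return a copy with subpart_1..subpart_<minimum> present.
    padded = dict(result)  # shallow copy, the caller's dict is left untouched
    for i in range(1, minimum + 1):
        key = f"subpart_{i}"
        if key not in padded:
            # New keys are appended at the end, so the order stays ascending
            # as long as the existing subparts were contiguous.
            padded[key] = factory()
    return padded


short = {
    "part": 7,
    "subpart_1": {"image": "a"},
    "subpart_2": {"image": None},
    "subpart_3": {"image": "c"},
}
padded = pad_subparts(short)
print(padded)
```

Results that already have 5 or more subparts pass through unchanged, so the helper can be applied to every part without checking its length first.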