Map dicts of different length together for consistent formatting

Time:09-26

I have some dictionaries that I want to map to a uniform format:

PARTS = [{"part": 23, "subparts": [{"subpart": 1, "image": "first"},
                                   {"subpart": 3, "image": "third"}, 
                                   {"subpart": 5, "image": "fifth"}]}, 
         {"part": 12, "subparts": [{"subpart": 4, "image": "FOURTH"}, 
                                   {"subpart": 3, "image": "THIRD"}, 
                                   {"subpart": 5, "image": "FIFTH"}]} ..]

I want to restructure them so that the result looks like this:

PARTS_REORDERED = [{"part": 23,
                    "subpart_1": {"image": "first"},
                    "subpart_2": {"image": None},
                    "subpart_3": {"image": "third"},
                    "subpart_4": {"image": None},
                    "subpart_5": {"image": "fifth"},
                   },
                   {
                    "part": 12,
                    "subpart_1": {"image": None},
                    "subpart_2": {"image": None},
                    "subpart_3": {"image": "THIRD"},
                    "subpart_4": {"image": "FOURTH"},
                    "subpart_5": {"image": "FIFTH"},
                   }]

I have a working approach, but I dislike it because it is hard to read when I come back to the code after some time:

for i in PARTS:
    PART = {}
    PART["part"] = i["part"]
    SUBPARTS = sorted(i["subparts"], key = lambda k: k["subpart"])
    for j in range(1, 6):
        try:
            PART[f"subpart_{j}"] = next({"image": p["image"]} for p in SUBPARTS if p["subpart"] == j)
        except StopIteration:  # next() found no subpart with number j
            PART[f"subpart_{j}"] = {"image": None}

    print(PART)

results in:

{'part': 23, 'subpart_1': {'image': 'first'},
             'subpart_2': {'image': None},
             'subpart_3': {'image': 'third'},
             'subpart_4': {'image': None},
             'subpart_5': {'image': 'fifth'}},
{'part': 12, 'subpart_1': {'image': None},
             'subpart_2': {'image': None},
             'subpart_3': {'image': 'THIRD'},
             'subpart_4': {'image': 'FOURTH'},
             'subpart_5': {'image': 'FIFTH'}}

I tried a couple of things, mainly to avoid the next() call. Is there an approach that is easier to read?

CodePudding user response:

You can split the creation of the output list into two steps:

1.) Create a temporary helper dictionary whose keys are (part, subpart) tuples and whose values are the image values

2.) Build the output with a list comprehension that uses dict.get (which returns None by default when a key is missing)

For example:

tmp = {
    (p["part"], s["subpart"]): s["image"] for p in PARTS for s in p["subparts"]
}

out = [
    {
        "part": p["part"],
        **{
            f"subpart_{i}": {"image": tmp.get((p["part"], i))}
            for i in range(1, 6)
        },
    }
    for p in PARTS
]
print(out)

Prints:

[
    {
        "part": 23,
        "subpart_1": {"image": "first"},
        "subpart_2": {"image": None},
        "subpart_3": {"image": "third"},
        "subpart_4": {"image": None},
        "subpart_5": {"image": "fifth"},
    },
    {
        "part": 12,
        "subpart_1": {"image": None},
        "subpart_2": {"image": None},
        "subpart_3": {"image": "THIRD"},
        "subpart_4": {"image": "FOURTH"},
        "subpart_5": {"image": "FIFTH"},
    },
]

EDIT: Removing some comprehensions:

tmp = {
    (p["part"], s["subpart"]): s["image"] for p in PARTS for s in p["subparts"]
}

out = []
for p in PARTS:
    out.append({"part": p["part"]})
    for i in range(1, 6):
        out[-1][f"subpart_{i}"] = {"image": tmp.get((p["part"], i))}
print(out)
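
If the tuple keys feel indirect, a possible variation builds one small lookup per part instead. This is only a minimal sketch, assuming the same data layout as in the question; the names normalize_parts, max_subpart and SAMPLE_PARTS are made up for illustration:

```python
def normalize_parts(parts, max_subpart=5):
    """Return one flat dict per part with keys subpart_1..subpart_<max_subpart>."""
    out = []
    for p in parts:
        # Per-part lookup: subpart number -> image value
        images = {s["subpart"]: s["image"] for s in p["subparts"]}
        row = {"part": p["part"]}
        for i in range(1, max_subpart + 1):
            # dict.get returns None for subpart numbers that are absent
            row[f"subpart_{i}"] = {"image": images.get(i)}
        out.append(row)
    return out


# Hypothetical sample data in the same shape as PARTS from the question
SAMPLE_PARTS = [
    {"part": 23, "subparts": [{"subpart": 1, "image": "first"},
                              {"subpart": 3, "image": "third"},
                              {"subpart": 5, "image": "fifth"}]},
    {"part": 12, "subparts": [{"subpart": 4, "image": "FOURTH"},
                              {"subpart": 3, "image": "THIRD"},
                              {"subpart": 5, "image": "FIFTH"}]},
]

result = normalize_parts(SAMPLE_PARTS)
print(result)
```

Because each lookup is local to one entry, two entries that happen to share the same part number would not overwrite each other's images.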

CodePudding user response:

I propose a solution that handles an arbitrary number of subpart indices and fills the gaps between them.

Additionally, I propose using a factory to create the values for missing subparts.

A big plus here is that the subparts do not need to be sorted first.

PARTS = [
    {
        "part": 23,
        "subparts": [
            {"subpart": 1, "image": "first"},
            {"subpart": 3, "image": "third"},
            {"subpart": 5, "image": "fifth"}]},
    {
        "part": 12,
        "subparts": [
            {"subpart": 4, "image": "FOURTH"},
            {"subpart": 3, "image": "THIRD"},
            {"subpart": 7, "image": "FIFTH"},
            {"subpart": 6, "image": "FIFTH"},
            {"subpart": 10, "image": "FIFTH"}]}
]


def missing_subparts(last_ind, subpart_ind, factory):
    for ind in range(last_ind, subpart_ind):
        print("missing", ind)
        yield f"subpart_{ind}", factory()  # key, value


def process_part(dictionary, factory):
    result = {"part": dictionary["part"]}

    last_ind = 1  # first index will be 1
    for subpart in dictionary["subparts"]:
        subpart_ind = subpart.pop("subpart")  # remove the index (mutates the input dict)

        if subpart_ind >= last_ind:  # skip if we already filled this part
            result.update(missing_subparts(last_ind, subpart_ind, factory))
            last_ind = subpart_ind + 1  # subpart_ind is about to be filled with subpart

        result[f"subpart_{subpart_ind}"] = subpart  # use index-less dict
        print("not missing", subpart_ind)

    return result


for part in PARTS:
    part = process_part(part, lambda: {"image": None})

    print("\n{")
    for name, value in part.items():
        print(f"    {name!r}: {value}")
    print("}\n")

Outputs:

not missing 1
missing 2
not missing 3
missing 4
not missing 5

{
    'part': 23
    'subpart_1': {'image': 'first'}
    'subpart_2': {'image': None}
    'subpart_3': {'image': 'third'}
    'subpart_4': {'image': None}
    'subpart_5': {'image': 'fifth'}
}

missing 1
missing 2
missing 3
not missing 4
not missing 3
missing 5
missing 6
not missing 7
not missing 6
missing 8
missing 9
not missing 10

{
    'part': 12
    'subpart_1': {'image': None}
    'subpart_2': {'image': None}
    'subpart_3': {'image': 'THIRD'}
    'subpart_4': {'image': 'FOURTH'}
    'subpart_5': {'image': None}
    'subpart_6': {'image': 'FIFTH'}
    'subpart_7': {'image': 'FIFTH'}
    'subpart_8': {'image': None}
    'subpart_9': {'image': None}
    'subpart_10': {'image': 'FIFTH'}
}

If we always want at least 5 subparts (or exactly 5, if the original data never has more), we can iterate over the processed parts and append the missing subparts up to the 5th where needed, or initialize each part with a 5th subpart already present.
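
The padding step described above could look roughly like this. It is a sketch under assumptions rather than part of the original answer: the helper name pad_subparts and its minimum parameter are hypothetical, and the input is assumed to be a flat dict as returned by the per-part function above.

```python
def pad_subparts(result, minimum=5, factory=lambda: {"image": None}):
    """Add any missing subpart_<n> keys for n in 1..minimum; keep existing ones."""
    for ind in range(1, minimum + 1):
        key = f"subpart_{ind}"
        if key not in result:
            # Only call the factory for keys that are actually missing
            result[key] = factory()
    return result


# Hypothetical processed part that stops at subpart_3
part = {
    "part": 23,
    "subpart_1": {"image": "first"},
    "subpart_2": {"image": None},
    "subpart_3": {"image": "third"},
}

padded = pad_subparts(part)
print(padded)
```

Parts that already have more than 5 subparts are left unchanged, so this only guarantees a minimum and never trims anything.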
