Python 3 - dynamically nest dicts into lists given unknown number of categories

Time:11-11

Variable tsv_data has the following structure:

[
{'id':1,'name':'bob','type':'blue','size':2},
{'id':2,'name':'bob','type':'blue','size':3},
{'id':3,'name':'bob','type':'blue','size':4},
{'id':4,'name':'bob','type':'red','size':2},
{'id':5,'name':'sarah','type':'blue','size':2},
{'id':6,'name':'sarah','type':'blue','size':3},
{'id':7,'name':'sarah','type':'green','size':2},
{'id':8,'name':'jack','type':'blue','size':5},
]

Which I would like to restructure into:

[
{'name':'bob', 'children':[
    {'name':'blue','children':[
        {'id':1, 'size':2},
        {'id':2, 'size':3},
        {'id':3, 'size':4}
    ]},
    {'name':'red','children':[
        {'id':4, 'size':2}
    ]}
  ]},
{'name':'sarah', 'children':[
    {'name':'blue','children':[
        {'id':5, 'size':2},
        {'id':6, 'size':3},
    ]},
    {'name':'green','children':[
        {'id':7, 'size':2}
    ]}
  ]},
{'name':'jack', 'children':[
    {'name':'blue', 'children':[
        {'id':8, 'size':5}
    ]}
  ]}
]

What is blocking me is that I don't know how many items will be in the children list for each major category. Likewise, I don't know which categories will be present: it could be blue, green, or red, all three or any combination (for example only red and green, or only green).

Question

How might we devise a fool-proof way to compile the flat list of dicts in tsv_data into a multi-tier hierarchical data structure like the one above?

CodePudding user response:

Given your major categories as a list:

categories = ['name', 'type']

You can first transform the input data into a nested dict of lists. Looking up children by key in that form is easier and more efficient than in your desired output format, a nested list of dicts:

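tree = {}
for record in tsv_data:
    record = dict(record)  # shallow copy so that pop() does not mutate tsv_data
    node = tree
    # walk down (creating as needed) one dict level per category except the last
    for category in categories[:-1]:
        node = node.setdefault(record.pop(category), {})
    # the last category maps to a list holding the remaining fields of each record
    node.setdefault(record.pop(categories[-1]), []).append(record)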

tree would become:

{'bob': {'blue': [{'id': 1, 'size': 2}, {'id': 2, 'size': 3}, {'id': 3, 'size': 4}], 'red': [{'id': 4, 'size': 2}]}, 'sarah': {'blue': [{'id': 5, 'size': 2}, {'id': 6, 'size': 3}], 'green': [{'id': 7, 'size': 2}]}, 'jack': {'blue': [{'id': 8, 'size': 5}]}}
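To illustrate why the intermediate dict form is convenient, the sketch below compares one lookup in both shapes. The helper find_child is hypothetical and exists only for this comparison; the data is a small excerpt of the structures shown in this thread.

```python
# Intermediate form: nested dicts, children reachable directly by key.
tree = {'bob': {'blue': [{'id': 1, 'size': 2}], 'red': [{'id': 4, 'size': 2}]}}

# Desired output form: nested lists of {'name', 'children'} dicts.
nested = [
    {'name': 'bob', 'children': [
        {'name': 'blue', 'children': [{'id': 1, 'size': 2}]},
        {'name': 'red', 'children': [{'id': 4, 'size': 2}]},
    ]},
]


def find_child(nodes, name):
    """Hypothetical helper: linear scan for the entry whose 'name' matches."""
    for entry in nodes:
        if entry['name'] == name:
            return entry['children']
    raise KeyError(name)


# Dict form: two key lookups.
from_dict = tree['bob']['red']
# List form: two linear scans, one per level.
from_list = find_child(find_child(nested, 'bob'), 'red')
```

Both expressions reach the same leaf list, but the list form has to scan every sibling at each level, which is why the grouping step is done on dicts first.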

You can then transform the nested dict to your desired output structure with a recursive function:

def transform(node):
    if isinstance(node, dict):
        return [
            {'name': name, 'children': transform(child)}
            for name, child in node.items()
        ]
    return node

so that transform(tree) would return:

[{'name': 'bob', 'children': [{'name': 'blue', 'children': [{'id': 1, 'size': 2}, {'id': 2, 'size': 3}, {'id': 3, 'size': 4}]}, {'name': 'red', 'children': [{'id': 4, 'size': 2}]}]}, {'name': 'sarah', 'children': [{'name': 'blue', 'children': [{'id': 5, 'size': 2}, {'id': 6, 'size': 3}]}, {'name': 'green', 'children': [{'id': 7, 'size': 2}]}]}, {'name': 'jack', 'children': [{'name': 'blue', 'children': [{'id': 8, 'size': 5}]}]}]
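The two steps can also be wrapped into one helper that takes the category list as a parameter, which is what lets the number of levels vary. This is a minimal sketch, not part of the original answer: the name nest_records is hypothetical, and it assumes that categories has at least one entry and that every record contains every category key. Each record is copied so that the input list is left untouched.

```python
def nest_records(records, categories):
    """Group flat dicts into nested {'name', 'children'} lists, one level per category."""
    tree = {}
    for record in records:
        record = dict(record)  # shallow copy: pop() below must not alter the caller's data
        node = tree
        for category in categories[:-1]:
            node = node.setdefault(record.pop(category), {})
        node.setdefault(record.pop(categories[-1]), []).append(record)

    def transform(node):
        # dicts are category levels; lists are the leaf records and are returned as-is
        if isinstance(node, dict):
            return [
                {'name': name, 'children': transform(child)}
                for name, child in node.items()
            ]
        return node

    return transform(tree)


sample = [
    {'id': 1, 'name': 'bob', 'type': 'blue', 'size': 2},
    {'id': 4, 'name': 'bob', 'type': 'red', 'size': 2},
    {'id': 8, 'name': 'jack', 'type': 'blue', 'size': 5},
]
by_name_then_type = nest_records(sample, ['name', 'type'])
by_type_only = nest_records(sample, ['type'])
```

Passing a different list, such as ['type'] alone or ['type', 'name'], changes the depth and order of the hierarchy without touching the function body.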

Demo: https://replit.com/@blhsing/NotableCourageousTranslations
