Home > Software design >  Traverse dictionary, find its roots and put the childs into the right place (branches)
Traverse dictionary, find its roots and put the childs into the right place (branches)

Time:12-14

could anyone help me with this problem I'm currently facing?

I'll show first my take and what I've tried to do and then explain what I didn't accomplish to:

def remove_parent_child_keys(dict_to_remove_keys_from):
    for key, value in dict_to_remove_keys_from.copy().items():
        if isinstance(value, dict):
            remove_parent_child_keys(value)
        if key == "parent" or key == "child":
            del dict_to_remove_keys_from[key]
    return dict_to_remove_keys_from
def get_children(tree, node):
    result = {}
    for child in tree.get(node, []):
        child_node = {child["child"]: {"children": get_children(tree, child["child"])}}
        child_copy=remove_parent_child_keys(child)

        #add child_copy as a new value, not as a key
        #get the key of child_node and add the child_copy as a value
        key_of_child_node=list(child_node.keys())[0]
        child_node[key_of_child_node].update(child_copy)
        result.update(child_node)
    return result

if __name__=="__main__":
    list_of_dicts = [{"child": "1", "parent": "2", "new_data": 3, "branch_id":"10"}, {"child": "2", "parent": "5", "test": 3, "branch_id":"10"},
                     {"child": "3", "parent": "2", "branch_id":"10" }, {"child": "4", "parent": "2", "branch_id":"10" },
                     {"child": "5", "parent": None, "branch_id":"10" },
                     {"child": "6", "parent": "3","branch_id":"10" }]

    dicts = {}

    for element in list_of_dicts:
        if element["parent"] is None:
            dicts.setdefault("root", []).append(element)
        else:
            dicts.setdefault(element["parent"], []).append(element)
    print(dicts)
    resulting_dict=get_children(dicts, 'root')
    print(resulting_dict)

so here basically in the main I do have a list of dicts that are child and a few ones that are parents (say for example 2 is a parent of 1,3,4 - 3 is parent of 6) and a master root, which is 5 that hasn't got any parents.

it produces this output once it finishes iterating:

{'2': [{'child': '1', 'parent': '2', 'new_data': 3, 'branch_id': '10'}, {'child': '3', 'parent': '2', 'branch_id': '10'}, {'child': '4', 'parent': '2', 'branch_id': '10'}], '5': [{'child': '2', 'parent': '5', 'test': 3, 'branch_id': '10'}], 'root': [{'child': '5', 'parent': None, 'branch_id': '10'}], '3': [{'child': '6', 'parent': '3', 'branch_id': '10'}]}

and and if we expand root there's the 5.

enter image description here

then I iterate I call get_children function which iterates recursively through all the dicts to create a correct nested dictionary where 5 is the master and has got its children into it (as well as for 2 which is going to have 1,3,4 and so on). So it produces a structure as:

{
   "5":{
      "children":{
         "2":{
            "children":{
               "1":{
                  "children":{
                     
                  },
                  "new_data":3,
                  "branch_id":"10"
               },
               "3":{
                  "children":{
                     "6":{
                        "children":{
                           
                        },
                        "branch_id":"10"
                     }
                  },
                  "branch_id":"10"
               },
               "4":{
                  "children":{
                     
                  },
                  "branch_id":"10"
               }
            },
            "test":3,
            "branch_id":"10"
         }
      },
      "branch_id":"10"
   }
}

where each leaf has got the parameter children and the nodes get added into that list if they do exist.

The problem I'm facing now tho is the fact that I might as well have a branch id. So basically the node id (child id) isn't UNIQUE and I could have multiple children with different branch ids. So basically:

[{
      "child":"5",
      "parent":"None",
      "branch_id":"11"
   },
   {
      "child":"2",
      "parent":"5",
      "test":3,
      "branch_id":"11"
   },
{
      "child":"5",
      "parent":"None",
      "branch_id":"10"
   },
   {
      "child":"2",
      "parent":"5",
      "test":3,
      "branch_id":"10"
   }
]

So, let's make an example for that: our list of dicts becomes:

[
   {
      "child":"5",
      "parent":"None",
      "branch_id":"11"
   },
   {
      "child":"2",
      "parent":"5",
      "test":3,
      "branch_id":"11"
   },
   {
      "child":"1",
      "parent":"2",
      "new_data":3,
      "branch_id":"10"
   },
   {
      "child":"2",
      "parent":"5",
      "test":3,
      "branch_id":"10"
   },
   {
      "child":"3",
      "parent":"2",
      "branch_id":"10"
   },
   {
      "child":"4",
      "parent":"2",
      "branch_id":"10"
   },
   {
      "child":"5",
      "parent":"None",
      "branch_id":"10"
   },
   {
      "child":"6",
      "parent":"3",
      "branch_id":"10"
   }
]

This makes my code unusable and I've been trying to get my head around how I could overcome this issue in the first iteration (where I decide which one is root and which one isn't) because obviously now we are going to have multiple roots, so it's going to be a list now but we also have to check for the branch_id before assigning any children to its tree.

So a quick recap:

  • can one "child" have two different parents? No, it cannot, but we can have nested parents.
  • is it directed or can it go in any directions? in other words: can there be two different items with the same child? Well, no, the child could have the same id, but the thing it's going to differentiate each other is the branch.
  • why is there "new_data" or "test" parameters? what are they? Basically just a way to show that we have to deal with dicts that could include also other info other than, child, parent and branch_id.

(I'm sorry if in my code I'm mis-using the word TREE but I really don't know how to call this structure).

Could you show me a possible solution to the problem I'm facing? If you need any more info I can for sure give it, I'm been fiddling my head around this issue for 2 days straight and I couldn't figure it out by myself... so I know how hard it can be to understand what I'm facing.

Thank you for the help in advance!

EDIT: I think the best way for the output of the script (graph I guess from @slouchart appreciated answer) would be:

[
  {
    "node": "5",
    "branch": "10",
    "children": [
      {
        "node": "2",
        "branch": "10",
        "children": [
          {
            "node": "1",
            "branch": "10",
            "children": []
          },
          {
            "node": "3",
            "branch": "10",
            "children": [
              {
                "node": "6",
                "branch": "10",
                "children": []
              }
            ]
          },
          {
            "node": "4",
            "branch": "10",
            "children": []
          }
        ]
      }
    ]
  },
  {
    "node": "5",
    "branch": "11",
    "children": [
      {
        "node": "2",
        "branch": "11",
        "children": []
      }

    ]
  }
]

So basically it really could be a forest (what I'm trying to represent).enter image description here

CodePudding user response:

Well, that's a graph to me. With the additional constraint that outbound edges are 0-1. The branch_id may as well be seen as a clique's id (see https://en.wikipedia.org/wiki/Clique_(graph_theory)), that is a strict subgraph of adjacent nodes.

I don't quite well understand the issue you're facing but a couple of good advice would be:

  1. write test cases
  2. add a level of abstraction in front of Python's built-in dict using a class

For instance, you can wind up a class with the following methods:

class WeirdGraph:
   def add_root_node(node_id, branch_id):
      ...
   def add_child_node(node_id, parent_id, branch_id):
      ...

This way, you'll able to verify all the invariants of your graph when adding nodes instead of adding nodes without checking and being in need to prune/reorganize the entire graph afterwards (which can fall in some messy complexity class)

Hope this helps.

  • Related