Home > Back-end >  python recursion for child parent grandparent etc hierarchy
python recursion for child parent grandparent etc hierarchy

Time:11-30

I am trying to create the full hierarchy chain for every child. I have two dicts, the first contains all child-parent pairs to be used as lookup, and the second contains all children as keys, and their values starts off as a list containing the immediate parent where then the grand parents, great garnd parents etc will be appended to. For simplicity sake, a child cannot have more than one parent, but a parent can have multiple children.

c_p = { "A":"C","B":"C","C":"F","D":"E","E":"F","F":""}
hierarchy = {
    "A": ["C"],
    "B": ["C"],
    "C": ["F"],
    "D": ["E"],
    "E": ["F"],
    "F": [""]
}
expected_result = {
    "A": ["C", "F"],
    "B": ["C", "F"],
    "C": ["F"],
    "D": ["E", "F"],
    "E": ["F"],
    "F": [""],
}

Below is the function I have thus far, but it is returning 'str' object has no attribute 'copy'. I know this is because the function is expecting a dict but it is passing through a string but I am not clear on how to structure this function.

def hierarchy_gen(data):
    for k, v in data.copy().items():
        last_parent = v[-1]
        if last_parent not in ['F','',None]: 
            v  = hierarchy_gen(c_p[last_parent])
        else:
            v  = ''
            continue
    return data
test = hierarchy_gen(hierarchy)

CodePudding user response:

Some terminology. We're doing a transitive closure over a forest (a data structure containing the roots of many trees). By denoting the structure as a forest, I'm assuming there are no cycles in the graph.

def transitive_closure(forest):
    def recurse(root, path):
        if root in forest and forest[root]:
            for x in forest[root]:
                path.append(root)
                recurse(x, path)
                path.pop()
        elif path:
            result[path[0]] = path[1:]

    result = {}

    for root in forest:
        recurse(root, [])

    return result


if __name__ == "__main__":
    hierarchy = {
        "A": ["C"],
        "B": ["C"],
        "C": ["F"],
        "D": ["E"],
        "E": ["F"],
        "F": [""]
    }
    print(transitive_closure(hierarchy))

Now, there's some inconsistency in the expected output. Either "" counts as a node value or it doesn't. If we count it, you can use

result[path[0]] = path[1:]   [root]

but then you'll see "" at the end of every closure path because "F": [""] is the terminal point of everything in this particular forest. Or maybe you meant to use "F": [] to indicate that "F" has no children, in which case you'll still want to append the current node to the path on the base case. You could also handle this as a special value, but that strikes me as a poor design.

Additionally, c_p seems totally pointless since hierarchy contains the exact same information, with lists as values. I assume it's possible that some elements have more than a single child, otherwise it's a pretty dull data structure. As it stands, the example is basically a linked list, a degenerate tree.

  • Related