I am trying to create the full hierarchy chain for every child. I have two dicts, the first contains all child-parent pairs to be used as lookup, and the second contains all children as keys, and their values starts off as a list containing the immediate parent where then the grand parents, great garnd parents etc will be appended to. For simplicity sake, a child cannot have more than one parent, but a parent can have multiple children.
c_p = { "A":"C","B":"C","C":"F","D":"E","E":"F","F":""}
hierarchy = {
"A": ["C"],
"B": ["C"],
"C": ["F"],
"D": ["E"],
"E": ["F"],
"F": [""]
}
expected_result = {
"A": ["C", "F"],
"B": ["C", "F"],
"C": ["F"],
"D": ["E", "F"],
"E": ["F"],
"F": [""],
}
Below is the function I have thus far, but it is returning 'str' object has no attribute 'copy'. I know this is because the function is expecting a dict but it is passing through a string but I am not clear on how to structure this function.
def hierarchy_gen(data):
for k, v in data.copy().items():
last_parent = v[-1]
if last_parent not in ['F','',None]:
v = hierarchy_gen(c_p[last_parent])
else:
v = ''
continue
return data
test = hierarchy_gen(hierarchy)
CodePudding user response:
Some terminology. We're doing a transitive closure over a forest (a data structure containing the roots of many trees). By denoting the structure as a forest, I'm assuming there are no cycles in the graph.
def transitive_closure(forest):
def recurse(root, path):
if root in forest and forest[root]:
for x in forest[root]:
path.append(root)
recurse(x, path)
path.pop()
elif path:
result[path[0]] = path[1:]
result = {}
for root in forest:
recurse(root, [])
return result
if __name__ == "__main__":
hierarchy = {
"A": ["C"],
"B": ["C"],
"C": ["F"],
"D": ["E"],
"E": ["F"],
"F": [""]
}
print(transitive_closure(hierarchy))
Now, there's some inconsistency in the expected output. Either ""
counts as a node value or it doesn't. If we count it, you can use
result[path[0]] = path[1:] [root]
but then you'll see ""
at the end of every closure path because "F": [""]
is the terminal point of everything in this particular forest. Or maybe you meant to use "F": []
to indicate that "F"
has no children, in which case you'll still want to append the current node to the path on the base case. You could also handle this as a special value, but that strikes me as a poor design.
Additionally, c_p
seems totally pointless since hierarchy
contains the exact same information, with lists as values. I assume it's possible that some elements have more than a single child, otherwise it's a pretty dull data structure. As it stands, the example is basically a linked list, a degenerate tree.