Top level length of nested dictionaries of lists with one function-CodePudding

I have a nested dictionary containing lists and need to get the total sum of the lengths of all the lists in each top-level value in the dictionary. An example list is the following:

data = {0: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        1: {0: [11, 12],
            1: {0: [13, 14],
                1: [15, 16],
                 2: [17, 18, 19, 20],
                 3: {0: [43, 34], 1: [4], 2: [2, 3, 4, 5]}}}}

I believe I have a working solution using the following two functions:

def TreeSum(tree):
    return [len(v) if isinstance(v, list) else SubTreeSum(v) for v in tree.values()]

def SubTreeSum(tree):
    return sum([len(v) if isinstance(v, list) else SubTreeSum(v) for v in tree.values()])

The output of TreeSum(data) is [10, 17] and from TreeSum(data[1][1]) it is [2, 2, 4, 7].

I have tried to combine the functions into a single function, mainly for aesthetical reasons, but I have not figured out a solution. Can anybody suggest a solution to that problem?

CodePudding user response：

All you need is this:

def TreeSum(tree):
    return [len(v) if isinstance(v, list) else sum(TreeSum(v)) for v in tree.values()]

CodePudding user response：

From what I gather you have a recursive function that needs to operate slightly differently at the top level. For that one approach that is arguably cleaner is to use a parameter flag that indicates if you're at the top level like this:

def TreeSum(tree, root=True):
    a = [len(v) if isinstance(v, list) else TreeSum(v, False) for v in tree.values()]
    return a if root else sum(a)

CodePudding user response：

You can differentiate the case where you run the top-level function and the recursive calls:

def TreeSum(tree, agg=False):
    l =  [len(v) if isinstance(v, list) else TreeSum(v, True) for v in tree.values()]
    return sum(l) if agg else l

output:

>>> TreeSum(data)
[10, 17]

CodePudding user response：

A simple solution without passing behavior-changing flags is to just make SubTreeSum and nested function of TreeSum. Voila, single function:

def TreeSum(tree):
    def SubTreeSum(tree):
        return sum([len(v) if isinstance(v, list) else SubTreeSum(v) for v in tree.values()])

    return [len(v) if isinstance(v, list) else SubTreeSum(v) for v in tree.values()]

Maybe prettier would be to strip out the common functionality into another nested function to avoid code duplication:

def TreeSum(tree):
    def SubTreeList(tree):
        return [len(v) if isinstance(v, list) else SubTreeSum(v) for v in tree.values()]

    def SubTreeSum(tree):
        return sum(SubTreeList(tree))

    return SubTreeList(tree)

CodePudding user response：

You can do it in a single recursive function by using the dictionary keys or list values as the iteration driver and counting 1 (for list items) or the recursive sum of the length list (for dictionaries):

def topCounts(C):
    return [1 if type(C) is list else sum(topCounts(C[k])) for k in C]

Output:

data = {0: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
        1: {0: [11, 12],
            1: {0: [13, 14],
                1: [15, 16],
                 2: [17, 18, 19, 20],
                 3: {0: [43, 34], 1: [4], 2: [2, 3, 4, 5]}}}}
    
print(topCounts(data))
print(topCounts(data[1][1]))

[10, 17]
[2, 2, 4, 7]