Using recursion, count the number of identical elements in the list and collect this in a dictionary-CodePudding

Please help to realise the following function:

Given a list of strings and lists, which may also contain strings and lists etc. Your job is to collect these strings into a dict, where key would be the string and value the amount of occurrences of that string in these lists.

def count_strings(data: list, pos=None, result: dict = None) -> dict:
    """

    :param data: given list of lists
    :param pos: figure out how to use it
    :param result: figure out how to use it
    :return: dict of given symbols and their count
    """

My attempts:

if result is None and pos is None:
    result = {}
    pos = 0
if pos > len(data):
    return result
if isinstance(data[pos], str):
    return count_strings(data, pos   1, result)
elif isinstance(data, list):
    return count_strings(data, 0, result)

The output should be something like this:

    print(count_strings([[], ["J", "*", "W", "f"], ["j", "g", "*"], ["j", "8", "5", "6", "*"], ["*", "*", "A", "8"]]))
    # {'J': 1, '*': 5, 'W': 1, 'f': 1, 'j': 2, 'g': 1, '8': 2, '5': 1, '6': 1, 'A': 1}
    print(count_strings([[], [], [], [], ["h", "h", "m"], [], ["m", "m", "M", "m"]]))  # {'h': 2, 'm': 4, 'M': 1}
    print(count_strings([]))  # {}
    print(count_strings([['a'], 'b', ['a', ['b']]]))  # {'a': 2, 'b': 2}

CodePudding user response：

The template you got does not promote best practice (see Correct Style for Python functions that mutate the argument), but ignoring that, your code has these issues:

if pos > len(data): this misses the case where these two are equal, in which case you should also enter that if block
Your code does not update the dictionary with actual count(s). In particular this should happen when the inspected value is a string
The test for the list data type is on the wrong value: you'll want to test that data[pos] is a list, not data.
When the value is a list, then you should make the recursive with that sublist, so with data[pos]
When the value is a list, you should still process the rest of the main list.

Here is a correction:

def count_strings(data: list, pos=None, result: dict = None) -> dict:
    if result is None and pos is None:
        result = {}
        pos = 0
    if pos < len(data):  # pos should not be equal to len when accessing data[pos]:
        if isinstance(data[pos], str):
            result[data[pos]] = result.get(data[pos], 0)   1  # increment count
        elif isinstance(data[pos], list):
            count_strings(data[pos], 0, result)  # process nested list
        count_strings(data, pos   1, result)  # process the remaining entries
    return result

Alternative function signature

It would have been better if the function didn't need a result argument that is going to be mutated, nor the pos argument. Although you are not looking for a solution that changes the template you got, I still prefer to include an alternative here.

There are actually two problems to be solved here:

iterate over the nested values in a recursive manner, and
collect string frequencies in a dictionary

The two can be solved in separate functions. Without using libraries, you can do that as follows:

def deepvalues(data):
    if isinstance(data, list):
        for item in data:
            yield from deepvalues(item)  # recursion
    else:
        yield data

def count_strings(data: list) -> dict:
    result = {}
    for value in deepvalues(data):
        result[value] = result.get(value, 0)   1
    return result