Home > Software engineering >  Recursively search a dictionary of lists to find all connections between keys of that dictionary
Recursively search a dictionary of lists to find all connections between keys of that dictionary

Time:11-08

I have a dictionary of lists, whose values are connections to other keys in that same dictionary. I need a summarized list of the connections between all of the keys regardless of how many levels the keys are away from each other.

I can make this work with a bunch of for loops that keep searching back through the same dictionary, but I can't get this to work as a recursive function that combines all of the lists.

input:

   sampleDict = {'1': ['2', '3'],
                 '2': ['1', '4'] ,
                 '3': ['1'],
                 '4': ['2'],
                 '5': ['6'],
                 '6': ['5', '7'],
                 '7': ['6', '8', '9'],
                 '8': ['7', '10'],
                 '9': ['7', '11'],
                 '10': ['8'],
                 '11': ['9'],
                 '12': [],
                 '13': ['14'],
                 '14': ['13']
                 }

Expected output:

outputDict = {1: ['1', '2', '3', '4'],
              2: ['5', '6', '7', '8', '9', '10', '11'],
              3: ['12'],
              4: ['13', '14']
              }

Things I have tried:

If I run this code block enough times, I get the answer I want and then de-duplicate the dictionary. But I would prefer something more dynamic that adjusts to different situations and the number of connections between keys.

for i in sampleDict.keys():
    linked = sampleDict[i]
    
    for j in linked:
        sampleDict[i] = list(set(sampleDict[i]   sampleDict[j]))

If I stick that code above in a recursive function, it runs forever because I cannot find a dynamic exit condition that applies once every connection has been made.

def clusters(dictionary):
    for i in dictionary.keys():
        linked = dictionary[i]
        
        for j in linked:
            dictionary[i] = list(set(dictionary[i]   dictionary[j]))
            
            return clusters(dictionary)

CodePudding user response:

Here's a solution using a depth-first search.

def dfs(v,component,unseen,adj):
    for w in adj[v]:
        if w in unseen:
            component.append(w)
            unseen.remove(w)
            dfs(w,component,unseen,adj)

unseen = set(sampleDict)
components = []
while unseen:
    v = unseen.pop()
    components.append([v])
    dfs(v,components[-1],unseen,sampleDict)

components.sort()
outputDict = dict(enumerate(components))

Resulting outputDict is given by

{0: ['4', '2', '1', '3'],
 1: ['14', '13'],
 2: ['6', '5', '7', '8', '10', '9', '11'],
 3: ['12']}

CodePudding user response:

You can try:

sampleDict = {
    "1": ["2", "3"],
    "2": ["1", "4"],
    "3": ["1"],
    "4": ["2"],
    "5": ["6"],
    "6": ["5", "7"],
    "7": ["6", "8", "9"],
    "8": ["7", "10"],
    "9": ["7", "11"],
    "10": ["8"],
    "11": ["9"],
    "12": [],
    "13": ["14"],
    "14": ["13"],
}

tmp = {}
for k, v in sampleDict.items():
    tmp.setdefault(k, set())
    tmp[k].add(k)
    for vv in v:
        tmp.setdefault(vv, tmp[k])
        tmp[k].add(vv)

print(dict(enumerate({id(v): v for v in tmp.values()}.values(), 1)))

Prints:

{
    1: {"2", "4", "1", "3"},
    2: {"6", "7", "10", "8", "5", "9", "11"},
    3: {"12"},
    4: {"13", "14"},
}

If you want lists in output and not sets, then as a last step do:

print(
    dict(
        enumerate({id(v): sorted(v, key=int) for v in tmp.values()}.values(), 1)
    )
)

Prints:

{
    1: ["1", "2", "3", "4"],
    2: ["5", "6", "7", "8", "9", "10", "11"],
    3: ["12"],
    4: ["13", "14"],
}
  • Related