I have a dictionary with thousands of elements of the following structure:
full_dict={'A': ['B','C','D','E'],
'B':['A','C','D','E','F','X']
'X':['W','Y','Z','S'],
'S':['W','K','T'],
...}
where every letter is a word. Every word can be both a key and a value (together with other words) for another dict element.
I am trying to find a "path" from one word to another. For example a path from 'A' to 'S' is A-B-X-S as S is among values of X, X in B and B in A.
Currently, I am using this code:
query=['A','S']
if 'S' in full_dict['A']:
print ('Found during 1st iteration')
else:
for i in full_dict['A']:
if 'S' in full_dict[i]:
print ('Found during 2nd iteration')
else:
for ii in full_dict[i]:
etc.
I do 10 iterations and it works fine but I wonder if there is a better way to do it. Thank you in advance.
CodePudding user response:
You can use networkx
package:
import networkx as nx
full_dict={'A': ['B','C','D','E'],
'B': ['A','C','D','E','F','X'],
'X': ['W','Y','Z','S'],
'S': ['W','K','T'],
}
g = nx.DiGraph() # directed graph
for k, v in full_dict.items():
g.add_edges_from((k, to) for to in v)
print(*nx.all_simple_paths(g, 'A', 'S')) # ['A', 'B', 'X', 'S']
A (nested) loop wouldn't work in general because it will involve a lot of nests (at best), and at worst, we don't generally know how many nests we would need.
Because of this, one could think of recursion instead. But this could be complicated since you might need to put much effort into things like breaking potential infinite loops or remembering intermediate results (to reduce computations).
So in my opinion, the best approach is a package that readily makes use of the graph structure of your data.
CodePudding user response:
External libs are useful if you're doing this for production and need the high performance, optimizations, all possible paths, etc.
If you're doing this as an exercise or something where you can be sure of clean input (no loops within the word lists) or don't want an external lib for just one operation, then you can try a recursion:
full_dict = {
'A': ['B','C','D','E'],
'B': ['A','C','D','E','F','X'],
'X': ['W','Y','Z','S'],
'S': ['W','K','T'],
}
def find_path(source, target):
for key, words in full_dict.items():
if target in words:
yield target
if key == source: # check if it's source
yield source
else:
# otherwise, look for this key as new target
yield from find_path(source, key)
break # prevent further checking of remaining items
Usage:
list(find_key('A', 'S'))
# ['S', 'X', 'B', 'A']
# get it in the right order, reverse it
list(find_key('A', 'S'))[::-1]
# ['A', 'B', 'X', 'S']
The approach looks for the target word in the word list and then yields each key. Rather than check the word lists possible for every word in the source - since that creates many chains that may or may not get to the target.
More info on yield
expressions.
Note that Python3's recursion limit is 1000, by default. So if the word chain may be longer, either increase the limit or use another method. Setting the limit to 4 and using this modified dict full_dict = {'A': ['B', 'D', 'E'], 'B': ['A', 'D', 'E', 'F', 'X'], 'X': ['W', 'Y', 'Z', 'C'], 'C': ['W', 'K', 'T', 'S'], 'S': []}
, you'd hit the limit for A to S which is now a chain of 5 (['A', 'B', 'X', 'C', 'S']
).
CodePudding user response:
You could make it a recursive function:
def findLink(path,end,links):
if not isinstance(path,list): path = [path] # start path with initial word
for linked in links.get(path[-1],[]): # go through links of word
if linked == end: return path [end] # reached end, return full path
if linked in path: continue # avoid circular linking
foundPath = findLink(path [linked],end,links) # dig deeper
if foundPath: return foundPath # reached end on that path
output:
full_dict={'A': ['B','C','D','E'],
'B':['A','C','D','E','F','X'],
'X':['W','Y','Z','S'],
'S':['W','K','T'] }
print(findLink('A','S',full_dict))
['A', 'B', 'X', 'S']