Home > database >  Finding if a path between two words in a list exists
Finding if a path between two words in a list exists

Time:02-25

I have a problem that I am trying to solve, which is, I have a list of words:

word_list = ["DATA", "TATA", "PAPA", "PATA", "TOTO", "TITI", "TATI", "TUTO", "DARA", "DORA"]

I want to return True if a path between two words exists. The change from one word to another should be by replacing one char. For instance, the paths between "DATA" and "PAPA" are:

DATA -> TATA -> PATA -> PAPA

DATA -> PATA -> PAPA

The code below prints all the existing paths, where I used the recursion function to do that. How can I change it to return True|False if a path exists or not (inside find_graph_words function)? The recursion function confuses me sometimes when I want to use return.

def count_diff(source, target):
    diff = 0
    for c1, c2 in zip(source, target):
        if c1 != c2:
            diff  = 1
    return diff

def compute_distances(word_list):
    distances = {}
    for word in word_list:
        for target in word_list:
            if word in distances:
               maps = distances[word]
               maps[target] = count_diff(word, target)
               distances[word] = maps
            else:
               distances[word] = {target: count_diff(word, target)}
    return distances

def find_graph_words(source, target, distances, path):
    path  = source   ' -> '
    to_visit = [item for item in word_list if (distances[source][item] == 1) and (item not in path)]
    if target in to_visit:
        path  = target
        print(path)
    for node in to_visit:
        find_graph_words(node, target, distances, path)

if __name__ == '__main__':
    word_list = ["DATA", "TATA", "PAPA", "PATA", "TOTO", "TITI", "TATI", "TUTO", "DARA", "DORA"]
    distances = compute_distances(word_list)
    find_graph_words("DATA", "PAPA", distances, '')

CodePudding user response:

In your existing code, item not in path could give wrong outcomes when shorter strings are found in longer strings (if that is possible to have in your graph). It would be better to use a list or set structure for your path.

For testing the existence only of a path, you can exit the loop as soon as you get a hit from recursion:

def connected(source, target, distances, path):
    to_visit = [item for item in word_list if distances[source][item] == 1 and item not in path]
    if target in to_visit:
        return True
    for node in to_visit:
        if connected(node, target, distances, path   [node]):
            return True
    return False

This can be reduced to:

def connected(source, target, distances, path):
    to_visit = [item for item in word_list if distances[source][item] == 1 and item not in path]
    return target in to_visit or any(
        connected(node, target, distances, path   [node])
        for node in to_visit
    )

word_list = ["DATA", "TATA", "PAPA", "PATA", "TOTO", "TITI", "TATI", "TUTO", "DARA", "DORA"]
distances = compute_distances(word_list)
print(connected("DATA", "PAPA", distances, []))

You can improve the efficiency of determining to_visit, by creating an adjacency list first, instead of a distances matrix, where the adjacency list will only have the edges where the distance is 1.

  • Related