Home > Software design >  Python combinations of elements in dict
Python combinations of elements in dict

Time:05-27

This is a tough one. I have a bunch of dicts like the one below (some can be quite large):

V = {
    0: [823, 832, 1151, 1752, 2548, 3036],
    823: [832, 1151, 1752, 2548, 3036, 3551],
    832: [1151, 1752, 2548, 3036, 3551],
    1151: [1752, 2548, 3036, 3551],
    1752: [2548, 3036, 3551, 4622],
    2548: [3036, 3551, 4622],
    3036: [3551, 4622, 5936, 6440],
    3551: [4622, 5936, 6440],
    4622: [5936, 6440, 9001],
    5936: [6440, 9001],
    6440: [9001],
    9001: []
}

The dict represents basic rules to help derive all possible paths (they are bike routes). A path is a sequence of the above ints.

Each value in the list of dict values is also a key.

How do I determine all combinations knowing that for example:

[3036, 4622, 9001] is a valid combination,

But [3036, 9001] is not, and the reason, is that 3036 must be followed by one of the elements in V[3036]. And every combination must contain a compatible sequence, and every sequence must end with 9001, which says, that to get to 9001, one must go via 6440, or 5936 or 4622.

Every sequence must also start with one the points in V[0].

Two things I tried:

  1. I first used itertools.product to derive all combos and then filter the invalid ones, but for most dicts, the number of itertools.product combinations is just too large.
  2. Monte Carlo sims but the number of loops is in the millions with no guarantee to capture all combinations.

CodePudding user response:

Seems like a simple DFS. Since the graph appears to be directed (each node has successors whose numbers are greater than that of the node), you don't even need to be careful to avoid cycles.

>>> def dfs(graph, start, end):
...     if start == end:
...         return [[end]]
...     return [[start]   result for s in graph[start] for result in dfs(graph, s, end)]
...
>>> dfs(V, 0, 9001)
[[0, 823, 832, 1151, 1752, 2548, 3036, 3551, 4622, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 3551, 4622, 5936, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 3551, 4622, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 3551, 4622, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 3551, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 3551, 5936, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 3551, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 4622, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 4622, 5936, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 4622, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 4622, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 5936, 9001], [0, 823, 832, 1151, 1752, 2548, 3036, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 3551, 4622, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 3551, 4622, 5936, 9001], [0, 823, 832, 1151, 1752, 2548, 3551, 4622, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 3551, 4622, 9001], [0, 823, 832, 1151, 1752, 2548, 3551, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 3551, 5936, 9001], [0, 823, 832, 1151, 1752, 2548, 3551, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 4622, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 4622, 5936, 9001], [0, 823, 832, 1151, 1752, 2548, 4622, 6440, 9001], [0, 823, 832, 1151, 1752, 2548, 4622, 9001], [0, 823, 832, 1151, 1752, 3036, 3551, 4622, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 3036, 3551, 4622, 5936, 9001], [0, 823, 832, 1151, 1752, 3036, 3551, 4622, 6440, 9001], [0, 823, 832, 1151, 1752, 3036, 3551, 4622, 9001], [0, 823, 832, 1151, 1752, 3036, 3551, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 3036, 3551, 5936, 9001], [0, 823, 832, 1151, 1752, 3036, 3551, 6440, 9001], [0, 823, 832, 1151, 1752, 3036, 4622, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 3036, 4622, 5936, 9001], [0, 823, 832, 1151, 1752, 3036, 4622, 6440, 9001], [0, 823, 832, 1151, 1752, 3036, 4622, 9001], [0, 823, 832, 1151, 1752, 3036, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 3036, 5936, 9001], [0, 823, 832, 1151, 1752, 3036, 6440, 9001], [0, 823, 832, 1151, 1752, 3551, 4622, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 3551, 4622, 5936, 9001], [0, 823, 832, 1151, 1752, 3551, 4622, 6440, 9001], [0, 823, 832, 1151, 1752, 3551, 4622, 9001], [0, 823, 832, 1151, 1752, 3551, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 3551, 5936, 9001], [0, 823, 832, 1151, 1752, 3551, 6440, 9001], [0, 823, 832, 1151, 1752, 4622, 5936, 6440, 9001], [0, 823, 832, 1151, 1752, 4622, 5936, 9001], [0, 823, 832, 1151, 1752, 4622, 6440, 9001], [0, 823, 832, 1151, 1752, 4622, 9001], [0, 823, 832, 1151, 2548, 3036, 3551, 4622, 5936, 6440, 9001], [0, 823, 832, 1151, 2548, 3036, 3551, 4622, 5936, 9001], [0, 823, 832, 1151, 2548, 3036, 3551, 4622, 6440, 9001], [0, 823, 832, 1151, 2548, 3036, 3551, 4622, 9001], [0, 823, 832, 1151, 2548, 3036, 3551, 5936, 6440, 9001], [0, 823, 832, 1151, 2548, 3036, 3551, 5936, 9001], [0, 823, 832, 1151, 2548, 3036, 3551, 6440, 9001], [0, 823, 832, 1151, 2548, 3036, 4622, 5936, 6440, 9001], [0, 823, 832, 1151, 2548, 3036, 4622, 5936, 9001], [0, 823, 832, 1151, 2548, 3036, 4622, 6440, 9001], [0, 823, 832, 1151, 2548, 3036, 4622, 9001], [0, 823, 832, 1151, 2548, 3036, 5936, 6440, 9001], [0, 823, 832, 1151, 2548, 3036, 5936, 9001], [0, 823, 832, 1151, 2548, 3036, 6440, 9001], [0, 823, 832, 1151, 2548, 3551, 4622, 5936, 6440, 9001], [0, 823, 832, 1151, 2548, 3551, 4622, 5936, 9001], ...]

If the above function spins forever on one of your dicts, then it's time to revise the assumption about the graph being directed.

CodePudding user response:

You can treat the dictionary as an adjacency list. You could use vanilla Python (as in Samwise's answer), but their answer won't work if the graph has cycles. networkx exposes a method for finding the desired paths, so we can use that:

import networkx as nx

graph = nx.DiGraph(V)
for path in nx.all_simple_paths(graph, 0, 9001):
    print(path)

The first and last three lines of the output:

[0, 823, 832, 1151, 1752, 2548, 3036, 3551, 4622, 5936, 6440, 9001]
[0, 823, 832, 1151, 1752, 2548, 3036, 3551, 4622, 5936, 9001]
[0, 823, 832, 1151, 1752, 2548, 3036, 3551, 4622, 6440, 9001]
... [755 more lines]
[0, 3036, 5936, 6440, 9001]
[0, 3036, 5936, 9001]
[0, 3036, 6440, 9001]
  • Related