An algorithm to solve sets of relations-CodePudding

I am writing a program to process various equations. I have a list of equations, and a function that can solve a given equation for an unknown value.

Eg: a b c = 0 -> can be solved for any variable, given the other 2.

I now need to write an algorithm that takes the list of relations, a list of known values and a target, then finds a series of steps that must be taken to find the unknown.

Eg: (a = b, b = c) a = 1, find c:
Result would show that you first convert a -> b with "a = b", then b -> c with "b = c"

Help with the algorithm or search terms I could research would be greatly appreciated

CodePudding user response：

Assume we have:

a list of equations eqs;
an initial list of variables xs;
a function f(eq, xs) that, given an equation eq and a list of variable names xs, returns the list ys of variables whose value can be found using equation eq if we know the values of variables in xs;
a target variable target.

Then your problem is somewhat of a "shortest path" problem, trying to find a path from an initial list of variables, to a list that contains the target variable.

We can perform a breadth-first-search, trying to expand the list of known variables until it contains the target.

eqs = '''3 * a   2 * b   c == 0
a   b = 12
c   4 * b = 4'''.split('\n')

def f(eq, xs, all_variables='abc'):
    ys = set(eq).intersection(all_variables).difference(xs)
    return ys.pop() if len(ys) == 1 else None

target = 'c'

xs = ['a']

def find_steps(eqs, xs, f, target):
    if target in xs:
        return (('', xs),)
    paths = [(('', frozenset(xs)),)]
    for _ in range(len(eqs)):
        new_paths = []
        for path in paths:
            _,xs = path[-1]
            for eq in eqs:
                y = f(eq, xs)
                if y is not None:
                    new_path = path   ((eq, xs.union({y})),)
                    if y == target:
                        return new_path
                    new_paths.append(new_path)
        paths = new_paths
    return None

print( find_steps(eqs, xs, f, target) )
# (('', {'a'}),
#  ('a   b = 12', {'b', 'a'}),
#  ('3 * a   2 * b   c == 0', {'b', 'c', 'a'}))

A few explanations. The outer loop for _ in range(len(eqs)): is conceptually not a for-loop. It's a loop that runs until either the target is found (in which case the loop is broken by return new_path) or until the loop has run for too long without finding a target: the longest path cannot be longer than the number of equations. Or can it? If it can, the stopping condition needs to be modified.

We implement a breadth-first-search. At iteration i of the outer loop, paths is the list of all paths of length i. For each path of length i, we try to extend it into paths of length i 1 in all ways that we can, and add these new paths to new_paths.

A path is a list of steps. A step is a pair (equation, known_variables). The initial step is always ('', initial_known_variables). We only extend a path if the new step contains a new known variable.

CodePudding user response：

You build a dependency graph. Then you fill in given variables, compute what depends on the, compute what depends on what you've just computed, etc.

This of course doesn't always work, and you may come to a dead end, immediately or after some such steps. The simplest example is a system of equations:

a   b   c   3 = 0
a - b   c - 5 = 0

and you are given c = 42, what's a? Or worse, what's some other variable that depends on a (and may be a part of some other system of equations)? You can perform Gaussian elimination here to compute a, but what if you have a non-linear system, or equations you cannot analyze? You need to send a bunch of equations to a solver then.

So now you have not variables that depend on other variables, but sets of variables that depend on other sets. In the example above, {a, b} depends on c (and {a, c} depends on b, and {b, c} depends on a).

One could be tempted to run some kind of shortest-path algorithm on a graph of sets of variables. However, in a system of 50 equations with 50 variables the number of sets that depend on other sets is beyond our abilities to compute. And yet an entire system can be feasible to solve all at once numerically.

So there is a generic method of solving this. Just feed it all to the solver as one big system of equations. You get back a bunch of variable assignments, without any clear sequence of solution steps (unless you consider iterations of Newton-Raphson or whatever "solution steps"). And then there are specific ways to solve specific restricted kinds of systems that give you back clear solution steps (linear systems, systems of single-variable equations, and their combinations). And of course you can have some clearly defined solution steps, then solve a big set of equations all at once because you have no other way to solve them, then have some more clear steps, then solve a different system of equations, etc etc.

So you can try to walk the dependency graph and produce solution steps, but there is always a chance you will get stuck at some big system that you can only solve all at once. Worse, there may be different ways that lead to a solution, all through solving different systems of equations, and of course some systems do not behave nicely (have unstable solutions, or solutions that are hard to guess a decent approximation to, or whatever).

Might as well save all this trouble, throw everything you have to the numeric solver, and let it sort out the mess.

(By equation I mean any object that looks like 'f(x, y, ...) = 0', where f is some function. Sometimes you get to know what the function is, i.e. it is given to you symbolically, and sometimes you don't, i.e. it is computed by an external algorithm opaque to you).