Home > Software engineering >  Left associative binary tree fold
Left associative binary tree fold

Time:11-27

With the normal right associative tree fold I can define pre-/in-/post-order just by rearranging the concatenation in the supplied function f:

const treeFoldr = f => acc => function go(t) {
  if (t[TAG] === "Leaf") return acc;
  else return f(t.v) (go(t.l)) (go(t.r));
};

const TAG = Symbol.toStringTag;

const N = (l, v, r) => ({[TAG]: "Node", l, v, r});
const L = {[TAG]: "Leaf"};

const foo = N(
  N(
    N(L, 4, L),
    1,
    N(L, 5, L)
  ),
  0,
  N(
    L,
    2,
    N(L, 3, L)
  )
);

const r1 = treeFoldr(
  x => acc1 => acc2 => {return [x].concat(acc1).concat(acc2)}) ([]) (foo); // pre-order

const r2 = treeFoldr(
  x => acc1 => acc2 => {return acc1.concat([x]).concat(acc2)}) ([]) (foo); // in-order
  
const r3 = treeFoldr(
  x => acc1 => acc2 => {return acc1.concat(acc2).concat([x])}) ([]) (foo); // post-order

console.log(r2); // in-order [4,1,5,0,2,3]
<iframe name="sif1" sandbox="allow-forms allow-modals allow-scripts" frameborder="0"></iframe>

Now I guess the same should be possible with a left associative fold, where f's result is passed to the next recursive go invocation. But all I could come up with is this hard-coded pre-order fold:

treeFoldl = f => acc => function go(t) {
  if (t[TAG] === "Leaf") return acc;
  
  else {
    acc = f(acc) (t.v);
    acc = go(t.l);
    return go(t.r);
  }
};

In order to get the desired design I would somehow have to incorporate two accumulators (because of f's signature) and the recursive calls of the left/right node, but I don't have a clue how.

It's probably quite easy, but I just don't see the wood for the trees..

[EDIT]

As requested in the comments here is a pure version of treeFoldl:

const treeFoldl = f => init => t => function go(acc, u) {
  if (u[TAG] === "Leaf") return acc;
  
  else {
    const acc_ = go(f(acc) (u.v), u.l);
    return go(acc_,   u.r);
  }
} (init, t);

CodePudding user response:

With the normal right associative tree fold I can define pre-/in-/post-order just by rearranging the concatenation in the supplied function f

What you have defined there is not a foldR function. It is actually a catamorphism for your tree structure (see also this answer), and has exactly this advantage that you can implement anything with it. You can implement fmap, constructing a new tree of the same shape. Or you can implement the linear left fold and right fold:

// cata :: ((b, a, b) -> b), () -> b) -> Tree a -> b
const cata = (onNode, onLeaf) => tree => {
  if (t[TAG] === "Leaf")
    return onLeaf();
  else
    return onNode(
      cata(onNode, onLeaf)(t.l),
      t.v,
      cata(onNode, onLeaf)(t.r)
    );
};

// foldR :: ((a, b) -> b) -> b -> Tree a -> b
const foldR = f => init => t => cata(
  (l, v, r) => acc => l(f(v, r(acc)),
  () => acc => acc
)(t)(init);

// foldL :: ((b, a) -> b) -> b -> Tree a -> b
const foldL = f => init => t => cata(
  (l, v, r) => acc => r(f(l(acc), v),
  () => acc => acc
)(t)(init);

Without the catamorphism, the implementations should look like this:

// foldR :: ((a, b) -> b) -> b -> Tree a -> b
const foldR = f => init => t => {
  if (t[TAG] === "Leaf")
    return init;
  else {
    const acc1 = foldR(f)(init)(t.r);
    const acc2 = f(t.v, acc1);
    return foldR(f)(acc2)(t.l);
  }
};

// foldL :: ((b, a) -> b) -> b -> Tree a -> b
const foldL = f => init => t => {
  if (t[TAG] === "Leaf")
    return init;
  else {
    const acc1 = foldL(f)(init)(t.l);
    const acc2 = f(acc1, t.v);
    return foldL(f)(acc2)(t.r);
  }
};

Neither of these have a pre-order or post-order variant, the information about the tree shape is lost to the reducing function. A linear fold always is in-order, just with different associativity between the left and right variants.

CodePudding user response:

wishful thinking

Given the various user-supplied folding functions -

traversal user-supplied f
preorder l => v => r => [v].concat(l).concat(r)
inorder l => v => r => l.concat([v]).concat(r)
postorder l => v => r => l.concat(r).concat([v])

If we want our wishes to come true, we can deduce that l and r must already be arrays, otherwise Array.prototype.concat is not available.

Given your tree -

    0
   / \
  1   2
 / \   \
4   5   3

It is the nature of these folding functions, f, to flatten a compound node value into a single array value. After the first node, regardless of the f traversal chosen, the output will be [0], which is now the accumulator for the next nodes -

forked accumulator

               f([], 0, [])
             /              \
            /                \
    f([0], 1, [0])     f([0], 2, [0])
    /            \    /              \
   ...           ... ...             ... 

Here we should already be able to see an issue. The accumulator has been flattened and where "left" and "right" are in relation to the zero is lost information. Worse the accumulator has been forked and we will need some merging solution to reconcile this split. How can the merging solution work without enforcing some kind of ordering to ensure its intended result?

sequenced folds

If you say, "no, we will not fork the accumulator," and instead sequence the calls to f, do we fold the left branch first? Do we fold the right branch first?

              f(..., 0, ...)
             /              \
            /                \
    f(..., 1,          f(..., 2, ...))
    /                                \
   ?               ???                ? 

Even if we could manage to make these valid calls to f, selecting the left or right branch first is the ordering.

meta-circular evaluator

So we've hit a dead-end twice now. Okay, let's go back to the beginning and suppose we could change the user-supplied traversals -

traversal user-supplied f
preorder l => v => r => [v, l, r]
inorder l => v => r => [l, v, r]
postorder l => v => r => [l, r, v]

In this arrangement, l and r are no longer constrained to arrays, and instead they could be anything. They could objects, identifiers, or other type of sentinel that treeFoldL recognizes and uses to guide its growing computational process. As you know with enough engineering, we can make it do whatever we want -

    0
   / \
  1   2
 / \   \
4   5   3

For example, consider this inorder traversal -

[ L, 0, R ]
[ ...[L, 1, R], 0, R ]
[ L, 1, R, 0, R ]
[ ...[L, 4, R], 1, R, 0, R ]
[ L, 4, R, 1, R, 0, R ]
[ ...[], 4, R, 1, R, 0, R ]
[ 4, R, 1, R, 0, R ]
[ 4, ...[], 1, R, 0, R ]
[ 4, 1, R, 0, R ]
[ 4, 1, ...[L, 5, R], 0, R ]
[ 4, 1, L, 5, R, 0, R ]
[ 4, 1, ...[], 5, R, 0, R ]
[ 4, 1, 5, ...[], 0, R ]
[ 4, 1, 5, 0, R ]
[ 4, 1, 5, 0, ...[L, 2, R] ]
[ 4, 1, 5, 0, L, 2, R ]
[ 4, 1, 5, 0, ...[], 2, R ]
[ 4, 1, 5, 0, 2, R ]
[ 4, 1, 5, 0, 2, ...[L, 3, R] ]
[ 4, 1, 5, 0, 2, L, 3, R ]
[ 4, 1, 5, 0, 2, ...[], 3, R ]
[ 4, 1, 5, 0, 2, 3, R ]
[ 4, 1, 5, 0, 2, 3, ...[] ]
[ 4, 1, 5, 0, 2, 3 ]

To keep this generic, this requires that an ordering function is supplied separately from the folding function, f, which is now a familiar 2-arity interface -

const treeFoldl = ordering => f => init => t =>
  // ...

const inorder =
  treeFoldL (l => v => r => [l, v, r])

const concat = a => b =>
  a.concat(b)
const toArray =
  inorder (concat) ([])

toArray (myTree) // => [...]
const toString =
  inorder (concat) ("")

toString (myTree) // => "..."

back to planet earth

So you could go through the trouble and write treeFoldl in such a way however it overlooks the most obvious implementation, imo. From where I see it, there is no treeFoldl and treeFoldr, instead there is only preorder, inorder, postorder, levelorder, whateverorder. For purposes of associativity, there is foldl and foldr -

Here's a generic foldl that accepts any iterable, it -

const foldl = f => init => it => {
  let acc = init
  for (const v of it)
    acc = f(acc, v)
  return acc
}

And some possible traversals for your binary tree, here written as generators but you can use whatever implementation you like, stack-safe or otherwise -

function *preorder(t) {
  if (t?.[TAG] == "Leaf") return
  yield t.v
  yield *preorder(t.l)
  yield *preorder(t.r)
}

function *inorder(t) {
  if (t?.[TAG] == "Leaf") return
  yield *inorder(t.l)
  yield t.v
  yield *inorder(t.r)
}

function *postorder(t) {
  if (t?.[TAG] == "Leaf") return
  yield *postorder(t.l)
  yield *postorder(t.r)
  yield t.v
}

function *levelorder(t) {
  // pop quiz: 
  // how would you write levelorder using your original interface?
  // ...
}

Instead of tangling left/right associativity in your traversals, these things are kept separate and remain a choice of the caller. Using them is straightforward -

// left
foldl (concat) ([]) (preorder(t)) // => [...]
foldl (concat) ([]) (inorder(t))  // => [...]
foldl (concat) ([]) (postorder(t)) // => [...]

// or right
foldr (concat) ([]) (preorder(t)) // => [...]
foldr (concat) ([]) (inorder(t))  // => [...]
foldr (concat) ([]) (postorder(t)) // => [...]
// left
foldl (concat) ("") (preorder(t)) // => "..."
foldl (concat) ("") (inorder(t))  // => "..."
foldl (concat) ("") (postorder(t)) // => "..."

// or right
foldr (concat) ("") (preorder(t)) // => "..."
foldr (concat) ("") (inorder(t))  // => "..."
foldr (concat) ("") (postorder(t)) // => "..."

The advantages of separation of concerns are numerous. foldl and foldr are suitably generic to work with any sequenceable data type. And it is the domain of that data type to specify the ways it which it can be sequenced.

  • Related