Home > Software engineering >  What is the intuition behind recursive algorithms with Streams?
What is the intuition behind recursive algorithms with Streams?

Time:06-02

Like the title says what is the intuition behind recursive algos with streams like:

val fibs: LazyList[Int] = (0 #:: fibs).scanLeft(1)(_   _)

and

val fibs: LazyList[Int] = 0 #:: 1 #:: (fibs.zip(fibs.tail).map{ t => t._1   t._2 })

How do they unfold? What is the base case for such algos (if it's Nil, why it's so?) and how do they progress towards fibs.take(5) e.g.?

CodePudding user response:

How do they unfold?

They don't. The #:: function takes a by-name argument, which means that it's evaluated lazily.

What is the base case for such algos (if it's Nil, why it's so?).

There is no "base case", these recursive definitions yield infinite streams:

scala> val fibs: LazyList[Int] = (0 #:: fibs).scanLeft(1)(_   _)
val fibs: LazyList[Int] = LazyList(<not computed>)

scala> fibs.size
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

(Note the "<not computed>" token, which hints at the laziness)

CodePudding user response:

It's say there are 2 things at play here:

  • recursive syntax making use of LazyList API
  • corecursive mathematics behind unfolding

So, let's start with a few words about API and syntax:

  • #:: takes lazy value and prepends it to LazyList definition, here it is fibs which makes its definition recursive on code level
  • LazyList lazily evaluates its arguments and then caches/memoizes them for future use letting us access already computed values immediately

However, the mechanism underneath is actually corecursive.

Let's see what is recursion when it comes to data using List as an example:

List(1,2,3,4)

This can be also written as

1 :: 2 :: 3 :: 4 :: Nil

Which is the same as

( ( ( Nil.::(4) ).::(3) ).::(2) ).::(1)

You can see that we:

  • take Nil
  • create ::(4, Nil) value which we use to
  • create ::(3, ::(4, Nil)) value
  • and so on

In other words, we have to start with some base case and build the whole things from-bottom-up. Such values by definition have to be finite and cannot be used to express series of (possibly) infinite computation.

But there exist an alternative which allows you to express such computations - corecursion and codata.

With corecursion you have a tuple:

  • the last computed value
  • a function which can take the value and return the next tuple (next value next function!)
  • nothing prevent you from using the same function as second element of the tuple but it's good to have a choice

For instance you could define infinite series of LazyList(1, 2, 3, 4, 5, 6, ...) like:

// I use case class since
//   type Pair = (Int, Int => Pair)
// would be illegal in Scala 
final case class Pair(value: Int, f: Int => Pair)
val f: Int => Pair = n => Pair(n   1, f)
Pair(1, f)

Then you would take Pair, get value out of it (1 initially) and use it to generate new Pairs (Pair(2, f), Pair(3, f), ...).

Structure which would use corecursion to generate its values would be called codata (so LazyList can be considered codata).

Same story with Fibonacci sequence, you could define it corecursively with

  • (Int, Int) as value (initialized to (0, 1)
  • val f: (Int, Int) => Pair = { case (n, m) => Pair((m, n m), f } as function
  • finally, you'd have to pick _1 out of every generated (Int, Int) pair

However, LazyList's API gives you some nice tools so that you don't have to do this manually:

  • it memoizes (caches) computed values so you can access list(0), list(1), etc, they aren't forgotten right after use
  • it gives you methods like .map, .flatMap .scanLeft and so on, so while internally it might have more complex types used for corecursion, you are only seeing the final result that you need

Obviously, all of that is done lazily, by codata's definition: at each step you can only know values defined so far, and how to generate next of out it.

That leads us to your example:

val fibs: LazyList[Int] = (0 #:: fibs).scanLeft(1)(_   _)

You can think of it as something that:

  • starts with a pair (0, f)
  • where the f takes this 0 argument, and combines it with 1 to create (0, 1) tuple
  • and then constructs next fs which trace the previous value, and passes it along current value to the function passed into scanLeft
  • where all the shenanigans with intermediate values and functions and memoization are handled internally by API

So if you asked me, the "base case" of such algos is a pair of value and function returning pair, run over and over again.

  • Related