Like the title says, what is the intuition behind recursive algorithms with streams like:
val fibs: LazyList[Int] = (0 #:: fibs).scanLeft(1)(_ + _)
and
val fibs: LazyList[Int] = 0 #:: 1 #:: (fibs.zip(fibs.tail).map{ t => t._1 + t._2 })
How do they unfold? What is the base case for such algos (if it's Nil, why is it so?) and how do they progress towards e.g. fibs.take(5)?
CodePudding user response:
How do they unfold?
They don't. The #:: operator takes a by-name argument, which means that it's evaluated lazily.
What is the base case for such algos (if it's Nil, why is it so?)
There is no "base case"; these recursive definitions yield infinite streams:
scala> val fibs: LazyList[Int] = (0 #:: fibs).scanLeft(1)(_ + _)
val fibs: LazyList[Int] = LazyList(<not computed>)
scala> fibs.size
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
(Note the "<not computed>" token, which hints at the laziness.)
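To see the laziness in action, a minimal sketch (the object name is mine, for illustration): taking a finite prefix forces only that prefix, which is why fibs.take(5) terminates while fibs.size does not.

```scala
// A sketch of the same definition inside an object: nothing is computed
// until a prefix is actually forced.
object LazyDemo {
  // Self-reference works because #:: and scanLeft are lazy on LazyList.
  val fibs: LazyList[Int] = (0 #:: fibs).scanLeft(1)(_ + _)

  // Forces exactly the first five elements; the rest stay "<not computed>".
  val firstFive: List[Int] = fibs.take(5).toList // List(1, 1, 2, 3, 5)
}
```

Note that this particular definition produces 1, 1, 2, 3, 5, ...: scanLeft(1) emits its seed first, then folds the prepended 0 and the already-produced elements into it.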
CodePudding user response:
I'd say there are 2 things at play here:
- recursive syntax making use of the LazyList API
- corecursive mathematics behind the unfolding

So, let's start with a few words about the API and syntax:
- #:: takes a lazy value and prepends it to a LazyList definition; here it is fibs, which makes its definition recursive on the code level
- LazyList lazily evaluates its arguments and then caches/memoizes them for future use, letting us access already computed values immediately
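That deferred evaluation can be checked with a small sketch (the object and variable names are mine): the element expression only runs once the corresponding element is actually forced.

```scala
// #:: takes its element by-name, so the side-effecting block below
// does not run when xs is constructed, only when its element is forced.
object PrependDemo {
  var evaluated = false

  val xs: LazyList[Int] = 1 #:: { evaluated = true; 2 } #:: LazyList.empty

  val before = evaluated // false: nothing forced yet
  val second = xs(1)     // forcing element 1 finally runs the block
  val after  = evaluated // true
}
```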
However, the mechanism underneath is actually corecursive.
Let's see what recursion looks like when it comes to data, using List as an example:
List(1,2,3,4)
This can be also written as
1 :: 2 :: 3 :: 4 :: Nil
Which is the same as
( ( ( Nil.::(4) ).::(3) ).::(2) ).::(1)
You can see that we:
- take Nil
- create the ::(4, Nil) value, which we use to
- create the ::(3, ::(4, Nil)) value
- and so on
In other words, we have to start with some base case and build the whole thing from the bottom up. Such values by definition have to be finite and cannot be used to express a series of (possibly) infinite computations.
But there exists an alternative which allows you to express such computations: corecursion and codata.
With corecursion you have a tuple:
- the last computed value
- a function which can take that value and return the next tuple (next value and next function!)
- nothing prevents you from using the same function as the second element of every tuple, but it's good to have a choice
For instance, you could define the infinite series LazyList(1, 2, 3, 4, 5, 6, ...) like:
// I use case class since
// type Pair = (Int, Int => Pair)
// would be illegal in Scala
final case class Pair(value: Int, f: Int => Pair)
val f: Int => Pair = n => Pair(n + 1, f)
Pair(1, f)
Then you would take a Pair, get the value out of it (1 initially) and use it to generate new Pairs (Pair(2, f), Pair(3, f), ...).
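That hand-unfolding can be made runnable; takeValues is a helper name I made up for this sketch.

```scala
// Unfolding the Pair corecursion by hand: repeatedly apply the stored
// function to the current value to get the next (value, function) tuple.
object PairDemo {
  final case class Pair(value: Int, f: Int => Pair)

  val f: Int => Pair = n => Pair(n + 1, f)

  // Collect the first n values by repeatedly applying the function.
  def takeValues(start: Pair, n: Int): List[Int] =
    if (n == 0) Nil
    else start.value :: takeValues(start.f(start.value), n - 1)

  val firstFive = takeValues(Pair(1, f), 5) // List(1, 2, 3, 4, 5)
}
```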
A structure which uses corecursion to generate its values is called codata (so LazyList can be considered codata).
Same story with the Fibonacci sequence; you could define it corecursively with:
- (Int, Int) as the value (initialized to (0, 1))
- val f: ((Int, Int)) => Pair = { case (n, m) => Pair((m, n + m), f) } as the function
- finally, you'd have to pick _1 out of every generated (Int, Int) pair
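Completing that outline into runnable code (Pair and takeValues are illustrative names; the state is the (Int, Int) pair initialized to (0, 1), and we pick _1 of every state):

```scala
// Fibonacci as corecursion over an (Int, Int) state.
object FibPairDemo {
  final case class Pair(value: (Int, Int), f: ((Int, Int)) => Pair)

  // (n, m) -> (m, n + m): slide the window one step along the sequence.
  val f: ((Int, Int)) => Pair = { case (n, m) => Pair((m, n + m), f) }

  def takeValues(start: Pair, count: Int): List[Int] =
    if (count == 0) Nil
    else start.value._1 :: takeValues(start.f(start.value), count - 1)

  val firstEight = takeValues(Pair((0, 1), f), 8) // List(0, 1, 1, 2, 3, 5, 8, 13)
}
```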
However, LazyList's API gives you some nice tools so that you don't have to do this manually:
- it memoizes (caches) computed values, so you can access list(0), list(1), etc.; they aren't forgotten right after use
- it gives you methods like .map, .flatMap, .scanLeft and so on, so while internally it might use more complex types for corecursion, you only see the final result that you need
Obviously, all of that is done lazily, by codata's definition: at each step you only know the values defined so far, and how to generate the next one out of them.
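The memoization point can be verified with a small sketch (MemoDemo is an illustrative name): a counter shows that each element is computed exactly once, no matter how often you access it.

```scala
// LazyList memoizes: the mapped function runs once per element,
// and repeated access returns the cached value.
object MemoDemo {
  var computations = 0

  val xs: LazyList[Int] = LazyList.from(0).map { n => computations += 1; n * n }

  val a = xs(3)          // forces elements 0..3: four computations
  val b = xs(3)          // cached: no further computation
  val total = computations
}
```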
That leads us to your example:
val fibs: LazyList[Int] = (0 #:: fibs).scanLeft(1)(_ + _)
You can think of it as something that:
- starts with a pair (0, f)
- where f takes this 0 argument and combines it with 1 to create the (0, 1) tuple
- and then constructs the next fs, which carry the previous value along and pass it together with the current value to the function passed into scanLeft
- where all the shenanigans with intermediate values, functions and memoization are handled internally by the API
So if you ask me, the "base case" of such algos is a pair of a value and a function returning the next pair, run over and over again.
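That "pair of a value and a function" is essentially the shape that LazyList.unfold (available since Scala 2.13) expects, so the same idea can be written without defining any helper types; a sketch:

```scala
// Corecursion via the standard library: the state is the (Int, Int)
// pair, and the function returns the emitted element plus the next state.
object UnfoldDemo {
  val fibs: LazyList[Int] =
    LazyList.unfold((0, 1)) { case (n, m) => Some((n, (m, n + m))) }

  val firstFive = fibs.take(5).toList // List(0, 1, 1, 2, 3)
}
```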