Home > Software engineering >  Understanding non-strictness in Haskell with a recursive example
Understanding non-strictness in Haskell with a recursive example

Time:10-16

What is the difference between this two, in terms of evaluation?

Why this "obeys" (how to say?) non-strictness

recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = []
recFilter p (h:tl) = if (p h) 
  then h : recFilter p tl
  else recFilter p tl

while this doesn't?

recFilter :: (a -> Bool) -> [a] -> Int -> [a]
recFilter _ xs 0 = xs
recFilter p (h:tl) len
  | p(h)  = recFilter p (tl    [h]) (len-1)
  | otherwise = recFilter p tl (len-1)

Is it possible to write a tail-recursive function non-strictly?

To be honest I also don't understand the call stack of the first example, because I can't see where that h: goes. Is there a way to see this in ghci?

CodePudding user response:

The non-tail recursive function roughly consumes a portion of the input (the first element) to produce a portion of the output (well, if it's not filtered out at least). Then recursion handles the next portion of the input, and so on.

Your tail recursive function will recurse until len becomes zero, and only at that point it will output the whole result.

Consider this pseudocode:

def rec1(p,xs):
   case xs:
      []     -> []
      (y:ys) -> if p(y): print y
                rec1(p,ys)

and compare it with this accumulator-based variant. I'm not using len since I use a separate accumulator argument, which I assume to be initially empty.

def rec2(p,xs,acc):
   case xs:
      []     -> print acc
      (y:ys) -> if p(y): 
                   rec2(p,ys,acc  [y])
                else:
                   rec2(p,ys,acc)

rec1 prints before recursing: it does not need to inspect the whole input list to start printing its output. It works in a "steraming" fashion, in a sense. Instead, rec2 will only start to print at the very end, after the input list was completely scanned.

In your Haskell code there are no prints, of course, but you can thing of returning x : function call as "printing x", since x is made available to the caller of our function before function call is actually made. (Well, to be pedantic this depends on how the caller will consume the output list, but I'll neglect this.)

Hence the non-tail recursive code can also work on infinite lists. Even on finite inputs, performance is improved: if we call head (rec1 p xs), we only evaluate xs until the first non-discarded element. By contrast head (rec2 p xs) would fully filter the whole list xs, even we don't need that.

CodePudding user response:

The second implementation does not make much sense: a variable named len will not contain the length of the list. You thus need to pass this, for infinite lists, this would not work, since there is no length at all.

You likely want to produce something like:

recFilter :: (a -> Bool) -> [a] -> [a]
recFilter p = go []
    where go ys [] = ys  -- (1)
          go ys (x:xs) | p x = go (ys    [x]) xs
                       | otherwise = go ys xs

where we thus have an accumulator to which we append the items in the list, and then eventually return the accumulator.

The problem with the second approach is that as long as the accumulator is not returned, Haskell will need to keep recursing until at least we reach weak head normal form (WHNF). This means that if we pattern match the result with [] or (_:_), we will need at least have to recurse until case one, since the other cases only produce a new expression, and it will thus not yield a data constructor on which we can pattern match.

This in contrast to the first filter where if we pattern match on [] or (_:_) it is sufficient to stop at the first case (1), or the third case 93) where the expression produces an object with a list data constructor. Only if we require extra elements to pattern match, for example (_:_:_), it will require to evaluate the recFilter p tl in case (2) of the first implementation:

recFilter :: (a -> Bool) -> [a] -> [a]
recFilter _ [] = []  -- (1)
recFilter p (h:tl) = if (p h) 
  then h : recFilter p tl  -- (2)
  else recFilter p tl

For more information, see the Laziness section of the Wikibook on Haskell that describes how laziness works with thunks.

  • Related