Home > Software engineering >  Haskell lazyness, evaluation order and pattern matching
Haskell lazyness, evaluation order and pattern matching

Time:11-28

I start by saying that I'm learning Haskel so don't be too harsh.

Haskell's lazy evaluation may be useful or dangerous depending on whether a computation's bottleneck is the time complexity or the size of the stack.

For this reason I would like to understand better how evaluation works in Haskell. Let me propose an example. Consider this simple function

isBig :: Int -> Bool
isBig = (<=) 5

With this, let us define the function

anyBig :: [Int] -> Bool
anyBig = or . map isBig

Due to lazy evaluation, I would expect that [7,1,2,1] will evaluate to True by calling isBig only once. Let's see the evaluation order that GHC might do to figure this out

anyBig [7,1,2,1]                                    -- start
(or . map isBig) [7,1,2,1]                          -- definition of anyBig
or $ map isBig [7,1,2,1]                            -- definition of composition
foldr (||) False $ isBig 7 : map isBig [1,2,1]      -- or and map evaluated in some order
False || foldr (||) (isBig 7) (map isBig [1,2,1])   -- definition of foldr
foldr (||) (isBig 7) (map isBig [1,2,1])            -- definition of ||

Now at this point there are a few possibilities

  1. GHC is smart enough to figure out that the only thing preventing a pattern matching is the last argument so it does
foldr (||) (isBig 7) (isBig 1 : map isBig [2,1])     -- definition of map
(isBig 7) || foldr (||) (isBig 1) (map isBig [2,1])  -- definition of foldr
  1. GHC evaluates everything at the same level since there are no matches
foldr (||) True (isBig 1 : map isBig [2,1])          -- definition of map
True || foldr (||) (isBig 1) (map isBig [2,1])       -- definition of foldr

Now in case 2., by the definition of || the computation stops with True. In case 1. it is unclear to me if GHC would prefer to unwrap the map to get to

(isBig 7) || (isBig 1) || (isBig 2) || (isBig 1) 

or if it evaluates isBig 7 first and return as in 2. Also, suppose we truly ended up with that

(isBig 7) || (isBig 1) || (isBig 2) || (isBig 1) 

that would actually be maximally show because || is infixr so it starts from the right.

CodePudding user response:

The (||) function is implemented as [src]:

(||)                    :: Bool -> Bool -> Bool
True  || _              =  True
False || x              =  x

This means that it looks at the first operand, and if it is True, it does not pattern matching on the second one, and thus will never evaluate the rest. Only in case the first one is False, it will return the second operand, and this will thus continue evaluation.

This thus means that:

    foldr (||) False (map isBig [7,1,2,1])
 -> foldr (||) False (isBig 7 : map isBig [1,2,1])
 -> isBig 7 || (foldr (||) False (map isBig [1,2,1]))
 -> True || (foldr (||) False (map isBig [1,2,1]))
 -> True

If foldr thus uses a function that can (sometimes) return a result by only evaluating the first operand, it can thus stop recursion.

We can see this when we would pass an infinite list. Indeed, for example:

anyBig (1 : 7 : repeat 1)

Here the list is [1, 7, 1, 1, 1, …], so an infinite amount of elements. But since isBig 7 is True, it will stop and return True:

ghci> anyBig (1 : 7 : repeat 1)
True
  • Related