I am currently learning Haskell and bumped into this expression.

statement
   case 1 ´div´ 0 of _ -> 42

My intuition is that this will result in a runtime error due to division with 0, but from testing this is not the case.

My conclusion is that it must be due to lazy evaluation within Haskell. And that because the _ matches on anything it will not check what it compares to?

So if someone could tell me if this assessment is correct, and if not why. Also please elaborate on what premises a case line could matched without actually looking at the expression.

CodePudding user response：

Your analysis is accurate, but this is actually very fiddly. It's not complicated, but it requires careful consideration of very precise questions.

In Haskell

A case expression has the general form

case exp0 of {pat1 -> exp1 ; ...}

When the case expression is evaluated, exp0 is evaluated far enough to determine if it matches pat1. If it does, the case expression evaluates exp1 with whatever additional bindings pat1 created for it. If it doesn't match, it falls through to the next pattern or guard and repeats. (If none match, it evaluates to an exception.)

Notably, your first alternative matches everything without requiring any evaluation. So no evaluation of div 1 0 is performed. The pattern doesn't bind anything and then the case expression evaluates to 42.

This is different than if you'd done:

case 1 `div` 0 of
    0 -> 42
    _ -> 42

In this case, the div expression would have to be evaluated to see if it was equal to 0. That results in an exception, which would become the result of the entire case expression. This is true even though the two branches have the same value. The content of the patterns is very important for determining behavior related to strictness.

In Core

The GHC Haskell compiler, by far the most common, has an intermediate representation called "core" that looks similar to Haskell but with type classes and most syntactic sugar fully desugared. Core also has slightly different semantics than Haskell, especially with respect to its case construct. In core, case always causes evaluation of the top-level constructor. Patterns are rewritten with nesting to match the pattern semantics of the original Haskell code.

This section is all sort of an aside, but if you ever look into optimizing Haskell code you're going to run into core, and it's useful to know that there is a difference.

CodePudding user response：

My conclusion is that it must be due to lazy evaluation within Haskell. And that because the _ matches on anything it will not check what it compares to?

Yes, exactly!

During pattern matching evaluation is performed only as much as needed to match the constructors in the pattern (and numeric/char literals). If the pattern is a variable, or the wildcard _, no evaluation is needed.

case undefined of _ -> ()  -- OK
case undefined of True -> () -- crash
case undefined of (_,_) -> () -- crash
case (undefined, 4) of (_,x) -> x 1 -- OK

Also, pattern matching happens left-to-right, top-to-bottom:

case (4, undefined) of
   (3, True) -> 1
   (4, _)    -> 2

evaluates to 2 -- we first match 4 against 3 and that fails before we attempt to match undefined against True. In this way we avoid a crash and the second pattern (4, _) succeeds.

(To be exhaustive: there actually is an exception to this. If you pattern match against a newtype constructor, no evaluation happens. You can neglect this case for the moment, since you are learning -- it is not that common.)