Home > Software design >  Why are values of type () inspected?
Why are values of type () inspected?

Time:08-27

In Haskell, the () type has two values, namely, () and bottom. If you have an expression e :: (), there's no point in actually inspecting it, since either it's e = () or by inspecting it you're crashing a program which could otherwise have not crashed.

Hence, I figured that perhaps operations on values of type () would not inspect the value and would not distinguish between () and bottom.

However, this is wildly untrue:

▎λ ghci
GHCi, version 9.0.2: https://www.haskell.org/ghc/  :? for help
ghci> u = (undefined :: ())
ghci> show u
"*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
  undefined, called at <interactive>:1:6 in interactive:Ghci1
ghci> () == u
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
  undefined, called at <interactive>:1:6 in interactive:Ghci1
ghci> f () = "ok"
ghci> f u
"*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
  undefined, called at <interactive>:1:6 in interactive:Ghci1

What is the reason for this? Here are some conjectures:

  1. For some reason that I can't think of, it's useful to be non-lazy on (). Sometimes we want that bottom to propagate.

  2. Haskell semantics are written in such a way that destructuring any ADTs, even trivial ones, inspects them. This means that having case (undefined :: ()) of { () -> ... } not throw would be a violation of language semantics

  3. () is an extremely special case and simply isn't worth the attention to eke out this tiny extra bit of safety in a massive language like Haskell

  4. There's also the possible combination explanation of 2 3, that Haskell could have had semantics dictating that an expression case e of inspects e unless it is of type (), but that would pollute the language spec for relatively low benefit

CodePudding user response:

I don't know for sure, but I suspect it's none of the things you said. Instead, this is so that the language is predictable and consistent.

There are, essentially, two things you observed, and I consider them to be separate things. The first is that checking whether a x is indeed () with a case statement forces evaluation of x; the second is that the instances (of Show and Eq) are written to use a case statement.

  • Pattern matching: the predictable, consistent rule here is that if you write case <e0> of <pat> -> <e1>, then e0 is evaluated far enough to check whether the constructors in pat are in fact in the given places. Well, okay, there's some wrinkles here to do with irrefutable patterns; let's say instead that e0 is evaluated far enough to check whether pat actually does match! For the () type, that means that the pattern () causes full evaluation -- because you've specified the full value that you expect it to be -- while the pattern x or _ can match without further evaluation.

  • Class instances: the natural inductive way to specify what the various class instances do is to always have an outermost case that matches against each available constructor with simple variable patterns for the fields, then does something (presumably recursive calls) on each of the fields in turn. That is, simplifying a bit, the show implementation goes like:

    show x = case x of
        <Con0> field00 field01 field02 <...> -> "<Con0>"
               " "    show field00
               " "    show field01
               " "    show field02
               <...>
        <Con1> field10 field11 field12 <...> -> "<Con1>"
               " "    show field10
               " "    show field11
               " "    show field12
               <...>
        <...>  
    

    It is very natural for the specialization of this scheme to the single-constructor, zero-field type () to go:

    show x = case x of
        () -> "()"
    

    (Additionally, the Report specifies that (==) is is always strict in both arguments; but that property would also arise naturally from the obvious way of writing a generic Eq instance derivation algorithm.) Therefore the path of least surprise is for class instances to pattern match on their argument(s).

CodePudding user response:

I will address this part:

For some reason that I can't think of, it's useful to be non-lazy on (). Sometimes we want that bottom to propagate.

Let's have a look at Control.Parallel.Strategies (version 1, an older version). This is a module for parallel evaluation. Let's focus on one of its functions for the sake of simplicity:

parMap :: Strategy b -> (a -> b) -> [a] -> [b]

The result of parMap strat f xs is the same as map f xs, except that the list is computed in parallel. What is the strat argument? Well,

strat :: Strategy b

means

strat :: b -> ()

There are only two things you can do with strat:

  • call it and ignore the result, which by laziness amounts to not calling it at all;
  • call it and force the result, even if you know it's () or a bottom.

parMap does the latter, in parallel. This allows the caller to specify a strat argument that evaluates the list values of type b as needed. For example

parMap (\(x,y) -> ()) f xs
parMap (\(x,y) -> x `seq` ()) f xs
parMap (\(x,y) -> x `seq` y `seq` ()) f xs

are valid calls, and will cause parMap to evaluate the new list-of-pairs only to expose the pair constructor, also the first component, also the second component, respectively.

Hence, forcing the () result of strat in this case allows the user to control how much evaluation to perform during parMap, i.e. how much to force the result (in parallel), and consequently which parts of the result should be left unevaluated. (By comparison map f xs would leave the result fully unevaluated -- it is completely lazy. parMap can not do that otherwise it is not longer parallel.)


Minor digression: note that the GADT

data a :~: b where
    Refl :: t :~: t

has one constructor like (). Here, it is mandatory that such values are forced as in:

foo :: Int :~: String -> Int -> String
foo Refl x = x    " hello"

Here the first argument must be a bottom. By forcing that, we make the function error out with an exception. If we did not force that, we would get a very nasty undefined behavior like those in C and C , completely breaking type safety. Haskell will correctly reject any attempt to circumvent that:

foo :: Int :~: String -> Int -> String
foo _ x = x    " hello"

triggers a type error at compile time.

  • Related