In Haskell, the ()
type has two values, namely, ()
and bottom. If you have an expression e :: ()
, there's no point in actually inspecting it, since either it's e = ()
or by inspecting it you're crashing a program which could otherwise have not crashed.
Hence, I figured that perhaps operations on values of type ()
would not inspect the value and would not distinguish between ()
and bottom.
However, this is wildly untrue:
▎λ ghci
GHCi, version 9.0.2: https://www.haskell.org/ghc/ :? for help
ghci> u = (undefined :: ())
ghci> show u
"*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
undefined, called at <interactive>:1:6 in interactive:Ghci1
ghci> () == u
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
undefined, called at <interactive>:1:6 in interactive:Ghci1
ghci> f () = "ok"
ghci> f u
"*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
undefined, called at <interactive>:1:6 in interactive:Ghci1
What is the reason for this? Here are some conjectures:
For some reason that I can't think of, it's useful to be non-lazy on
()
. Sometimes we want that bottom to propagate.Haskell semantics are written in such a way that destructuring any ADTs, even trivial ones, inspects them. This means that having
case (undefined :: ()) of { () -> ... }
not throw would be a violation of language semantics()
is an extremely special case and simply isn't worth the attention to eke out this tiny extra bit of safety in a massive language like HaskellThere's also the possible combination explanation of 2 3, that Haskell could have had semantics dictating that an expression
case e of
inspectse
unless it is of type()
, but that would pollute the language spec for relatively low benefit
CodePudding user response:
I don't know for sure, but I suspect it's none of the things you said. Instead, this is so that the language is predictable and consistent.
There are, essentially, two things you observed, and I consider them to be separate things. The first is that checking whether a x
is indeed ()
with a case
statement forces evaluation of x
; the second is that the instances (of Show
and Eq
) are written to use a case
statement.
Pattern matching: the predictable, consistent rule here is that if you write
case <e0> of <pat> -> <e1>
, thene0
is evaluated far enough to check whether the constructors inpat
are in fact in the given places. Well, okay, there's some wrinkles here to do with irrefutable patterns; let's say instead thate0
is evaluated far enough to check whetherpat
actually does match! For the()
type, that means that the pattern()
causes full evaluation -- because you've specified the full value that you expect it to be -- while the patternx
or_
can match without further evaluation.Class instances: the natural inductive way to specify what the various class instances do is to always have an outermost
case
that matches against each available constructor with simple variable patterns for the fields, then does something (presumably recursive calls) on each of the fields in turn. That is, simplifying a bit, theshow
implementation goes like:show x = case x of <Con0> field00 field01 field02 <...> -> "<Con0>" " " show field00 " " show field01 " " show field02 <...> <Con1> field10 field11 field12 <...> -> "<Con1>" " " show field10 " " show field11 " " show field12 <...> <...>
It is very natural for the specialization of this scheme to the single-constructor, zero-field type
()
to go:show x = case x of () -> "()"
(Additionally, the Report specifies that
(==)
is is always strict in both arguments; but that property would also arise naturally from the obvious way of writing a genericEq
instance derivation algorithm.) Therefore the path of least surprise is for class instances to pattern match on their argument(s).
CodePudding user response:
I will address this part:
For some reason that I can't think of, it's useful to be non-lazy on
()
. Sometimes we want that bottom to propagate.
Let's have a look at Control.Parallel.Strategies
(version 1, an older version). This is a module for parallel evaluation. Let's focus on one of its functions for the sake of simplicity:
parMap :: Strategy b -> (a -> b) -> [a] -> [b]
The result of parMap strat f xs
is the same as map f xs
, except that the list is computed in parallel. What is the strat
argument? Well,
strat :: Strategy b
means
strat :: b -> ()
There are only two things you can do with strat
:
- call it and ignore the result, which by laziness amounts to not calling it at all;
- call it and force the result, even if you know it's
()
or a bottom.
parMap
does the latter, in parallel. This allows the caller to specify a strat
argument that evaluates the list values of type b
as needed. For example
parMap (\(x,y) -> ()) f xs
parMap (\(x,y) -> x `seq` ()) f xs
parMap (\(x,y) -> x `seq` y `seq` ()) f xs
are valid calls, and will cause parMap
to evaluate the new list-of-pairs only to expose the pair constructor, also the first component, also the second component, respectively.
Hence, forcing the ()
result of strat
in this case allows the user to control how much evaluation to perform during parMap
, i.e. how much to force the result (in parallel), and consequently which parts of the result should be left unevaluated. (By comparison map f xs
would leave the result fully unevaluated -- it is completely lazy. parMap
can not do that otherwise it is not longer parallel.)
Minor digression: note that the GADT
data a :~: b where
Refl :: t :~: t
has one constructor like ()
. Here, it is mandatory that such values are forced as in:
foo :: Int :~: String -> Int -> String
foo Refl x = x " hello"
Here the first argument must be a bottom. By forcing that, we make the function error out with an exception. If we did not force that, we would get a very nasty undefined behavior like those in C and C , completely breaking type safety. Haskell will correctly reject any attempt to circumvent that:
foo :: Int :~: String -> Int -> String
foo _ x = x " hello"
triggers a type error at compile time.