Are the real Type of Polymorphic Types in Haskell implicitly universally quantified Type?-CodePudding

I'm trying to clarify some Haskell concepts, and wonder if anyone can help.

When defining a type trough data declaration such as:

data DatumType a b = Datum a b

I have been wondering if the type is DatumType or implicitly forall a. DatumType a b.

Because indeed with ExplicitForAll

λ> :set -XExplicitForAll
λ> :k forall a. Maybe a
forall a. Maybe a :: *

λ> :k forall a b . DatumType a b
forall a b . DatumType a b :: *

λ> :k DatumType
DatumType :: * -> * -> *

We can see that the kind forall a b. DatumType a b is proper type i.e. *

DatumType however in itself is just a Type Constructor.

Hence this would mean that polymorphic type such as [a] are TYPE (i.e. *) and that the full name of the type is forall a. [a]

In other word, what i am after is to confirm that type declaration in haskell for polymorphic type are always in an abbreviated form, and that the full type name is always with an implicit forall a. ....

That is, in data Mytype a the type is not Mytype neither just Mytype a, but forall a. Mytype a, while Mytype is a type constructor.

Is that interpretation correct ?

CodePudding user response：

The definition data DatumType a b = Datum a b translated into English is something like this:

Let there be a new type constructor named DatumType and a new data constructor named Datum such that for all types a and b, Datam applied to a value of type a and a value of type b results in a value of type DatumType a b.

I have been wondering if the type is DatumType or implicitly forall a. DatumType a b.

I think the barrier to understanding here is that the term "type" can have two different meanings, depending on context:

A type is any entity that exists at type-level
A type is specifically those type-level entities that can serve as the type of terms; anything in the kind Type (which is also called *), and only such things

These two definitions basically arise because we use the phrase "type of" to describe the relationship between things on the term level and things on the type level - so it seems odd to include things that can't participate in that relationship in the meaning of "type". And yet there are plenty of things that exist in type expressions that cannot participate in that relationship directly, we want to talk about them without having to use clunky phrases like "type level entity" all the time. As a community, we don't seem to have come up with a better solution than just using "type" in both of those senses at different times, and hopefully being clear from the context.

So it's clear that DatumType is not a type by definition 2; it needs to be applied to two type parameters to result in a type of something at the term level. If that's what we're thinking then I can see why you would lean towards thinking "the type" defined by the data declaration is forall a b. DatumType a b.

However I think that's probably a confusing way to think about it. Taken on its own forall a b. DatumType a b is the type of terms that fit in all types of the form DatumType _ _; the only values in this type are things like Datum undefined undefined. Rather the forall a b will normally be further out when you see DatumType a b; for example the type of the Datum constructor itself (which is what you'd write if you were defining it using GADT syntax) can be written as Datum :: forall a b. a -> b -> DatumType a b.

So saying "the type defined by the data declaration is forall a b. DatumType a b is pretty weird. The data declaration defines a whole family of types, of which the type forall a b. DatumType a b is one extremely limiting (and fairly useless) example. Rather I would say "for all types a and b, the data declaration defines a type DatumType a b"; the scope of the "for all" concept involved is wider than you can express with a forall keyword in a single type expression, so I moved it out of the formal type expression to the English description.

Of course if we focus on definition 1 of "type", then things are much simpler to explain. The data declaration simply defines a type DatumType with kind * -> * -> *.

This invites the question: which view is correct?

Perhaps unsatisfyingly, there is nothing interesting going on between these two viewpoints. Both of those things happen: we get a new type constructor DatumType that has kind * -> * -> *, and for all types a and b DatumType a b is a type of some terms. That's what a thing of kind * -> * -> * means. The dilemma only arises when you try to identify "the type" without realising that the term type is frequently used in both these two senses, so if you're not clear with yourself on which sense you're using you can trip up.

Some contexts say that things which don't have kind * are not types, and therefore DatumType is not really a type. Others talk about DatumType as a type. However that is purely a difference in what terminology we choose to use; either way DatumType :: * -> * -> * is a thing that exists, and it functions the same in either viewpoint. Getting hung up on whether a particular type-level entity is or is not a type is not a question of how things really work, it's just a question of English language usage.

So to come back to some of your more specific questions:

Hence this would mean that polymorphic type such as [a] are TYPE (i.e. *) and that the full name of the type is forall a. [a]

In other word, what i am after is to confirm that type declaration in haskell for polymorphic type are always in an abbreviated form, and that the full type name is always with an implicit forall a. ....

As I mentioned, I think this is not quite correct. Yes, [] at the type level needs a parameter to result in a type-of-terms; [] :: * -> *. So by definition 2, [] is not a type. However it's not really right to say that the full list type is "really" forall a. [a], as that is a specific type expression for universally polymorphic lists (which can only be the type of empty lists, or of lists full of bottom values, or a list that is the bottom value). In particular, [True, False] is a member of the type [Bool], and is not a member of the type forall a. [a].

Rather [] is just a type-level-entity of kind * -> *. whether you call that a type or not is up to you. Either way it is not the type of any terms, and either way it is a valid thing you can use in type expressions (provided they kind-check).

That is, in data Mytype a the type is not Mytype neither just Mytype a, but forall a. Mytype a, while Mytype is a type constructor.

It is true that Mytype is a type constructor. (In either definition of the word "type"; even if we're happy to call all type-level entities types, "type constructor" is still a specific category, and it's a good term to have because whether something is a "type constructor" cannot be determined just by looking at its kind, in the same way that you can't determine whether something is a data constructor just by looking at its type)

It is true that Mytype a on its own is not meaningful; it has to be used in a scope where a type variable a is in scope.

It is not true that forall a. Mytype a is the right way to think of "the type defined by data Mytype a. For any type a Mytype a is a type; forall a. Mytype a is a specific type expression that does not include most of the types we can form with the Mytype type constructor.

So any time you want to use Mytype a there will be a forall somewhere (either explicit or implicit). But it will usually be further out than immediately enclosing Mytype a; something like someFunc :: forall a. a -> Bool -> MyType a.

The OP left a comment on the other answer which I think it will be insightful to address:

I think I will edit the question, to explain where the all thingy of clarification is coming from. Let’s just say that while learning and experimenting with type classes, I was confused about instance Eq (a,b). Why not instance Eq (,), and the answer is not just because one need to refer to the type variable in the implementation. Eq can only be defined on a proper type, and any possible pair is actually a type, a polymorphic type. Then came the question, but what are we defining when we write data (,) a b = (,) a b . Am I defining (,) or forall a b. (a,b).

Keep in mind that making a type an instance of Eq is intended to mean that you can compare values of that type for equality.

When you say instance Eq (a, b) it really means instance forall a b. Eq (a, b); in English "for all types a and b, the type (a, b) is a member of the class Eq. It means you can pick any two types you like, apply the (,) type constructor to them, and the resulting type will be something use values you can compare for equality. (This is of course not actually true, the instance we really use is more like instance forall a b. (Eq a, Eq b) => Eq (a, b), because we need Eq a and Eq b)

instance Eq (,) is wrong because it is saying "the type constructor (,) is a member of the class Eq". "Values of (,) can be compared for equality" makes no sense, as (,) is not the kind of thing that can have values.

But (,) is still a thing that exists! And it's straightforwardly the thing defined by data (,) a b = (,) a b. It's just a thing with kind * -> * -> *, which is not the kind of thing we can make an Eq instance for. The phrase "Eq can only be defined on a proper type" is using the definition of "type" that only includes things with kind *, but remember "not being a proper type" by this definition does not mean "isn't a real thing", it just means we're choosing (in that particular sentence, at least) to use the word "type" with a narrow meaning. The class Eq is defined to have as members things with kind *, that's all. Other classes can be defined to have things with different kinds as members; if they have members of kind * -> * -> * then (,) on its own can be an instance of those classes. Have a look at the list of instances for Bifunctor, for example. The fact that the Eq instance has to be Eq (a, b) and not Eq (,) does not imply that (,) isn't "really" a valid entity that exists at the type level.

CodePudding user response：

The declaration

data DatumType a b = Datum a b

declares one new type, DatumType, of kind * -> * -> *. It also declares one new term, Datum, of type forall a b. a -> b -> DatumType a b.

Under normal circumstances, it is true that type signatures implicitly universally quantify all the type variables they mention at the top level. But a data declaration is not a type signature, so some parts of your question don't make sense.

It is also possible, with extensions turned on, for a type variable to be monomorphic, not universally quantified. For example,

{-# LANGUAGE ScopedTypeVariables #-}

foo :: forall a. Monoid a => a -> a
foo x = mconcat xs where
    xs :: [a]
    xs = [x, mempty, x]

In this code snippet, all quantifiers have been made explicit. Inside the where block, xs has the monomorphic type [a], where a is a type variable bound by an enclosing typing scope, and if you inserted an extra quantifier the code would not compile:

foo :: forall a. Monoid a => a -> a
foo x = mconcat xs where
    xs :: forall a. [a]
    xs = [x, mempty, x]
    -- type error: can't match type a and type a0, whoops

With this information, hopefully the truth status of each of your statements is more clear, but let's walk through them one at a time:

I have been wondering if the type is DatumType or implicitly forall a. DatumType a b.

Unanswerable. The type of what? Both DatumType and forall a. DatumType a b are legal type-level expressions.

This would mean that polymorphic type such as [a] are TYPE and that the full name of the type is forall a. [a].

Sometimes. The insertion of quantifiers is not a local process, but global to an entire type signature. They are implicitly floated up to the top level. So, these are two different types:

-- these are all the same with ScopedTypeVariables off
forall a. [a] -> ()
forall a. ([a] -> ())
[a] -> ()

-- this is a different type from the ones above
(forall a. [a]) -> ()

So if, in normal, un-extended Haskell, you wrote the type signature x :: [a], then [a] would indeed be shorthand for forall a. [a]; but if you wrote the signature f :: [a] -> (), then [a] would not be shorthand for forall a. [a], because the forall has to be floated out from being on an argument to the arrow to being on the entire type.

In other word, what i am after is to confirm that type declaration in haskell for polymorphic type are always in an abbreviated form, and that the full type name is always with an implicit forall a. ....

Yes.

That is, in data Mytype a the type is not Mytype neither just Mytype a, but forall a. Mytype a, ...

Category error. data Mytype a is not a type signature, so there is no the type. If I were to single out a type to call "the type" from such a declaration, it would be Mytype alone, as that is the only type that gets created by the declaration.

...while Mytype is a type constructor.

Yes.