(Disclaimer: I'm not 100% sure how codatatype works, especially when not referring to terminal algebras).
Consider the "category of types", something like Hask but with whatever adjustment that fits the discussion. Within such a category, it is said that (1) the initial algebras define datatypes, and (2) terminal algebras define codatatypes.
I'm struggling to convince myself of (2).
Consider the functor T(t) = 1 a * t
. I agree that the initial T
-algebra is well-defined and indeed defines [a]
, the list of a
. By definition, the initial T
-algebra is a type X
together with a function f :: 1 a*X -> X
, such that for any other type Y
and function g :: 1 a*Y -> Y
, there is exactly one function m :: X -> Y
such that m . f = g . T(m)
(where .
denotes the function combination operator as in Haskell). With f
interpreted as the list constructor(s), g
the initial value and the step function, and T(m)
the recursion operation, the equation essentially asserts the unique existance of the function m
given any initial value and any step function defined in g
, which necessitates an underlying well-behaved fold
together with the underlying type, the list of a
.
For example, g :: Unit (a, Nat) -> Nat
could be () -> 0 | (_,n) -> n 1
, in which case m
defines the length function, or g
could be () -> 0 | (_,n) -> 0
, then m
defines a constant zero function. An important fact here is that, for whatever g
, m
can always be uniquely defined, just as fold
does not impose any contraint on its arguments and always produce a unique well-defined result.
This does not seem to hold for terminal algebras.
Consider the same functor T
defined above. The definition of the terminal T
-algebra is the same as the initial one, except that m
is now of type X -> Y
and the equation now becomes m . g = f . T(m)
. It is said that this should define a potentially infinite list.
I agree that this is sometimes true. For example, when g :: Unit (Unit, Int) -> Int
is defined as () -> 0 | (_,n) -> n 1
like before, m
then behaves such that m(0) = ()
and m(n 1) = Cons () m(n)
. For non-negative n
, m(n)
should be a finite list of units. For any negative n
, m(n)
should be of infinite length. It can be verified that the equation above holds for such g
and m
.
With any of the two following modified definition of g
, however, I don't see any well-defined m
anymore.
First, when g
is again () -> 0 | (_,n) -> n 1
but is of type g :: Unit (Bool, Int) -> Int
, m
must satisfy that m(g((b,i))) = Cons b m(g(i))
, which means that the result depends on b
. But this is impossible, because m(g((b,i)))
is really just m(i 1)
which has no mentioning of b
whatsoever, so the equation is not well-defined.
Second, when g
is again of type g :: Unit (Unit, Int) -> Int
but is defined as the constant zero function g _ = 0
, m
must satisfy that m(g(())) = Nil
and m(g(((),i))) = Cons () m(g(i))
, which are contradictory because their left hand sides are the same, both being m(0)
, while the right hand sides are never the same.
In summary, there are T
-algebras that have no morphism into the supposed terminal T
-algebra, which implies that the terminal T
-algebra does not exist. The theoretical modeling of the codatatype Stream (or infinite list), if any, cannot be based on the nonexistant terminal algebra of the functor T(t) = 1 a * t
.
Many thanks to any hint of any flaw in the story above.
CodePudding user response:
(2) terminal algebras define codatatypes.
This is not right, codatatypes are terminal coalgebras. For your T
functor, a coalgebra is a type x
together with f :: x -> T x
. A T
-coalgebra morphism between (x1, f1)
and (x2, f2)
is a g :: x1 -> x2
such that fmap g . f1 = f2 . g
. Using this definition, the terminal T
-algebra defines the possibly infinite lists (so-called "colists"), and the terminality is witnessed by the unfold
function:
unfold :: (x -> Unit (a, x)) -> x -> Colist a
Note though that a terminal T
-algebra does exist: it is simply the Unit
type together with the constant function T Unit -> Unit
(and this works as a terminal algebra for any T
). But this is not very interesting for writing programs.
CodePudding user response:
it is said that (1) the initial algebras define datatypes, and (2) terminal algebras define codatatypes.
On the second point, it is actually said that terminal coalgebras define codatatypes.
A datatype t
is defined by its constructors and a fold.
- Constructors can be modelled by an algebra
F t -> t
(for example, the Peano constructorsO : nat
S : Nat -> Nat
are collected as a single functionin : Unit Nat -> Nat
). - The fold then gives the catamorphism
fold f : t -> x
for any algebraf : F x -> x
(for nats,fold : ((Unit x) -> x) -> Nat -> x
).
A codatatype t
is defined by its destructors and an unfold.
- Destructors can be modelled by a coalgebra
t -> F t
(for example, streams have two destructorshead : Stream a -> a
andtail : Stream a -> Stream a
, and they are collected as a single functionout : Stream a -> a * Stream a
). - The unfold then gives the anamorphism
unfold f : x -> t
for any coalgebraf : x -> F x
(for streams,unfold : (x -> a * x) -> x -> Stream a
).
CodePudding user response:
(Disclaimer: I'm not 100% sure how codatatype works, especially when not referring to terminal algebras).
A codata type, or coinductive data type, is just one defined by its eliminations rather than its introductions.
It seems that sometimes terminal algebra is used (very confusingly) to refer to a final coalgebra, which is what actually defines a codata type.
Consider the same functor T defined above. The definition of the terminal T-algebra is the same as the initial one, except that m is now of type X -> Y and the equation now becomes m . g = f . T(m). It is said that this should define a potentially infinite list.
So I think this is where you’ve gone wrong: “m ∘ g = f ∘ T(m)” should be reversed, and read “T(m) ∘ f = g ∘ m”. That is, the final coalgebra is defined by a carrier set S and a map g : S → T(S) such that for any other coalgebra (R, f : R → T(R)) there is a unique map m : R → S such that T(m) ∘ f = g ∘ m.
m is uniquely defined recursively by the map that returns Left ()
whenever f maps to Left ()
, and Right (x, m xs)
whenever f maps to Right (x, xs)
, i.e. it’s the assignment of the coalgebra to its unique morphism to the final coalgebra, and denotes the unique anamorphism/unfold of this type, which should be easy to convince yourself is in fact a possibly-empty & possibly-infinite stream.