Are codatatypes really terminal algebras?-CodePudding

(Disclaimer: I'm not 100% sure how codatatype works, especially when not referring to terminal algebras).

Consider the "category of types", something like Hask but with whatever adjustment that fits the discussion. Within such a category, it is said that (1) the initial algebras define datatypes, and (2) terminal algebras define codatatypes.

I'm struggling to convince myself of (2).

Consider the functor T(t) = 1 a * t. I agree that the initial T-algebra is well-defined and indeed defines [a], the list of a. By definition, the initial T-algebra is a type X together with a function f :: 1 a*X -> X, such that for any other type Y and function g :: 1 a*Y -> Y, there is exactly one function m :: X -> Y such that m . f = g . T(m) (where . denotes the function combination operator as in Haskell). With f interpreted as the list constructor(s), g the initial value and the step function, and T(m) the recursion operation, the equation essentially asserts the unique existance of the function m given any initial value and any step function defined in g, which necessitates an underlying well-behaved fold together with the underlying type, the list of a.

For example, g :: Unit (a, Nat) -> Nat could be () -> 0 | (_,n) -> n 1, in which case m defines the length function, or g could be () -> 0 | (_,n) -> 0, then m defines a constant zero function. An important fact here is that, for whatever g, m can always be uniquely defined, just as fold does not impose any contraint on its arguments and always produce a unique well-defined result.

This does not seem to hold for terminal algebras.

Consider the same functor T defined above. The definition of the terminal T-algebra is the same as the initial one, except that m is now of type X -> Y and the equation now becomes m . g = f . T(m). It is said that this should define a potentially infinite list.

I agree that this is sometimes true. For example, when g :: Unit (Unit, Int) -> Int is defined as () -> 0 | (_,n) -> n 1 like before, m then behaves such that m(0) = () and m(n 1) = Cons () m(n). For non-negative n, m(n) should be a finite list of units. For any negative n, m(n) should be of infinite length. It can be verified that the equation above holds for such g and m.

With any of the two following modified definition of g, however, I don't see any well-defined m anymore.

First, when g is again () -> 0 | (_,n) -> n 1 but is of type g :: Unit (Bool, Int) -> Int, m must satisfy that m(g((b,i))) = Cons b m(g(i)), which means that the result depends on b. But this is impossible, because m(g((b,i))) is really just m(i 1) which has no mentioning of b whatsoever, so the equation is not well-defined.

Second, when g is again of type g :: Unit (Unit, Int) -> Int but is defined as the constant zero function g _ = 0, m must satisfy that m(g(())) = Nil and m(g(((),i))) = Cons () m(g(i)), which are contradictory because their left hand sides are the same, both being m(0), while the right hand sides are never the same.

In summary, there are T-algebras that have no morphism into the supposed terminal T-algebra, which implies that the terminal T-algebra does not exist. The theoretical modeling of the codatatype Stream (or infinite list), if any, cannot be based on the nonexistant terminal algebra of the functor T(t) = 1 a * t.

Many thanks to any hint of any flaw in the story above.

CodePudding user response：

(2) terminal algebras define codatatypes.

This is not right, codatatypes are terminal coalgebras. For your T functor, a coalgebra is a type x together with f :: x -> T x. A T-coalgebra morphism between (x1, f1) and (x2, f2) is a g :: x1 -> x2 such that fmap g . f1 = f2 . g. Using this definition, the terminal T-algebra defines the possibly infinite lists (so-called "colists"), and the terminality is witnessed by the unfold function:

unfold :: (x -> Unit   (a, x)) -> x -> Colist a

Note though that a terminal T-algebra does exist: it is simply the Unit type together with the constant function T Unit -> Unit (and this works as a terminal algebra for any T). But this is not very interesting for writing programs.

CodePudding user response：

it is said that (1) the initial algebras define datatypes, and (2) terminal algebras define codatatypes.

On the second point, it is actually said that terminal coalgebras define codatatypes.

A datatype t is defined by its constructors and a fold.

Constructors can be modelled by an algebra F t -> t (for example, the Peano constructors O : nat S : Nat -> Nat are collected as a single function in : Unit Nat -> Nat).
The fold then gives the catamorphism fold f : t -> x for any algebra f : F x -> x (for nats, fold : ((Unit x) -> x) -> Nat -> x).

A codatatype t is defined by its destructors and an unfold.

Destructors can be modelled by a coalgebra t -> F t (for example, streams have two destructors head : Stream a -> a and tail : Stream a -> Stream a, and they are collected as a single function out : Stream a -> a * Stream a).
The unfold then gives the anamorphism unfold f : x -> t for any coalgebra f : x -> F x (for streams, unfold : (x -> a * x) -> x -> Stream a).

CodePudding user response：

(Disclaimer: I'm not 100% sure how codatatype works, especially when not referring to terminal algebras).

A codata type, or coinductive data type, is just one defined by its eliminations rather than its introductions.

It seems that sometimes terminal algebra is used (very confusingly) to refer to a final coalgebra, which is what actually defines a codata type.

Consider the same functor T defined above. The definition of the terminal T-algebra is the same as the initial one, except that m is now of type X -> Y and the equation now becomes m . g = f . T(m). It is said that this should define a potentially infinite list.

So I think this is where you’ve gone wrong: “m ∘ g = f ∘ T(m)” should be reversed, and read “T(m) ∘ f = g ∘ m”. That is, the final coalgebra is defined by a carrier set S and a map g : S → T(S) such that for any other coalgebra (R, f : R → T(R)) there is a unique map m : R → S such that T(m) ∘ f = g ∘ m.

m is uniquely defined recursively by the map that returns Left () whenever f maps to Left (), and Right (x, m xs) whenever f maps to Right (x, xs), i.e. it’s the assignment of the coalgebra to its unique morphism to the final coalgebra, and denotes the unique anamorphism/unfold of this type, which should be easy to convince yourself is in fact a possibly-empty & possibly-infinite stream.