Home > OS >  Are codatatypes really terminal algebras?
Are codatatypes really terminal algebras?

Time:11-27

(Disclaimer: I'm not 100% sure how codatatype works, especially when not referring to terminal algebras).

Consider the "category of types", something like Hask but with whatever adjustment that fits the discussion. Within such a category, it is said that (1) the initial algebras define datatypes, and (2) terminal algebras define codatatypes.

I'm struggling to convince myself of (2).

Consider the functor T(t) = 1 a * t. I agree that the initial T-algebra is well-defined and indeed defines [a], the list of a. By definition, the initial T-algebra is a type X together with a function f :: 1 a*X -> X, such that for any other type Y and function g :: 1 a*Y -> Y, there is exactly one function m :: X -> Y such that m . f = g . T(m) (where . denotes the function combination operator as in Haskell). With f interpreted as the list constructor(s), g the initial value and the step function, and T(m) the recursion operation, the equation essentially asserts the unique existance of the function m given any initial value and any step function defined in g, which necessitates an underlying well-behaved fold together with the underlying type, the list of a.

For example, g :: Unit (a, Nat) -> Nat could be () -> 0 | (_,n) -> n 1, in which case m defines the length function, or g could be () -> 0 | (_,n) -> 0, then m defines a constant zero function. An important fact here is that, for whatever g, m can always be uniquely defined, just as fold does not impose any contraint on its arguments and always produce a unique well-defined result.

This does not seem to hold for terminal algebras.

Consider the same functor T defined above. The definition of the terminal T-algebra is the same as the initial one, except that m is now of type X -> Y and the equation now becomes m . g = f . T(m). It is said that this should define a potentially infinite list.

I agree that this is sometimes true. For example, when g :: Unit (Unit, Int) -> Int is defined as () -> 0 | (_,n) -> n 1 like before, m then behaves such that m(0) = () and m(n 1) = Cons () m(n). For non-negative n, m(n) should be a finite list of units. For any negative n, m(n) should be of infinite length. It can be verified that the equation above holds for such g and m.

With any of the two following modified definition of g, however, I don't see any well-defined m anymore.

First, when g is again () -> 0 | (_,n) -> n 1 but is of type g :: Unit (Bool, Int) -> Int, m must satisfy that m(g((b,i))) = Cons b m(g(i)), which means that the result depends on b. But this is impossible, because m(g((b,i))) is really just m(i 1) which has no mentioning of b whatsoever, so the equation is not well-defined.

Second, when g is again of type g :: Unit (Unit, Int) -> Int but is defined as the constant zero function g _ = 0, m must satisfy that m(g(())) = Nil and m(g(((),i))) = Cons () m(g(i)), which are contradictory because their left hand sides are the same, both being m(0), while the right hand sides are never the same.

In summary, there are T-algebras that have no morphism into the supposed terminal T-algebra, which implies that the terminal T-algebra does not exist. The theoretical modeling of the codatatype Stream (or infinite list), if any, cannot be based on the nonexistant terminal algebra of the functor T(t) = 1 a * t.

Many thanks to any hint of any flaw in the story above.

CodePudding user response:

(2) terminal algebras define codatatypes.

This is not right, codatatypes are terminal coalgebras. For your T functor, a coalgebra is a type x together with f :: x -> T x. A T-coalgebra morphism between (x1, f1) and (x2, f2) is a g :: x1 -> x2 such that fmap g . f1 = f2 . g. Using this definition, the terminal T-algebra defines the possibly infinite lists (so-called "colists"), and the terminality is witnessed by the unfold function:

unfold :: (x -> Unit   (a, x)) -> x -> Colist a

Note though that a terminal T-algebra does exist: it is simply the Unit type together with the constant function T Unit -> Unit (and this works as a terminal algebra for any T). But this is not very interesting for writing programs.

CodePudding user response:

it is said that (1) the initial algebras define datatypes, and (2) terminal algebras define codatatypes.

On the second point, it is actually said that terminal coalgebras define codatatypes.

A datatype t is defined by its constructors and a fold.

  • Constructors can be modelled by an algebra F t -> t (for example, the Peano constructors O : nat S : Nat -> Nat are collected as a single function in : Unit Nat -> Nat).
  • The fold then gives the catamorphism fold f : t -> x for any algebra f : F x -> x (for nats, fold : ((Unit x) -> x) -> Nat -> x).

A codatatype t is defined by its destructors and an unfold.

  • Destructors can be modelled by a coalgebra t -> F t (for example, streams have two destructors head : Stream a -> a and tail : Stream a -> Stream a, and they are collected as a single function out : Stream a -> a * Stream a).
  • The unfold then gives the anamorphism unfold f : x -> t for any coalgebra f : x -> F x (for streams, unfold : (x -> a * x) -> x -> Stream a).

CodePudding user response:

(Disclaimer: I'm not 100% sure how codatatype works, especially when not referring to terminal algebras).

A codata type, or coinductive data type, is just one defined by its eliminations rather than its introductions.

It seems that sometimes terminal algebra is used (very confusingly) to refer to a final coalgebra, which is what actually defines a codata type.

Consider the same functor T defined above. The definition of the terminal T-algebra is the same as the initial one, except that m is now of type X -> Y and the equation now becomes m . g = f . T(m). It is said that this should define a potentially infinite list.

So I think this is where you’ve gone wrong: “mg = fT(m)” should be reversed, and read “T(m) ∘ f = gm”. That is, the final coalgebra is defined by a carrier set S and a map g : ST(S) such that for any other coalgebra (R, f : RT(R)) there is a unique map m : RS such that T(m) ∘ f = gm.

m is uniquely defined recursively by the map that returns Left () whenever f maps to Left (), and Right (x, m xs) whenever f maps to Right (x, xs), i.e. it’s the assignment of the coalgebra to its unique morphism to the final coalgebra, and denotes the unique anamorphism/unfold of this type, which should be easy to convince yourself is in fact a possibly-empty & possibly-infinite stream.

  • Related