Home > Software engineering >  How do you design similar types to enforce particular rules?
How do you design similar types to enforce particular rules?

Time:05-18

Suppose you have a naive vector type, such as

data Vector a
  = V2 a a
  | V3 a a a

I am interested in how you can — and how you should — implement particular logic specific to the type's constructor. What I mean can be clarified with an example.

Suppose I want to implement:

instance Num a => Num (Vector a) where
  ( ) = 
  (*) = _
  abs = _
  signum = _
  fromInteger = _
  negate = _

Due to pattern matching, I can very easily match V2 with V2. However, what if I want it to be invalid for a particular combination, such as between V2 and V3?

In this case, is it required to make individual types for V2and V3? Then, however, do I not lose the polymorphic properties they currently have, being the same type? For example, if I begin building operations consuming my Vector, I currently need only one function. However, if I split into two types, I need two distinct functions.

How can I solve this problem by thinking in types?

CodePudding user response:

So you want your ADT cases to look like distinct types sometimes, so they can't be accidentally mixed, yet other times you want to be able to treat them homogeneously? That's what GADTs are for (among other things).

Give your type a phantom type parameter, and force it to be different for the different constructors. When you don't care, you can just make that type parameter generic:

data Len2
data Len3

data Vector len a where
  V2 :: a -> a -> Vector Len2 a
  V3 :: a -> a -> a -> Vector Len3 a

vectorAdd :: Num a => Vector len a -> Vector len a -> Vector len a
vectorAdd (V2 x1 y1) (V2 x2 y2) = V2 (x1 x2) (y1 y2)
vectorAdd (V3 x1 y1 z1) (V3 x2 y2 z2) = V3 (x1 x2) (y1 y2) (z1 z2)

vectorFst :: Vector len a -> a
vectorFst (V2 x _) = x
vectorFst (V3 x _ _) = x

In vectorAdd the compiler recognizes that V2 and V3 can never be matched because they always have different len parameters, yet you still can write vectorFst that works for any len.

Next level - don't create special-purpose Len2 and Len3 types, use numeric type-level literals, which GHC supports:

data Vector len a where
  V2 :: a -> a -> Vector 2 a
  V3 :: a -> a -> a -> Vector 3 a

CodePudding user response:

(Because you probably don't want to make Vector a Num, I'll be referring to monad operations instead.)

Haskell's [] monad solves this problem by simply going through every possible combination of elements. For example, [( 1),(*2)] <*> [3,6,5] = [4,7,6, 6,12,10]. For your Vector implementation which only supports two or three elements, though, this won't work. Would V2 ( 1) (*2) <*> V3 3 6 5 have two or three elements? For this reason, you probably want to expand your type to support all lengths of lists, which would just bring you right back to [].

If you want to maintain this "zipping" operation, though... things get more complicated. Possible, but complicated. The basic idea is to embed the length of the vector into the type while still maintaining polymorphism. We can essentially do this with type-level natural numbers...

data N = Z | S N

(Z stands for "zero" and S for "successor", e.g. S (S (S (S Z))) = 4)

...and we can utilize this with data kinds and GADTs:

data Vector n a where
    Nil :: Vector 'Z a
    (:^) :: a -> Vector n a -> Vector ('S n) a

Here, n is the length of the vector. For example, V2 would be represented by Vector (S (S Z)) a and V3 by Vector (S (S (S Z))) a. Notably, though, you can make the length polymorphic, e.g. Vector (S n) a is the type of all non-empty vectors (length is at least one). Unfortunately, (GHC's) type inference breaks down in the presence of GADTs, so you'll probably need scoped type variables (and a lot of explicit type signatures) to do anything useful.

All the actual data definition is saying is that Nil has length zero, and an element prepended to a vector (x :^ xs) has length one more than the vector given.

As an example, here's a Functor instance, defining fmap analogously to []'s map:

instance Functor (Vector n) where
    fmap :: forall a b. (a -> b) -> Vector n a -> Vector n b
    fmap f = go where
        go :: Vector k a -> Vector k b
        go Nil = Nil
        go (x:^xs) = f x :^ go xs

Notice all of the type signatures littered around. Unfortunately, this is almost always necessary because GHC isn't too keen to assume polymorphism in the presence of GADTs. The forall here helps scope a and b so that they can be used in the type for go. If it wasn't there, then the a and b in fmap and the a and b in go would be completely separate type variables, just with similar names.

To define more interesting functions, GHC needs to know that we want to interact with the length of the list, so we define a singleton type which "refines" the length...

data P n where
    PZ :: P 'Z
    PS :: P n -> P ('S n)

(read P n as "a proof of n's natural value)

...and a class which gives access to that type:

class Nat n where
    nat :: P n
instance Nat 'Z where
    nat = PZ
instance Nat n => Nat (S n) where
    nat = PS nat

From here, we can define replicateV analogously to []'s replicate:

replicateV :: forall n a. Nat n => a -> Vector n a
replicateV x = go nat where
    -- read as "go takes a proof of k's value and returns a vector of length k"
    go :: P k -> Vector k a
    go PZ = Nil
    go (PS p) = x :^ go p

From here, you can define a vzip :: (a -> b -> c) -> Vector n a -> Vector n b -> Vector n c similar to []'s zip.

(See also this question which expands on these ideas in a slightly different way.)

  • Related