How to create a sum type of limited 1-arity data constructors

Time:07-14

Let's say I'm trying to create a Haskell sum data type to represent clothing stores on three separate streets: Grant, Lee, and Lincoln. Grant St. has two stores, Lee St. has one store, and Lincoln St. has two stores. I could just have a nullary data constructor for each store, but what if I wanted three data constructors of arity one to break the stores down by street? How then would I represent the individual stores while keeping track of which street they are on?

data TownClothesStores = Grant ? | Lee ? | Lincoln ?

It almost seems like I'm forced to create another data type with all the individual stores

data ClothesStores = Treschic | Muoychic | Verychic | Sehrchic | Mycketbra

rather than just use String or Int to represent the individual stores. Doing some sort of query should reveal what stores are on what streets. But can this be gleaned from just a data type declaration? Any Haskell data management lore appreciated.

CodePudding user response:

If I'm understanding your question correctly, you want a data type that lets you look up store names given a street name, probably in a "well typed" manner.

I think the simplest solution is to define a sum type for each street and use it as the field of the corresponding street constructor. It feels natural to me, and it's really not that "boilerplate"-y. The only difference between this and inline sum types is a few extra data keywords and some newlines. In return you get a much nicer API, effectively a self-documenting one that expresses itself through the types.

data Street = Grant GrantStore | Lee LeeStore | Lincoln LincolnStore

data GrantStore = Treschic | Muoychic
data LeeStore = Verychic
data LincolnStore = Sehrchic | Mycketbra

I understand the feeling of "there must be a better way", as I've also felt similarly on this exact topic before - but in the long run, it never ends up being a big deal.
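As a minimal sketch of how the nested types answer the original question ("which street is a store on?"), pattern matching on the outer constructor is enough. The deriving clauses and the helper names streetName and storeName are assumptions added for illustration; they are not part of the answer's code.

```haskell
-- Nested sum types from the answer, with derived instances added for illustration.
data GrantStore   = Treschic | Muoychic   deriving (Show, Eq)
data LeeStore     = Verychic              deriving (Show, Eq)
data LincolnStore = Sehrchic | Mycketbra  deriving (Show, Eq)

data Street
  = Grant GrantStore
  | Lee LeeStore
  | Lincoln LincolnStore
  deriving (Show, Eq)

-- Hypothetical helper: the street a store value belongs to.
streetName :: Street -> String
streetName (Grant _)   = "Grant"
streetName (Lee _)     = "Lee"
streetName (Lincoln _) = "Lincoln"

-- Hypothetical helper: the store's name, whatever street it is on.
storeName :: Street -> String
storeName (Grant s)   = show s
storeName (Lee s)     = show s
storeName (Lincoln s) = show s
```

Here streetName (Grant Muoychic) evaluates to "Grant", while an expression such as Grant Verychic is rejected by the type checker, because Verychic is a LeeStore rather than a GrantStore.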

Now, if you really want something "cooler" - you could type your store names utilizing a Generalized Algebraic Data Type (GADT):

{-# LANGUAGE GADTs     #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE DataKinds #-}

data TownStreet = Grant | Lee | Lincoln

data ClothesStore street where
  Treschic :: ClothesStore Grant
  Muoychic :: ClothesStore Grant
  Verychic :: ClothesStore Lee
  Sehrchic :: ClothesStore Lincoln
  Mycketbra :: ClothesStore Lincoln

Now you have a flattened sum type of store names, but each of them is tagged with its street name at the type level.

The type representing "any store in Grant street" would be ClothesStore Grant, pattern matching on it will reveal the exact store name - but it will only be one of those two:

f :: ClothesStore Grant -> String
f Treschic = "Treschic"
f Muoychic = "Muoychic"

Note: The compiler won't let you match on any other constructors here, as they can't be passed to f.

The issue with GADTs is that, while they look very promising initially - you'll quickly realize that you need many more advanced type level features to actually do fundamental things with them.
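One small illustration of that extra machinery, as a sketch rather than a recommendation: putting stores from different streets into a single list already requires hiding the type-level street tag behind an existential wrapper. The names SomeStore, streetOf, someStreet and allStores are hypothetical, as are the explicit kind signature, the ticked constructors and the deriving clause on TownStreet.

```haskell
{-# LANGUAGE GADTs          #-}
{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE KindSignatures #-}

data TownStreet = Grant | Lee | Lincoln
  deriving (Show, Eq)

-- Same GADT as in the answer, with the street index's kind written out.
data ClothesStore (street :: TownStreet) where
  Treschic  :: ClothesStore 'Grant
  Muoychic  :: ClothesStore 'Grant
  Verychic  :: ClothesStore 'Lee
  Sehrchic  :: ClothesStore 'Lincoln
  Mycketbra :: ClothesStore 'Lincoln

-- Hypothetical wrapper that hides the street index, so that stores on
-- different streets can share one list type.
data SomeStore where
  SomeStore :: ClothesStore street -> SomeStore

-- Recover the value-level street by matching on the constructor.
streetOf :: ClothesStore street -> TownStreet
streetOf Treschic  = Grant
streetOf Muoychic  = Grant
streetOf Verychic  = Lee
streetOf Sehrchic  = Lincoln
streetOf Mycketbra = Lincoln

-- Unwrap the existential and report the street.
someStreet :: SomeStore -> TownStreet
someStreet (SomeStore s) = streetOf s

-- A heterogeneous list: not expressible as [ClothesStore street] for one street.
allStores :: [SomeStore]
allStores =
  [ SomeStore Treschic, SomeStore Muoychic, SomeStore Verychic
  , SomeStore Sehrchic, SomeStore Mycketbra ]
```

With these definitions, map someStreet allStores walks the mixed list and returns the street of each store. Note that the store-to-street mapping now has to be written out a second time at the value level in streetOf.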

To illustrate, how do you get a list of all store names given a street? Granted, you couldn't do this easily even in languages with union types. However, the simple, naive (allegedly boilerplate-y) solution can do this trivially: just derive Enum.
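The "just derive Enum" route for the simple nested types might look like the sketch below. Bounded is derived as well, which goes slightly beyond what the answer says, so that [minBound .. maxBound] can enumerate every constructor. The helper names allOf, grantStores and allStores are hypothetical.

```haskell
data GrantStore   = Treschic | Muoychic   deriving (Show, Eq, Enum, Bounded)
data LeeStore     = Verychic              deriving (Show, Eq, Enum, Bounded)
data LincolnStore = Sehrchic | Mycketbra  deriving (Show, Eq, Enum, Bounded)

data Street
  = Grant GrantStore
  | Lee LeeStore
  | Lincoln LincolnStore
  deriving (Show, Eq)

-- Hypothetical helper: every constructor of an enumerable, bounded type.
allOf :: (Enum a, Bounded a) => [a]
allOf = [minBound .. maxBound]

-- All stores on Grant street; the element type of allOf is inferred
-- from the argument type of the Grant constructor.
grantStores :: [Street]
grantStores = map Grant allOf

-- Every store in town, grouped by street.
allStores :: [Street]
allStores = map Grant allOf ++ map Lee allOf ++ map Lincoln allOf
```

In GHCi, grantStores shows as [Grant Treschic,Grant Muoychic]. No per-street instance has to be written by hand, because the derived instances already know each street's constructors.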

With the GADT version, by contrast, you'd need a typeclass to do this:

{-# LANGUAGE TypeApplications #-}

class StoresIn street where
  storesIn :: [ClothesStore street]

instance StoresIn Grant where
  storesIn = [Treschic, Muoychic]

instance StoresIn Lee where
  storesIn = [Verychic]

instance StoresIn Lincoln where
  storesIn = [Sehrchic, Mycketbra]

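-- Usage example in GHCi: get all the stores on Grant street
-- (printing the result assumes ClothesStore has a Show instance, e.g. via StandaloneDeriving)
> storesIn @Grant
[Treschic,Muoychic]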

At the end of the day, you could use generic deriving, Template Haskell, and maybe even more advanced type-level features to abstract further and reduce this code... by creating more code. But at that point, why not just use the simple solution? At some point, it'll become obvious that there isn't another way that's simpler, not in Haskell and not in any other language.
