I'm trying to understand polymorphism in Haskell. Consider the typical example below:
module Main where

data Dog = Dog
data Cat = Cat

class Animal a where
  speak :: a -> String
  getA :: a

instance Animal Dog where
  speak _ = "Woof"
  getA = Dog

instance Animal Cat where
  speak _ = "Meow"
  getA = Cat

doA animal = do
  putStrLn $ speak animal

main :: IO ()
main = do
  doA Dog
  doA Cat
  doA (getA :: Dog)
I have the getA function, which is part of the Animal typeclass, and it works as expected. I can use getA as long as I provide a type annotation, just as with read.
However when I try to define a standalone function like below, it doesn't compile. Why is this an error?
getA' :: Animal a => a
getA' = if True then Dog else Cat
Why does the independent function getA' not work while getA does?
CodePudding user response:
This is a very common mistake: overlooking the direction of the polymorphism.
In short: it's the caller of the function that gets to choose type parameters, not the implementer of the function.
Slightly longer: when you give your function a signature like Animal a => a, you're making a promise to any caller of your function, and that promise reads something like this: "Pick a type. Any type. Whatever type you want. Let's call it a. Now make sure there is an instance Animal a. Now I can return you a value of type a."
So you see, when you write such a function, you don't get to return a specific type that you choose. You have to return whatever type the caller of your function will choose later when they call it.
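For contrast, here is a minimal sketch of a definition that does keep that promise. The name anyAnimal is made up for illustration; the rest is the code from the question. It type-checks because its body is itself polymorphic, so it produces whichever type the caller asks for:

```haskell
module Main where

data Dog = Dog
data Cat = Cat

class Animal a where
  speak :: a -> String
  getA :: a

instance Animal Dog where
  speak _ = "Woof"
  getA = Dog

instance Animal Cat where
  speak _ = "Meow"
  getA = Cat

-- Hypothetical name. The body never commits to Dog or Cat:
-- it simply forwards to the class method for the caller's type.
anyAnimal :: Animal a => a
anyAnimal = getA

main :: IO ()
main = do
  putStrLn (speak (anyAnimal :: Dog))  -- the caller picks Dog
  putStrLn (speak (anyAnimal :: Cat))  -- the caller picks Cat
```

The annotation at each call site is where the choice is made; the definition of anyAnimal has no say in it.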
To drive it home with a specific example, imagine that your getA' function is possible, and then consider this code:
data Giraffe = Giraffe

instance Animal Giraffe where
  speak _ = "Huh?"
  getA = Giraffe

myGiraffe :: Giraffe
myGiraffe = getA' -- does this work? how?
With a type class method this works, because it's not the same function that the caller is calling. It's two different functions, one for Dog and another for Cat, that just happen to share the same name.
When the caller gets around to calling one of these functions, they need to somehow choose which one. This can be done in two ways: either (1) they know the exact type they want, and then the compiler can look up the corresponding function for that type, or (2) somebody else has somehow passed an Animal instance to them, and it's that instance that contains a reference to the function.
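One rough mental model for "an instance that contains a reference to the function" is an ordinary record of functions passed around by hand. The sketch below is only that, a model: AnimalDict, speakD, getAD, dogDict, catDict and describe are all hypothetical names, and the real Animal class is deliberately left out so the record can stand in for it.

```haskell
module Main where

data Dog = Dog
data Cat = Cat

-- Hypothetical stand-in for an instance: one field per class method.
data AnimalDict a = AnimalDict
  { speakD :: a -> String
  , getAD  :: a
  }

-- One record per type, playing the role of "instance Animal Dog/Cat".
dogDict :: AnimalDict Dog
dogDict = AnimalDict { speakD = \_ -> "Woof", getAD = Dog }

catDict :: AnimalDict Cat
catDict = AnimalDict { speakD = \_ -> "Meow", getAD = Cat }

-- Case (2): this function does not know the concrete type. It just uses
-- whatever record its caller hands over.
describe :: AnimalDict a -> String
describe dict = speakD dict (getAD dict)

main :: IO ()
main = do
  putStrLn (describe dogDict)  -- case (1) happens here: the type is known
  putStrLn (describe catDict)
```

In this picture, a constraint like Animal a => plays the role of the AnimalDict a argument, except that the compiler fills it in for you once the type is known.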
Now, if what you really wanted to do was to create a system where there can be a limited number of animals (i.e. just Cat and Dog), and the getA' function would return one of them, depending on reasons, then what you're looking for is not a type class, but just an ADT, like this:
data Animal = Cat | Dog
speak :: Animal -> String
speak Cat = "Meow"
speak Dog = "Woof"
getA' :: Animal
getA' = if True then Dog else Cat
Here, the function getA' will work just fine, because both Cat and Dog are values of the same type Animal. All types are always known, there is nothing generic.
Q: Ok, but this way, if I want to add Giraffe, I can't do it later, in another module, I have to modify the Animal type. Can't I have it both ways?
Short answer: no. This is a well-known problem, called "The Expression Problem", and the basic idea is that you can either have everything known upfront ("closed world"), or you get to add more things later ("open world"), but you can't have both at the same time. Duh!
But in Haskell, you still sorta can. But not really. This is a bit more advanced, so please ignore if it seems confusing.
What you can do is add another type, which will contain an animal value plus its Animal instance. Both wrapped up in a box. It looks like this:
{-# LANGUAGE GADTs #-}  -- goes at the very top of the file; enables this syntax

data SomeAnimal where
  SomeAnimal :: Animal a => a -> SomeAnimal
Then you can construct values of this type by wrapping Cat or Dog:
aCat :: SomeAnimal
aCat = SomeAnimal Cat
aDog :: SomeAnimal
aDog = SomeAnimal Dog
Note that both aCat and aDog are of the same type SomeAnimal. This is the key point. They're values of different types wrapped inside the box that looks the same from the outside, and the box also contains their respective Animal instance.
And this means that, if you unbox the box, you get the value and its Animal instance, which in turn means that you get to use the Animal methods. For example:
someSpeak :: SomeAnimal -> String
someSpeak (SomeAnimal a) = speak a
And with this, you can implement your getA' function this way:
getA' :: SomeAnimal
getA' = if True then SomeAnimal Dog else SomeAnimal Cat
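Putting the pieces of the SomeAnimal approach into one file, a self-contained sketch might look like the following. The zoo list is a name I made up for illustration, and Giraffe is written in the same file only to keep the example short; it stands for a type that someone adds later.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

data Dog = Dog
data Cat = Cat

class Animal a where
  speak :: a -> String
  getA :: a

instance Animal Dog where
  speak _ = "Woof"
  getA = Dog

instance Animal Cat where
  speak _ = "Meow"
  getA = Cat

-- The box: a value of some type together with its Animal instance.
data SomeAnimal where
  SomeAnimal :: Animal a => a -> SomeAnimal

-- Unboxing brings the instance back into scope, so speak can be used.
someSpeak :: SomeAnimal -> String
someSpeak (SomeAnimal a) = speak a

-- A later addition: nothing above this line has to change.
data Giraffe = Giraffe

instance Animal Giraffe where
  speak _ = "Huh?"
  getA = Giraffe

-- Hypothetical example value: different types inside, one type outside.
zoo :: [SomeAnimal]
zoo = [SomeAnimal Dog, SomeAnimal Cat, SomeAnimal Giraffe]

main :: IO ()
main = mapM_ (putStrLn . someSpeak) zoo
```

Note that once a value is in the box, all you can do with it is call Animal methods on it. There is no way to ask whether it "really" is a Dog.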
However, you still get "The Expression Problem", because I actually lied a little: it's not about "closed world" vs. "open world", it's about extending the set of operations vs. extending the set of possible values. One will always be easy, and the other hard (read the link for details).
And this applies to this case too:
- if you make Cat and Dog values of the same type, you get to easily add more functions, but if you want to add more animals, you have to find all those functions you already made and modify them. Hard.
- if you make them different types and go the SomeAnimal route to unify them, you get to easily add more animals: just make a type and implement the Animal class. But if you want to add more functions, you have to go through all those animals you already made and add implementations for the new function to each of their Animal instances.
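The two bullets above can be sketched side by side. To fit both approaches in one file I had to rename things: AnimalC and speakC stand for the class version, and legs is a made-up new operation. None of these names come from the code above.

```haskell
module Main where

-- Closed approach: one type, all animals known up front.
data Animal = Cat | Dog

speak :: Animal -> String
speak Cat = "Meow"
speak Dog = "Woof"

-- Easy: a new operation (hypothetical) touches no existing code.
-- Hard: adding a Giraffe constructor means revisiting speak AND legs.
legs :: Animal -> Int
legs Cat = 4
legs Dog = 4

-- Open approach: a class, with each animal as its own type.
class AnimalC a where
  speakC :: a -> String

-- Easy: a new animal is just a new type plus an instance.
-- Hard: a new method in AnimalC means revisiting every existing instance.
data Giraffe = Giraffe

instance AnimalC Giraffe where
  speakC _ = "Huh?"

main :: IO ()
main = do
  putStrLn (speak Dog ++ ", legs: " ++ show (legs Dog))
  putStrLn (speakC Giraffe)
```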