Ok this topic had been discussed a lot of times, but since Haskell evolves lets consider it again to see what we can do in contemporary Haskell (by contemporary I mean GHC 9.0 - 9.2).
Fist let me state the problem by an example. Suppose we have a function which determines a number of bytes required to store a value of a given type. We can have two instances of this function: one for fixed sized data types and other for variable sized. For example Int32
is fixed sized and always takes 4 bytes to store regardless it's value. But data C = A Int32 | B Int32 Int32
can be considered variable sized since it may take 4 bytes to store in case of A
constructor or 8 bytes in case of B
constructor. It's natural to have two classes for this:
- A class for fixed sized values. Note that value itself not required, we can use
Proxy
as a parameter to determine the size.
class FixedSize a where
fixedSize :: p a -> Int
- A class for variable sized values. The function takes a value to determine the size.
class VariableSize a where
variableSize :: a -> Int
Now lets say we want to define a function which determines the size of a list of values. The values in the list can be either fixed or variable sized. So it's natural to have two functions:
- One for a list of fixed sized values.
listSize :: (FixedSize a) => [a] -> Int
listSize _ = (* fixedSize (Proxy @a) ) . length
- Other for a list of variable sized values.
listSize :: (VariableSize a) => [a] -> Int
listSize = sum . map variableSize
However it is not possible to use a naive approach, the following basically won't compile:
class Size a where
size :: a -> Int
instance (FixedSize a) => Size [a] where
size = _ = (* fixedSize (Proxy @a) ) . length
instance (VariableSize a) => Size [a] where
size = sum . map variableSize
This happens because Haskell relies on type when selecting an instance, but not on the context. There are trick to overcome this limitation described here: https://wiki.haskell.org/GHC/AdvancedOverlap. The basic idea is to define type-level predicate which reflects the context and use it to select an instance using multi-parameter type classes an overlapping instances. In this case Haskell will be able to select more specific instance based on the type parameters. By "more specific" I mean matching type-level predicates.
The proposed approaches can be divided into three groups conditionally.
Use closed type families to define a type-level predicate ("Solution 3" according to the wiki-page). This is not usable approach because it will disallow user to define instances for custom data types. I won't discuss it further.
Define the predicate as a separate type class, define default (fallback) overlappable instance for the predicate ("Solution 1" according to the wiki-page). This is working approach, but it requres from user to maintain additional instances for the predicate.
Use open type families ("Solution 2"). I'd like to discuss slightly modified version of this approach.
class Size a where
size :: a -> Int
class FixedSize a where
type FixedSized a :: Bool
type FixedSized a = 'True
fixedSize :: p a -> Int
#include "MachDeps.h"
instance FixedSize Int where
fixedSize _ = SIZEOF_HSINT
class ListSize (isElemFixed :: Bool) a where
listSize :: p isElemFixed -> a -> Int
instance (ListSize (FixedSized a) [a]) => Size [a] where
size = listSize $ Proxy @(FixedSized a)
instance (FixedSize a) => ListSize 'True [a] where
listSize _ = trace "elem size is fiхed" . (* fixedSize (Proxy @a) ) . length
instance {-# INCOHERENT #-} (Size a) => ListSize any [a] where
listSize _ = trace "elem size is variable" . sum . map size
test1 = size [1::Int,2,3]
test2 = size [[1::Int], [2,3,4]]
This approach seems the most convenient user-wise to me. The separate type-level predicate facility is still required and user can still mess up by defining something like this explicitly:
class FixedSize UserType where
type FixedSized UserType = 'False
but it just works as expected when using defaults.
However, it reqires incoherent instances. And I'm scare of incoherent instances. Because the Hasllkel documentation leterelly says that in case of incoherent instances the compiler is free to choose any instance it wants which looks unpredictable. Now I'll probably doing a bad thing by asking 4 questions in one post but they all related:
Why incoherent instances are needed here exactly? Does not
ListSize 'True [a]
just overlap withListSize any [a]
and could be picked when first paramenter evaluates toTrue
?Is there a way to break this code? I mean, to make a complier to choose
ListSize any [a]
(variable sized elem code) whenFixedSize a
is in scope?Are these instances really incoherent? Probably compiler just can't prove coherence, so how it can be proven manually?
Is there a completely different approach to solve this problem in modern Haskell? By the problem I mean a partial exapmle above, selecting an appropriate function to determine the size of a list of values based on the type of the values in compile time.
CodePudding user response:
This sounds like a use for if-instance.
It allows you to branch on whether a constraint is satisfied, obviously this is not safe since it violates the open-world assumption (see reddit post).
If you claim that your classes are mutually exclusive, then you should be able to safely write
{-# OPTIONS_GHC -fplugin=IfSat.Plugin #-}
import Data.Constraint.If ( type (||) (dispatch) )
instance FixedSize a || VariableSize a => Size [a] where
size :: [a] -> Int
size as = dispatch fixed variable where
fixed :: FixedSize a => Int
fixed = fixedSize (Proxy @a) * length as
variable :: VariableSize a => Int
variable = sum (map variableSize as)
CodePudding user response:
I'll focus on the specific example. Hopefully some of the solutions proposed below will also be applicable to the general case you have in mind.
There is a general theme, though, which is: we avoid having two different classes to choose from.
The plain, basic approach
If the size can be either fixed or variable, we can use a sum type to represent those two cases.
data Scale a = Fixed Int | Variable (a -> Int)
class Size a where
scale :: Scale a
instance Size Int where
scale = Fixed 4
instance Size a => Size [a] where
scale = case scale @a of
Fixed n -> Variable (\xs -> n * length xs)
Variable s -> Variable (\xs -> sum $ map s xs)
size :: forall a. Size a => a -> Int
size x = case scale @a of
Fixed n -> n
Variable s -> s x
Exploit GADTs
We start by enumerating the two cases:
data Sized = Fixed | Variable
-- associated singleton
data SSized (s :: Sized) where
SFixed :: SSized 'Fixed
SVariable :: SSized 'Variable
Then we use a type family to drive the two different types.
type family Scale s a where
Scale Fixed _ = Int
Scale Variable a = a -> Int
class Size a where
type F a :: Sized
which :: SSized (F a)
scale :: Scale (F a) a
Note how the class also provides the singleton. We can then instantiate the class.
instance Size Int where
type F Int = Fixed
which = SFixed
scale = 4
instance Size a => Size [a] where
type F [a] = Variable
which = SVariable
scale xs = case which @a of
SFixed -> scale @a * length xs
SVariable -> sum $ map (scale @a) xs
size :: forall a. Size a => a -> Int
size x = case which @a of
SFixed -> scale @a
SVariable -> scale @a x
Essentially, we collapsed two classes in one: the classes for fixed-sized and variably-sized types are now just one. At the type level, we can distinguish between the two cases using the type family F a
, while at the term level we can we can distinguish between the two cases using the singleton which @a
.
CodePudding user response:
One possibility would be to provide an additional class for the size of containers, together with newtypes to be used with deriving
. Users will still have to write an instance for this new class, but it will be a single, short line of code. Like this:
class ContainerSize a where
containerSize :: Foldable f => f a -> Int
newtype FixedElement a = FixedElement a
instance FixedSize a => ContainerSize (FixedElement a) where
containerSize c = fixedSize @a * length c
newtype VariableElement a = VariableElement { getVE :: a }
instance VariableSize a => ContainerSize (VariableElement a) where
containerSize c = sum (variableSize . getVE <$> c)
instance ContainerSize a => VariableSize [a] where
variableSize = containerSize
Now, when users would previously have written something like
instance FixedSize X where ...
instance VariableSize Y where ...
they now add the lines
deriving instance ContainerSize X via FixedElement X
deriving instance ContainerSize Y via VariableElement Y
(If this feels a bit like a special-case hack for lists, let me encourage you that this is very similar to something in the standard libraries that has worked well for several decades, showList
, so that's some evidence that it can cover quite a lot of ground after all!)