Home > Enterprise >  What good is the open world assumption?
What good is the open world assumption?

Time:04-05

As a (somewhat contrived) example, say I want to use generics to figure out when a type is uninhabited (by a non-bottom value). I can get most of the way there just fine:

class Absurd a where
    absurd :: a -> b
    default absurd :: (Generic a, GAbsurd (Rep a)) => a -> b
    absurd = gabsurd . from
class GAbsurd f where
    gabsurd :: f p -> b
instance GAbsurd f => GAbsurd (M1 i c f) where
    gabsurd (M1 x) = gabsurd x
instance Absurd c => GAbsurd (K1 i c) where
    gabsurd (K1 x) = absurd x
instance GAbsurd V1 where
    gabsurd = \case
instance (GAbsurd a, GAbsurd b) => GAbsurd (a : : b) where
    gabsurd (L1 x) = gabsurd x
    gabsurd (R1 x) = gabsurd x

The issue comes when I need to define an instance for :*:. Only one of GAbsurd a or GAbsurd b is required to implement GAbsurd (a :*: b), however the best I can do is to require both, which is too strong. For example:

data Void
data X = X Char Void

Obviously X is an "absurd" type, but that can't be inferred because although Void is absurd, Char is not and so the constraint (Absurd Char, Absurd Void) can't be solved.

What I would like to do is use some kind of "constraint disjunction" like this:

instance (GAbsurd a || GAbsurd b) => GAbsurd (a :*: b) where
    gabsurd (x :*: y) = case eitherConstraint of
        LeftC  -> gabsurd x
        RightC -> gabsurd y

However, under the open world assumption, that's not possible. In fact, the code in Data.Boring (which also holds Absurd) simply ignores the problem entirely:

There are two reasonable instances for GAbsurd (f :*: g), so we define neither

So this leads me to the question: why is the open world assumption so important? My best guess is that it has something to do with orphan instances, but even then shouldn't GHC have a list of all imported modules?

CodePudding user response:

I'm not sure that the critical problem is orphan instances and modules.

I think the main issue is that it's unexpectedly difficult to maintain coherence of instance definitions and efficiency of compilation in the face of constraint disjunctions. The problem is that the hypothetical eitherConstraint in your imagined syntax can't be simply, locally resolved at compile time. For example, consider the functions:

foo :: (GAbsurd a) => GAbsurd (a :*: b)
foo = gabsurd

bar :: (GAbsurd b) => GAbsurd (a :*: b)
bar = gabsurd

These are both type-correct. In each case, a GAbsurd (a :*: b) instance is demanded, and one can be supplied via the disjuction either by GAbsurd a or GAbsurd b. However, the obvious compile-time, local solution to actually implementing these functions, of using gabsurd (a :*: b) = gabsurd a for the first and gabsurd (a :*: b) = gabsurd b for the second, leads to incoherence. It's not a problem for this specific example, but if these were Ord instances and we were using one polymorphic function to insert objects into a Set and another polymorphic function to look objects up in that Set, we could have a real problem on our hands if they ended up using two different Ord instances for the same type.

In order to maintain coherence, we'd basically have to carry a lot of additional type information around to resolve disjunctions when selecting an instance. And, we might have to choose between passing that information around and evaluating it at runtime or else generating an exponential number of specialized instances for even pretty trivial programs.

  • Related