Home > front end >  Inconsistency in GHCi between representations of values of type String and [Char]
Inconsistency in GHCi between representations of values of type String and [Char]

Time:07-24

Take the following definitions:

p = "abc"
q = ['a', 'b', 'c']

Because strings are lists of characters, the expression p == q will evaluate to True. That's just fine.

When I let GHCi evaluate expression p, it shows value "abc". That is expected, since p is of type String. (In GHCi, the command :type p shows p :: String.) But when I let GHCi evaluate expression q, it also shows value "abc". That is actually unexpected, since q is of type [Char], not String. (In GHCi, the command :type q shows q :: [Char], not q :: String.) So despite the type equality of String and [Char], I would expect q to be shown as a list of characters, not as a string of characters.

As I see it, the expression ['a', 'b', 'c'] :: String should produce the string value "abc", while the expression "abc" :: [Char] should produce the list value ['a', 'b', 'c']. But it doesn't seem to work that way. I consider that to be strange. So how can I actually show both p and q as a list of characters (['a', 'b', 'c']) instead of as a string of characters ("abc")? And if that is not possible, why is that not possible? (If a string is a list of characters, shouldn't I be able to show it as a list of characters as well?)

CodePudding user response:

String and [Char] are exactly the same type. The Show typeclass contains a "trick" to print that type using a different syntax than other lists.

If you need the regular list-like output, you could use a custom Char-like type to bypass that exception. Note that you will need a conversion to that type before printing the result.

{-# LANGUAGE TypeApplications #-}

import Data.Coerce

newtype C = C Char

instance Show C where
   show (C c) = show c

main :: IO ()
main = do 
   print $ map C ['a', 'b', 'c']
      -- output: ['a', 'b', 'c']
   print $ coerce @[Char] @[C] ['a', 'b', 'c']
      -- output: ['a', 'b', 'c']

The map C way is simpler but less efficient. The coerce way uses safe coercions to make a zero-cost conversion from [Char] to [C].

CodePudding user response:

So how can I actually show both p and q as a list of characters (['a', 'b', 'c']) instead of as a string of characters ("abc")?

You can’t. :type has the ability to make the distinction because it’s part of GHCi, but Show (what determines how the value is presented) is a typeclass. The type alias String can’t have a different implementation of Show from [Char].

CodePudding user response:

String is merely a type alias for [Char]; they are exactly the same type.

In fact, "abc" is just syntactic sugar for ['a', 'b', 'c']1. Both p and q are exactly the same as data structures in memory. It isn't possible for any Haskell function to be passed these values and tell which one was defined with double-quote syntax and which with square-bracket syntax. So it isn't possible for show p to be different from show q.

What perhaps should be more surprising is that GHCi prints different things in response to :type p and :type q. String and [Char] are the same type, so how can they be displayed differently?

However, displaying the types of things is not really a Haskell language feature; at runtime functions can only inspect the values of their arguments, not their types. Printing of types only happens "outside" the Haskell language proper: in error messages from the compiler, and in diagnostic messages in the interpreter like :type (if it were possible for :type to be an ordinary Haskell function it would be, and there wouldn't be a special interpreter command starting with a colon). So GHC (and GHCi) goes to some lengths to remember which way equivalent types were written by the user, and use the same (or similar) way of writing types in its output messages. This is purely an attempt to provide more helpful output; it isn't saying there is anything different about the two types, it's just trying to speak to you using the same terminology you used to speak to it.

For example, you can see this happening almost exactly the same way with a custom type alias you define:

Prelude> type Foo = Int
Prelude> x = 1 :: Int
Prelude> x
1
Prelude> :t x
x :: Foo
Prelude> y = 1 :: Int
Prelude> y
1
Prelude> :t y
y :: Int

1 Actually suare-bracket syntax for lists is also just syntactic sugar for the use of the data constructors of the list ADT: namely : and []. So both "abc" and ['a', 'b', 'c'] are just alternate ways of writing the same runtime value: 'a' : 'b' : 'c' : []

  • Related