Clarifying numeric literal definition in Haskell

Time:10-04

The Haskell 2010 Report gives the following definition:

https://www.haskell.org/onlinereport/haskell2010/haskellch6.html#x13-1360006.4.1

6.4.1 Numeric Literals

The syntax of numeric literals is given in Section 2.5. An integer literal represents the application of the function fromInteger to the appropriate value of type Integer. Similarly, a floating literal stands for an application of fromRational to a value of type Rational (that is, Ratio Integer). Given the typings:

fromInteger :: (Num a) => Integer -> a

fromRational :: (Fractional a) => Rational -> a

integer and floating literals have the typings (Num a) => a and (Fractional a) => a, respectively.

Numeric literals are defined in this indirect way so that they may be interpreted as values of any appropriate numeric type. See Section 4.3.4 for a discussion of overloading ambiguity.

I fully understand the intent and purpose of the specification. However, out of curiosity, I would like to know where and how the "represents" and "stands for" are implemented. That is, where and how is it implemented that a literal "represents" or "stands for" an application of fromInteger or fromRational? Is it somewhere deep in GHC, or is it somewhere we can see it fairly easily?

CodePudding user response:

I think you’re reading too much into the precise wording of this definition. All it means is that an integer literal like 1 is really interpreted as fromInteger (1 :: Integer); similarly a floating literal like 0.5 is really interpreted as fromRational (0.5 :: Rational). This is the mechanism which allows numeric literals to be polymorphic in Haskell.
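One way to observe this is with a type whose fromInteger leaves a visible trace. The following is a minimal sketch; the Traced type, its instance, and the helper describe are hypothetical names made up for this illustration and are not part of any library.

```haskell
-- Hypothetical type for illustration: it remembers how it was constructed.
newtype Traced = Traced String

-- Only fromInteger matters here; the other methods just combine descriptions.
instance Num Traced where
  fromInteger n       = Traced ("fromInteger " ++ show n)
  Traced a + Traced b = Traced ("(" ++ a ++ " + " ++ b ++ ")")
  Traced a * Traced b = Traced ("(" ++ a ++ " * " ++ b ++ ")")
  abs (Traced a)      = Traced ("abs " ++ a)
  signum (Traced a)   = Traced ("signum " ++ a)
  negate (Traced a)   = Traced ("negate " ++ a)

-- Extract the recorded description.
describe :: Traced -> String
describe (Traced s) = s

-- A bare literal at type Traced goes through the instance's fromInteger.
literalOne :: Traced
literalOne = 1

-- The explicit form described by the Report.
explicitOne :: Traced
explicitOne = fromInteger (1 :: Integer)

-- The same literal syntax works at an ordinary Fractional type.
half :: Double
half = 0.5
```

Loaded in GHCi, describe literalOne and describe explicitOne both evaluate to "fromInteger 1", which is what the Report's wording predicts for a user-defined Num type.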

CodePudding user response:

For GHC, it's actually kind of complicated. It's definitely true that this happens deep in the GHC compiler code, and not because of some definition that you'll find in one of the base modules, so if that's all you wanted to know, there's your answer. Here are some more details...

Short answer: GHC notes that 2 might need to be computed via fromInteger very early on, in the "renaming" phase right after parsing. However, the representation of 2 in the abstract syntax tree output by the renamer is a clearly identifiable "literal" node. Subsequently, in the "type checking" phase, a potential replacement expression may be added to that "literal" node, and this replacement might be fromInteger (2 :: Integer), but it might also be a more direct coding of the literal in its final type. In any event, the node remains a clearly identifiable "literal" node. The actual translation of the literal node is only finalized in the "desugaring" phase, when the AST is converted to Core, and the desugarer either uses the advisory expression produced by the type checker or completely ignores it in favor of resolving the literal directly. In summary, 2 is never unconditionally "defined" as fromInteger (2 :: Integer) but is eventually desugared either to that expression or to an appropriate alternative via a complex interplay between the renamer, type checker, and desugarer.

Long answer: If you compile the following program:

module NumericLiteral where
double :: Double -> Double
double x = 2 * x

with ghc -ddump-parsed-ast -fforce-recomp Double.hs to dump the compiler's abstract syntax tree after parsing but before renaming and search through the output, you'll find an HsOverLit (for "overloaded literal") node. With the extra annotations stripped out, it looks like:

HsOverLit (OverLit (HsIntegral 2) (HsLit (HsString "noExpr")))

The last field is the "witness" for the overloaded literal, and the value here is a placeholder. If you then dump the AST after renaming with ghc -ddump-rn-ast -fforce-recomp Double.hs, you'll find the updated node now has the form:

HsOverLit (OverLit (HsIntegral 2)
                   (HsVar (Name "fromInteger")))

So, the function fromInteger is introduced very early on, but the representation does not take the same form as a function call fromInteger (2 :: Integer). That would be an HsApp function application node. The primary reason the renamer is involved in this process is in order to support the RebindableSyntax extension which requires that fromInteger be resolved to the "right name" from the very beginning.
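To illustrate why name resolution matters here, the following is a minimal sketch of RebindableSyntax in action. The Label type and unLabel helper are hypothetical, and the local fromInteger is an ordinary top-level function rather than a Num method.

```haskell
{-# LANGUAGE RebindableSyntax #-}
-- RebindableSyntax implies NoImplicitPrelude, so the Prelude is imported
-- explicitly, minus the fromInteger that is replaced below.
import Prelude hiding (fromInteger)

-- Hypothetical type for illustration only.
newtype Label = Label String

-- Not a Num method: just a top-level function that happens to be named
-- fromInteger. With RebindableSyntax, integer literals use whatever
-- fromInteger is in scope.
fromInteger :: Integer -> Label
fromInteger n = Label ("literal " ++ show n)

-- Extract the wrapped string.
unLabel :: Label -> String
unLabel (Label s) = s

-- The literal 2 resolves to the local fromInteger defined above.
two :: Label
two = 2
```

Here unLabel two evaluates to "literal 2" even though Label has no Num instance, because the literal is tied to the fromInteger name in scope rather than to the class method.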

Later during type checking, the node is further elaborated. For the program above, if you dump the type checker output with ghc -ddump-tc-ast -fforce-recomp Double.hs, you'll get something like the following (with 8.2.2; newer versions of GHC produce a more complicated AST with multiple HsOverLit nodes):

HsOverLit (OverLit (HsIntegral 2)
                   (HsApp (HsWrap <coercion_evidence> (Var "fromInteger"))
                          (HsLit (HsInteger 2))))

At this point, an application fromInteger (2 :: Integer) is now visible, but it's still part of a "witness" field. So, it would be inaccurate to say that the original literal has been replaced with fromInteger (2 :: Integer) at this stage, but GHC has made a sort of annotation that the literal could be replaced with this expression later.

The type checker doesn't always resolve the node this way, however. If it has enough local information, it sometimes ignores the renamer-supplied witness (fromInteger) entirely and rewrites the node with a different witness expression. For example, the modified program:

module NumericLiteral where
double :: Double -> Double
double x = (2 :: Double) * x

has the same HsOverLit representation after parsing and renaming, but after type checking the node looks like:

HsOverLit (OverLit (HsIntegral 2)
                   (HsApp (HsConLikeOut <boxed double constructor>)
                          (HsLit (HsDoublePrim (FL "2" (:% 2 1))))))

Again, this is still clearly an HsOverLit node and the type checker witness should be considered a "possible replacement" and not an actual replacement.

That's because, regardless of which witness expression the type checker calculates, it's the desugarer that gets final say on the translation of the node. The function dsOverLit in compiler/GHC/HsToCore/Match/Literal.hs is responsible:

dsOverLit :: HsOverLit GhcTc -> DsM CoreExpr
dsOverLit (OverLit { ol_val = val, ol_ext = OverLitTc rebindable ty
                   , ol_witness = witness }) = do
  dflags <- getDynFlags
  let platform = targetPlatform dflags
  case shortCutLit platform val ty of
    Just expr | not rebindable -> dsExpr expr        -- Note [Literal short cut]
    _                          -> dsExpr witness

Note that it first applies shortCutLit from the whimsically named compiler/GHC/Tc/Utils/Zonk.hs module:

shortCutLit :: Platform -> OverLitVal -> TcType -> Maybe (HsExpr GhcTcId)
shortCutLit platform (HsIntegral int@(IL src neg i)) ty
  | isIntTy ty  && platformInIntRange  platform i = Just (HsLit noExtField (HsInt noExtField int))
  | isWordTy ty && platformInWordRange platform i = Just (mkLit wordDataCon (HsWordPrim src i))
  | isIntegerTy ty = Just (HsLit noExtField (HsInteger src i ty))
  | otherwise = shortCutLit platform (HsFractional (integralFractionalLit neg i)) ty

-- plus more cases for `HsFractional` and `HsIsString` literals

If shortCutLit identifies a special case, the witness produced by the renamer and type checker is ignored, and the literal is directly desugared to a literal in Core. It's only when shortCutLit fails that the desugarer uses the type checker's witness in the desugared output.
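The decision logic can be modeled in a few lines. The following is a toy sketch, not GHC code: every type and function name in it (Ty, Core, OverLit, shortCut, desugar) is a hypothetical stand-in, and it omits details such as the platform range checks for Int and Word.

```haskell
-- Toy stand-ins for GHC's internal types; every name here is hypothetical.
data Ty = IntTy | IntegerTy | DoubleTy | OtherTy String
  deriving (Eq, Show)

-- What the toy desugarer can emit.
data Core
  = CoreLit Ty Integer      -- a literal emitted directly at a known type
  | CoreApp String Integer  -- an application such as fromInteger 2
  deriving (Eq, Show)

-- A type-checked overloaded literal: value, resolved type, rebindable flag,
-- and the witness expression chosen by the type checker.
data OverLit = OverLit
  { olValue      :: Integer
  , olType       :: Ty
  , olRebindable :: Bool
  , olWitness    :: Core
  }

-- Analogue of shortCutLit: succeeds only for a few well-known types.
shortCut :: Integer -> Ty -> Maybe Core
shortCut i ty
  | ty `elem` [IntTy, IntegerTy, DoubleTy] = Just (CoreLit ty i)
  | otherwise                              = Nothing

-- Analogue of dsOverLit: prefer the short cut unless syntax is rebound,
-- and fall back to the type checker's witness otherwise.
desugar :: OverLit -> Core
desugar lit =
  case shortCut (olValue lit) (olType lit) of
    Just core | not (olRebindable lit) -> core
    _                                  -> olWitness lit
```

In this model, a literal 2 at DoubleTy with the rebindable flag off desugars to CoreLit DoubleTy 2 and its witness is discarded, while the same literal at an unrecognized type, or with the rebindable flag on, desugars to its witness.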

So, to sum up, the renamer identifies a (possibly rebound) fromInteger function that might be relevant to desugaring 2. The type checker selects a "witness" expression, either fromInteger (2 :: Integer) or some more direct translation if enough local information is available. The desugarer, which has more type information available, then makes the final decision about the translation of the still clearly identifiable literal 2 node: it either uses the type checker's witness (fromInteger (2 :: Integer) or something more direct) or ignores that witness and performs its own direct translation.
