Home > Software design >  Parser for recursive expressions hangs in ghci
Parser for recursive expressions hangs in ghci

Time:12-02

I am trying to make a parser for the following recursive datatype:

data Expr = Val Int
          | Var Char
          | App Op Expr Expr
  deriving Show

data Op = Add | Sub | Mul | Div
  deriving Show

It should, for example, parse "(1 (a / -2))" as App Add (Val 1) (App Div (Var 'a') (Val (-2))). I've managed to write parsers for the Val and Var constructors as well as for Op's constructors like so:

import Text.Regex.Applicative
import Data.Char

rNonnegativeIntegral :: (Read a, Integral a) => RE Char a
rNonnegativeIntegral = read <$> some (psym isDigit)

rNegativeIntegral :: (Read a, Integral a) => RE Char a
rNegativeIntegral = negate <$> (sym '-' *> rNonnegativeIntegral)

rIntegral :: (Read a, Integral a) => RE Char a
rIntegral = rNonnegativeIntegral <|> rNegativeIntegral

rVal :: RE Char Expr
rVal = Val <$> rIntegral

rVar :: RE Char Expr
rVar = Var <$> psym isAlpha

rOp = aux <$> (foldr1 (<|>) $ map sym " -*/")
  where
    aux ' ' = Add
    aux '-' = Sub
    aux '*' = Mul
    aux '/' = Div

When this is loaded into ghci it can produce the following output:

ghci> findLongestPrefix rVal "-271"
Just (Val (-271), "")
ghci> findLongestPrefix rVar "a"
Just (Var 'a', "")
ghci> findLongestPrefix rOp "-"
Just (Sub, "")

The trouble comes when I introduce this recursive definition for the App constructor:

whiteSpace :: RE Char String
whiteSpace = many $ psym isSpace

strictWhiteSpace :: RE Char String
strictWhiteSpace = some $ psym isSpace

rApp :: RE Char Expr
-- flip App :: Expr -> Op -> Expr
-- strictWhiteSpace after rOp to avoid conflict with rNegativeInteger
rApp = flip App <$> (sym '(' *> whiteSpace *> rExpr)
               <*> (whiteSpace *> rOp <* strictWhiteSpace)
               <*> (rExpr <* whiteSpace <* sym ')')

rExpr :: RE Char Expr
rExpr = rVal <|> rVar <|> rApp

This loads into ghci just fine, and all previous constructors still work. But findLongestPrefix rApp "(1 a)" and many similar expressions cause ghci to hang and produce no output.

Through experimentation I've found that the issue happens in general when rExpr is passed in as the first argument to <*. For example, findLongestPrefix (rExpr <* whiteSpace) "a)" also causes ghci to hang.

Also, when the definition for rExpr is replaced by

rExpr = rVal <|> rVar

all of these hanging issues go away. Simple expressions like "(1 a)" are able to be parsed, but support for recursive expressions is not available.

How can I implement a recursive parser here without hanging issues?

CodePudding user response:

The language of expressions that you describe isn't regular. So you'll have to use a different library.

Luckily, essentially the same parser structure should work fine with most other parser combinator libraries. It should be as simple as substituting your new library's name for a few basic parsers in place of their regex-applicative analogs.

  • Related