Home > Blockchain >  Left recursive arithmetic when creating parser with Haskell
Left recursive arithmetic when creating parser with Haskell

Time:12-17

I am trying to create a parser in Haskell using the readP library that is left associative in it's (arithmetic) output. In the simplified code below I obviously get either an infinite loop in case pOp is called in the left part of an expression (see outcommented code) or I get a right associative output such as 2 (4 (6 8)) the equivalent of:

ghci> parseString "2 4 6 8"
[(Oper Plus (Const (IntVal 2)) (Oper Plus (Const (IntVal 4)) (Oper Plus (Const (IntVal 6)) (Const (IntVal 8)))),"")]

MVE:

import Data.Char

import Text.ParserCombinators.ReadP
--import Text.Parser.Char
import Control.Applicative ((<|>))


type Parser a = ReadP a 

data Value =
  IntVal Int
  deriving (Eq, Show, Read)

data Exp =
    Const Value
  | Oper Op Exp Exp
  deriving (Eq, Show, Read)

data Op = Plus
  deriving (Eq, Show, Read)

space :: Parser Char
space = satisfy isSpace


spaces :: Parser String 
spaces = many space


space1 :: Parser String
space1 = many1 space

symbol :: String -> Parser String
symbol = token . string

token :: Parser a -> Parser a
token combinator = (do spaces
                       combinator)


parseString input = readP_to_S (do                         
                        e <- pExpr 
                        token eof
                        return e) input


pExpr :: Parser Exp 
pExpr = 
    (do
            pv <- pOp
            return pv)
        <|>
    (do         
        pv <- numConst 
        skipSpaces       
        return pv) 


numConst :: Parser Exp
numConst = 
        (do  
            skipSpaces
            y <- munch isDigit            
            return (Const (IntVal (read y)))
        )

pOp :: Parser Exp 
pOp = (do 
        e1 <- numConst -- pExpr 
        skipSpaces          
        op <- symbol " "                           
        e2 <- pExpr
        pv <- pOper op e1 e2 --
        return pv)
        

pOper :: String -> Exp -> Exp -> Parser Exp 
pOper " " exp1 exp2 = (do return (Oper Plus exp1 exp2))

I have tried different strategies such as using look from the above mentioned documentation to look ahead with the intention to then take the string returned and apply a parenthesis around it using "(" e ")" where e is the expression, and then have a seperat function deal with that call on the parenthesized expression in order to avoid a loop. But this isn't a viable solution since you can't use the readP library functions on the resulting value of look the same way you would use it on the original input (of look).

Any ideas how to solve this problem. I don't know if I am jsut getting the grammer (BNF) stated incorrectly to begin with and that I am really just approaching the problem from the wrong angle. But I don't think so.

CodePudding user response:

Due to above comment I was now able to come up with a solution based on a change in grammar.

parseString input = readP_to_S (do                         
                        e <- pExpr 
                        token eof
                        return e) input


pExpr :: Parser Exp 
pExpr = 
    (do
        pv <- numConst
        pv2 <- pOpRight pv               
        return pv2
        )
        <|>
    (do         
        pv <- numConst 
        skipSpaces       
        return pv)
      
   
numConst :: Parser Exp
numConst = 
        (do  
            skipSpaces
            y <- munch isDigit            
            return (Const (IntVal (read y)))
        )

pOpRight :: Exp -> Parser Exp 
pOpRight e1 = (do                 
        op <- symbol " "
        skipSpaces
        e2 <- pExpr        
        pv <- pOper op e1 e2 --
        return pv)

pOper :: String -> Exp -> Exp -> Parser Exp 
pOper " " exp1 exp2 = (do return (Oper Plus exp1 exp2))

CodePudding user response:

Consider chainl1 :: ReadP a -> ReadP (a -> a -> a) -> ReadP a. Most parser combinator libraries will have chainl and chainr families of functions, for parsing operators in a left- or right-associative way. You could apply it like:

pExpr = chainl1 numConst (Oper Plus <$ symbol " ")

This addresses your current issue, but once you add a second operator you'll have some decisions to make about precedence. If you want and * to have the same precedence (unusual), you can just use this approach but enrich the separator parser. If you want * to have higher precedence, you'll need to break pExpr up into multiple parsers: instead of Expr, parse a Sum of Products, stuff like that. Then you'll need one chainl1 to parse the terms of a Sum, where each term is itself parsed with chainl1 to produce a Product.

  • Related