Home > Net >  Apply parser n times vs on the first n characters (Haskell)
Apply parser n times vs on the first n characters (Haskell)

Time:12-20

I am studying parsers in Haskell following definitions from G. Hutton, E. Meijer - Monadic Parsing in Haskell.

data Parser a = Parser { parseWith :: String -> [(a, String)] }

instance Functor Parser where
    fmap f (Parser p) = Parser $ \s -> [(f a, rest) | (a, rest) <- p s]

instance Applicative Parser where
    pure x = Parser $ \s -> [(x, s)]
    (Parser p1) <*> (Parser p2) = Parser $ \s -> [(f x, r2) | (f, r1) <- p1 s, (x, r2) <- p2 r1]

instance Monad Parser where
    return = pure
    p >>= f = Parser $ \s -> concatMap (\(x, r) -> parseWith (f x) r) $ parseWith p s

instance Alternative Parser where
    empty = failure
    p1 <|> p2 = Parser $ \s ->
        case parseWith p1 s of
        []  -> parseWith p2 s
        res -> res

Essentially I have a (parsed :: a, remaining :: String) context.


As a simple application, I defined the following ADT to parse:

data Arr = Arr Int [Int] -- len [values]

and a parser that can construct Array values from strings, e.g.:

"5|12345" -> Arr 5 [1,2,3,4,5]

First, in order to parse n such Array values (the string input contains n on the first position), e.g.:

"2 3|123 4|9876 2|55" -> [Arr 3 [1,2,3], Arr 4 [9,8,7,6]]

I can do the following:

arrayParse :: Parser Arr
arrayParse = do
    len <- digitParse
    vals <- exactly len digitParse
    return $ Arr len vals

nArraysParse :: Parser [Arr]
nArraysParse = do
    n <- digitParse
    exactly n arrayParse

where exactly n p constructs a new parser by applying p n times.


Next, I want to parse a different scheme. Suppose the first character denotes the length of the sub-string defining the arrays, e.g.:

"9 3|123 4|9876 2|55" -> [Arr 3 [1,2,3], Arr 4 [9,8,7,6]]

Meaning that I have to apply arrayParse on the first n chars (excluding | and whitespace) to get the first 2 arrays:

3|123  -> 4 chars (excluding | and whitespace)
4|9876 -> 5 chars (excluding | and whitespace)

So, it's straightforward to apply a parser n times:

exactly :: Int -> Parser a -> Parser [a]
exactly 0 _ = pure []
exactly n p = do
    v  <- p               -- apply parser p once
    v' <- exactly (n-1) p -- apply parser p n-1 times
    return (v:v')

but how can I express the intent of applying a parser on the first n characters?

My initial approach was something like this:

foo :: Parser [Arr]
foo = do
   n <- digitParse
   substring <- consume n
   -- what to do with substring?
   -- can I apply arrayParse on it?

How should I approach this?

CodePudding user response:

Following @jlwoodwa's advice, I managed to achieve the following:

innerParse :: Parser a -> String -> Parser a
innerParse p s = case parseWith p s of
    [(arr, "")] -> return arr
    _ -> failure

substringParse :: Parser [Arr]
substringParse = do
    n <- digitParse
    substring <- consume n
    innerParse (zeroOrMore arrayParse) substring

which works for my use-case.

  • Related