I am studying parsers in Haskell following definitions from G. Hutton, E. Meijer - Monadic Parsing in Haskell.
data Parser a = Parser { parseWith :: String -> [(a, String)] }
instance Functor Parser where
fmap f (Parser p) = Parser $ \s -> [(f a, rest) | (a, rest) <- p s]
instance Applicative Parser where
pure x = Parser $ \s -> [(x, s)]
(Parser p1) <*> (Parser p2) = Parser $ \s -> [(f x, r2) | (f, r1) <- p1 s, (x, r2) <- p2 r1]
instance Monad Parser where
return = pure
p >>= f = Parser $ \s -> concatMap (\(x, r) -> parseWith (f x) r) $ parseWith p s
instance Alternative Parser where
empty = failure
p1 <|> p2 = Parser $ \s ->
case parseWith p1 s of
[] -> parseWith p2 s
res -> res
Essentially I have a (parsed :: a, remaining :: String)
context.
As a simple application, I defined the following ADT to parse:
data Arr = Arr Int [Int] -- len [values]
and a parser that can construct Array
values from strings, e.g.:
"5|12345" -> Arr 5 [1,2,3,4,5]
First, in order to parse n
such Array
values (the string input contains n
on the first position), e.g.:
"2 3|123 4|9876 2|55" -> [Arr 3 [1,2,3], Arr 4 [9,8,7,6]]
I can do the following:
arrayParse :: Parser Arr
arrayParse = do
len <- digitParse
vals <- exactly len digitParse
return $ Arr len vals
nArraysParse :: Parser [Arr]
nArraysParse = do
n <- digitParse
exactly n arrayParse
where exactly n p
constructs a new parser by applying p
n
times.
Next, I want to parse a different scheme. Suppose the first character denotes the length of the sub-string defining the arrays, e.g.:
"9 3|123 4|9876 2|55" -> [Arr 3 [1,2,3], Arr 4 [9,8,7,6]]
Meaning that I have to apply arrayParse
on the first n
chars (excluding |
and whitespace) to get the first 2 arrays:
3|123 -> 4 chars (excluding | and whitespace)
4|9876 -> 5 chars (excluding | and whitespace)
So, it's straightforward to apply a parser n
times:
exactly :: Int -> Parser a -> Parser [a]
exactly 0 _ = pure []
exactly n p = do
v <- p -- apply parser p once
v' <- exactly (n-1) p -- apply parser p n-1 times
return (v:v')
but how can I express the intent of applying a parser on the first n
characters?
My initial approach was something like this:
foo :: Parser [Arr]
foo = do
n <- digitParse
substring <- consume n
-- what to do with substring?
-- can I apply arrayParse on it?
How should I approach this?
CodePudding user response:
Following @jlwoodwa's advice, I managed to achieve the following:
innerParse :: Parser a -> String -> Parser a
innerParse p s = case parseWith p s of
[(arr, "")] -> return arr
_ -> failure
substringParse :: Parser [Arr]
substringParse = do
n <- digitParse
substring <- consume n
innerParse (zeroOrMore arrayParse) substring
which works for my use-case.