How to parse a character range into a tuple

Time:01-07

I want to parse strings like "0-9" into ('0', '9') but I think my two attempts look a bit clumsy.

numRange :: Parser (Char, Char)
numRange = (,) <$> digitChar <* char '-' <*> digitChar

numRange' :: Parser (Char, Char)
numRange' = liftM2 (,) (digitChar <* char '-') digitChar

I kind of expected that there already is an operator that sequences two parsers and returns both results in a tuple. If there is, I can't find it. I'm also having a hard time figuring out the desired signature in order to search on Hoogle.

I tried Applicative f => f a -> f b -> f (a, b) based off the signature of <* but that only gives unrelated results.

CodePudding user response:

The applicative form:

numRange = (,) <$> digitChar <* char '-' <*> digitChar

is standard. Anyone familiar with monadic parsers will immediately understand what this does.

The disadvantage of the liftM2 (or equivalently liftA2) form, or of a function with signature:

pair :: Applicative f => f a -> f b -> f (a, b)
pair = liftA2 (,)

is that the resulting parser expressions:

pair (digitChar <* char '-') digitChar
pair digitChar (char '-' *> digitChar)

obscure the fact that the char '-' syntax is not actually part of either digit parser. As a result, I think this is more likely to be confusing than the admittedly ugly applicative syntax.
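To compare the three spellings side by side, here is a minimal sketch. It assumes ReadP from base as a stand-in for the question's Parser type (which looks like Megaparsec's), so digitChar is defined locally and runRange is a hypothetical helper that is not part of any library.

```haskell
import Control.Applicative (liftA2)
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, char, eof, readP_to_S, satisfy)

-- Assumption: ReadP has no digitChar, so this stand-in mimics the one
-- used in the question.
digitChar :: ReadP Char
digitChar = satisfy isDigit

-- The applicative form from the question.
numRange :: ReadP (Char, Char)
numRange = (,) <$> digitChar <* char '-' <*> digitChar

-- The pairing helper discussed above.
pair :: Applicative f => f a -> f b -> f (a, b)
pair = liftA2 (,)

-- Both groupings parse the same input; they differ only in which digit
-- parser the separator is attached to.
numRangeL, numRangeR :: ReadP (Char, Char)
numRangeL = pair (digitChar <* char '-') digitChar
numRangeR = pair digitChar (char '-' *> digitChar)

-- Hypothetical runner: succeeds only if the whole input is consumed.
runRange :: ReadP a -> String -> Maybe a
runRange p s = case readP_to_S (p <* eof) s of
  [(x, "")] -> Just x
  _         -> Nothing
```

With these definitions, runRange numRange "0-9" evaluates to Just ('0','9'), and numRangeL and numRangeR give the same result.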

CodePudding user response:

I kind of expected that there already is an operator that sequences two parsers and returns both results in a tuple.

There is; it's liftA2 (,), as you noticed. However, you aren't sequencing two parsers; you are sequencing three. Even though you can treat this as a "metasequence" of two two-parser sequencing operations, those two operations are different:

  1. In digitChar <* char '-', you ignore the result of the second parser (and in my opinion, <* always looks like a typo for <*>).
  2. In ... <*> digitChar, you use both results.

If you don't like using the applicative operators directly, consider using do syntax along with the ApplicativeDo extension and write

numRange :: Parser (Char, Char)
numRange = do
    x <- digitChar
    char '-'
    y <- digitChar
    return (x,y)

It's longer, but it's arguably more readable than either of the two versions using <*.
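A self-contained sketch of the do-notation version follows. It again assumes ReadP from base in place of the question's Parser, with a locally defined digitChar and a hypothetical runRange helper. Since ReadP is also a Monad, the block should compile without the pragma as well; ApplicativeDo only changes how the do block is desugared.

```haskell
{-# LANGUAGE ApplicativeDo #-}

import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, char, eof, readP_to_S, satisfy)

-- Assumption: stand-in for the question's digitChar.
digitChar :: ReadP Char
digitChar = satisfy isDigit

-- Three parsers in sequence; the separator's result is simply not bound.
numRange :: ReadP (Char, Char)
numRange = do
  x <- digitChar
  _ <- char '-'
  y <- digitChar
  pure (x, y)

-- Hypothetical runner: succeeds only if the whole input is consumed.
runRange :: ReadP a -> String -> Maybe a
runRange p s = case readP_to_S (p <* eof) s of
  [(x, "")] -> Just x
  _         -> Nothing
```

Here runRange numRange "0-9" evaluates to Just ('0','9'), the same result as the applicative one-liner.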
