Home > Enterprise >  Haskell parsing with parsec
Haskell parsing with parsec

Time:05-05

I'm fairly new to Haskell programming. I am currently trying to familiarize myself with some source code for parsing with parsec.

Two things struck me:

type Parser = String -> Tree

According to the author, this is supposed to be a declaration of a function? That should actually look like this:

Parser :: String -> Tree right?

Another ambiguity is in the following code:

item :: Parser Char
item = P (\inp -> case inp of......

From my understanding this is a type declaration of a function right?

Shouldn't this be

item :: Parser -> Char

Thanks for helping me

Best regards Christoph

CodePudding user response:

type Parser = String -> Tree

According to the author, this is supposed to be a declaration of a function? That should actually look like this:

More precisely, Parser is a type alias for the type String -> Tree which is a function which takes a String and returns a Tree.

So Parser is a new type name.

On the other hand, compare that to

Parser :: String -> Tree

This declares a function with the name Parser that has the type String -> Tree. So in one the first case we declare a new type and in the second case we declare a name with a particular type.

As mentioned in the comment, this should be

parser :: String -> Tree

because function names should start with lowercase and type names start with uppercase.

CodePudding user response:

I assume you are looking at this slide deck. What the author actually says is:

In a functional language such as Haskell, parsers can naturally be viewed as functions.

-- A parser is a function that takes a string 
-- and returns some form of a tree
type Parser = String -> Tree

What the author means by this is that parsers are functions, and the type of such a parser function might be the type Parser, which the author has defined as a type alias for the type String -> Tree.

That is, after defining the type alias Parser using:

type Parser = String -> Tree

an actual parser could be declared to have type Parser (equivalent to type String -> Tree) like so:

-- here is a very bad parser capable of parsing only one possible program
myTreeParser :: Parser
myTreeParser "2*3 4" = Plus (Times (Num 2) (Num 3)) (Num 4)
myTreeParser _ = error "Parser only supports the program '2*3 4'"

Consider the similar statement: "In a language supporting algebraic data types such as Haskell, points can natually be represented as pairs of floating point numbers: type Point = (Double, Double). The intention is not to define such a point but rather to describe the type of points by means of a type alias Point that is defined to be equivalent to the type (Double, Double) of pairs of floating point numbers.

The author goes on to refine the Parser type, pointing out that (1) parsers may not use all of their input and so should return (Tree, String) and not just Tree, (2) they may be able to parse their input in multiple ways (most importantly in "zero" ways, if the input cannot be parsed) and so should return a list of possible parse results [(Tree, String)], and (3) may need to parse something other than a Tree and so ought to be generalized to parse input into an arbitrary type:

type Parser a = String -> [(a, String)]

This defines a type alias Parser that is parameterized, meaning that Parser itself is not a complete type, but rather Parser Tree or Parser Char or Parser Double is a complete type, representing a parser that parses a Tree, Char, or Double.

So the declaration:

item :: Parser Char

is correct as written. Using the final version of this Parser type alias:

type Parser a = String -> [(a, String)]

the type Parser Char is equivalent to the type:

String -> [(Char, String)]

That is, item's type is that of a parser that accepts an input String and attempts to parse the beginning of the input to generate a Char value, plus the remainder of the String after the portion that was parsed. Ultimately, it returns zero or more such possible parses. (This particular definition of item returns either an empty list of zero parses, indicating no parse is possible, or a singleton list of precisely one parse, indicating the parse was successful.)

  • Related