Using the Megaparsec parse function, I'm able to run a parser and get a ParseErrorBundle if it fails.
I know that I can pretty-print the ParseErrorBundle using errorBundlePretty, and get an error message for the entire parse failure, which includes the line and column numbers.
I also know that I can get a list of ParseErrors from a ParseErrorBundle using bundleErrors, and that I can pretty-print these with either parseErrorPretty or parseErrorTextPretty.
I want to be able to run a parser and, if it fails, get a list of (SourcePos, Text), so that I know both the individual error messages and the location of each error.
I can't figure out an elegant way to do this. While I could in theory crib fairly heavily from the source code of errorBundlePretty, I feel like folding over the errors and using reachOffset to advance the PosState can't be the easiest way to go about this.
CodePudding user response:
Note that, if you're using megaparsec >= 7.0.0, I think you're supposed to use attachSourcePos for the traversal. It returns a NonEmpty of (ParseError, SourcePos) pairs (along with the final PosState, which is discarded here). I think it would look like:
import qualified Text.Megaparsec as MP
import qualified Data.Text as T
import Data.List.NonEmpty (NonEmpty (..))
import Data.Void (Void)

annotateErrorBundle :: MP.ParseErrorBundle T.Text Void -> NonEmpty (MP.SourcePos, T.Text)
annotateErrorBundle bundle
  = fmap (\(err, pos) -> (pos, T.pack . MP.parseErrorTextPretty $ err)) . fst $
    MP.attachSourcePos MP.errorOffset
      (MP.bundleErrors bundle)
      (MP.bundlePosState bundle)
Note that unlike your proposed answer, attachSourcePos threads the PosState properly through the traversal of the error bundle, rather than throwing the updated state away after every reachOffset call. As a result, I believe it will be more efficient for a large number of errors. (It also uses reachOffsetNoLine instead of reachOffset, which may be more efficient for certain stream types.)
If you're using megaparsec < 7.0.0, you might want to try to adapt the source for attachSourcePos from later versions.
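To see what the function above produces, here is a minimal usage sketch. The toy grammar (demoParser) and the helper demoErrors are hypothetical names made up for illustration, and the example assumes a reasonably recent megaparsec with the OverloadedStrings extension enabled. annotateErrorBundle is repeated so the block is self-contained.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.List.NonEmpty (NonEmpty, toList)
import qualified Data.Text as T
import Data.Void (Void)
import qualified Text.Megaparsec as MP
import qualified Text.Megaparsec.Char as MPC

type Parser = MP.Parsec Void T.Text

-- Same function as in the answer, repeated so the example compiles on its own.
annotateErrorBundle :: MP.ParseErrorBundle T.Text Void -> NonEmpty (MP.SourcePos, T.Text)
annotateErrorBundle bundle =
  fmap (\(err, pos) -> (pos, T.pack (MP.parseErrorTextPretty err))) . fst $
    MP.attachSourcePos MP.errorOffset (MP.bundleErrors bundle) (MP.bundlePosState bundle)

-- Hypothetical toy grammar: the literal "foo", a newline, then the literal "baz".
demoParser :: Parser ()
demoParser = () <$ (MP.chunk "foo" *> MPC.newline *> MP.chunk "baz" <* MP.eof)

-- Run the parser and return (line, column, message) for each error,
-- or an empty list if the parse succeeds.
demoErrors :: T.Text -> [(Int, Int, T.Text)]
demoErrors input =
  case MP.parse demoParser "<demo>" input of
    Right () -> []
    Left bundle ->
      [ (MP.unPos (MP.sourceLine pos), MP.unPos (MP.sourceColumn pos), msg)
      | (pos, msg) <- toList (annotateErrorBundle bundle)
      ]
```

For the input "foo\nbar", the failure is on the second line, so demoErrors should report a single error at line 2, column 1. A plain parser like this one only ever yields one error per bundle; you would need error recovery (for example withRecovery or registerParseError) to collect several.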
CodePudding user response:
I was able to get this to work as follows:
import qualified Text.Megaparsec as MP
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.Text as T
import Data.Void (Void)

annotateErrorBundle :: MP.ParseErrorBundle T.Text Void -> NonEmpty (MP.SourcePos, T.Text)
annotateErrorBundle bundle =
  (\e -> (errorSrcPos e, T.pack $ MP.parseErrorTextPretty e)) <$> MP.bundleErrors bundle
  where
    initialPosState = MP.bundlePosState bundle
    -- Recomputes the position from the initial state for every error.
    errorSrcPos e = MP.pstateSourcePos . snd $ MP.reachOffset (MP.errorOffset e) initialPosState
I suspect that this isn't very efficient, because I'm calling reachOffset once per error, each time starting again from the initial PosState. However, in practice, the list of errors probably isn't that large, so I'm not too worried.
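If the repeated work ever did matter, one option would be to carry the PosState forward from one error to the next instead of restarting from the initial state. The following is only a sketch of that idea, not the library's own implementation: the name annotateThreaded is made up, it relies on bundleErrors being sorted by offset, and it assumes a recent megaparsec in which reachOffsetNoLine returns the updated PosState directly (older releases may have a different return type).

```haskell
import Data.List.NonEmpty (NonEmpty)
import qualified Data.Text as T
import Data.Traversable (mapAccumL)
import Data.Void (Void)
import qualified Text.Megaparsec as MP

-- Hypothetical variant: thread the PosState through the errors so each
-- reachOffsetNoLine call resumes where the previous one stopped.
annotateThreaded :: MP.ParseErrorBundle T.Text Void -> NonEmpty (MP.SourcePos, T.Text)
annotateThreaded bundle =
  snd (mapAccumL step (MP.bundlePosState bundle) (MP.bundleErrors bundle))
  where
    -- Advance the state to this error's offset, then read the position off it.
    step pst err =
      let pst' = MP.reachOffsetNoLine (MP.errorOffset err) pst
       in (pst', (MP.pstateSourcePos pst', T.pack (MP.parseErrorTextPretty err)))
```

This should return the same pairs as the version above; the only intended difference is that the input is walked once rather than once per error.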