How to parse polymorphically in Haskell

Time:11-18

I'm trying to parse a folder of CSVs into several different record types, which then get compiled into a single XLSX. I have different datatypes representing the records because, while they all belong to the same category of thing, they have different fields and parsing methods:

data RecordTypeA = RecordTypeA { ... }
data RecordTypeB = RecordTypeB { ... }

instance MyItemClass RecordTypeA where
  ...
instance MyItemClass RecordTypeB where
  ...

-- and loads of instances for parsing, serialising, etc.

I run into two problems when trying to do this:

Firstly, I can't write a function which takes a FilePath as input and returns a list of the records appropriate to that file (I decide based on the file name). I run into the "rigid type variable" error (as in the question "Rigid type variable in Haskell"). I suppose this is a kind of dynamic dispatch, but I don't know how to achieve it in Haskell.

parseRecords :: FilePath -> ExceptT ProgrammeError IO [a]
parseRecords = {- parses based on the file name -}
-- uh oh: "... a is a rigid type variable ..."

Secondly, were I to have that function, I'm not entirely sure how I would represent a list of these polymorphic types, as it would have to be a heterogeneous list whose elements have in common only that they are instances of a certain set of typeclasses. I do not know in advance how many files of each type there will be.

processFiles :: FilePath -> ExceptT ProgrammeError IO (Compiled a)
processFiles = ???

How can I achieve this sort of polymorphic parsing in Haskell?

CodePudding user response:

The most straightforward thing to do is to lower this to the value level, instead of trying to do everything on the type level at compile time. Define a new type which is the sum of all the things you want to be able to parse in this context:

data SomeRecord = TypeA RecordTypeA | TypeB RecordTypeB ...

Then you can do any number of things. For example, define a list of type [Parser SomeRecord] and run each parser from the list on the input, or choose a different list of parsers depending on the filename. Either way, the result of parseRecords can be [SomeRecord]. This is not the most precise type: because every constructor belongs to the same type, the compiler cannot rule out a TypeA record coming back even when the list of parsers you supplied doesn't include the parser for TypeA, so code consuming the result still has to handle that case.
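As a rough sketch of the sum-type idea, the following uses only base. The record fields, the `_a.csv` / `_b.csv` naming convention, and the names `LineParser`, `splitOnComma`, `parserFor` and `describe` are all made up for illustration, since the question elides the real definitions. To stay self-contained, parseRecords here is pure and takes the file contents as an argument, with Either String standing in for ExceptT ProgrammeError IO.

```haskell
import Data.List (isSuffixOf)

-- Hypothetical record types standing in for the elided ones in the question.
data RecordTypeA = RecordTypeA { aName :: String, aCount :: Int }
  deriving (Show, Eq)

data RecordTypeB = RecordTypeB { bLabel :: String }
  deriving (Show, Eq)

-- The sum type from the answer, cut down to two constructors.
data SomeRecord = TypeA RecordTypeA | TypeB RecordTypeB
  deriving (Show, Eq)

-- Assumption: a "parser" is just a function from one CSV line to Either.
type LineParser a = String -> Either String a

-- Naive comma splitting; no quoting or escaping is handled.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
  (field, [])       -> [field]
  (field, _ : rest) -> field : splitOnComma rest

parseA :: LineParser RecordTypeA
parseA line = case splitOnComma line of
  [name, count] | [(n, "")] <- reads count -> Right (RecordTypeA name n)
  _ -> Left ("not a RecordTypeA row: " ++ line)

parseB :: LineParser RecordTypeB
parseB line = case splitOnComma line of
  [label] -> Right (RecordTypeB label)
  _       -> Left ("not a RecordTypeB row: " ++ line)

-- Pick a parser from the file name. Each branch wraps its result in a
-- SomeRecord constructor, so all branches share one return type and no
-- rigid type variable is involved.
parserFor :: FilePath -> Either String (LineParser SomeRecord)
parserFor path
  | "_a.csv" `isSuffixOf` path = Right (fmap TypeA . parseA)
  | "_b.csv" `isSuffixOf` path = Right (fmap TypeB . parseB)
  | otherwise                  = Left ("unrecognised file name: " ++ path)

-- Parse every line of the given contents with the parser chosen for the path.
parseRecords :: FilePath -> String -> Either String [SomeRecord]
parseRecords path contents = do
  parser <- parserFor path
  traverse parser (lines contents)

-- Consumers pattern match to recover the specific record type.
describe :: SomeRecord -> String
describe (TypeA r) = "A:" ++ aName r
describe (TypeB r) = "B:" ++ bLabel r
```

Results from different files all have type [SomeRecord], so they can simply be concatenated into one list before building the spreadsheet, however many files of each kind there are. Reading the file and reporting errors in ExceptT would wrap around this pure core.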
