Preprocessor for Haskell source: is cpp the only option?

Time:09-27

I can see from plenty of Q&As that cpp is the usual preprocessor for Haskell source, but also that it isn't a good fit for the job. What other options are there?

Specifically:

  • Haskell syntax is newline-sensitive and space/indent-sensitive -- unlike C, so cpp just tramples on whitespace;
  • ' in Haskell might surround a character literal, but also might be part of an identifier (in which case it won't be paired) -- but cpp complains if not a char literal;
  • \ gets a trailing space inserted -- which is not a terrible inconvenience, but I'd prefer not.

I'm trying to produce a macro that generates an instance from two parameters: a newtype and its corresponding data constructor. It needs to generate the instance head, the constraints and a method binding, just by slotting the type and constructor into an instance skeleton.

(Probably Template Haskell could do this; but it seems rather a large hammer.)
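
For concreteness, here is a minimal sketch of the kind of expansion such a macro would need to produce. The class (Semigroup) and the names Wrapped and MkWrapped are assumptions chosen for illustration; the question does not say which class or types are involved.

```haskell
module Main where

-- Hypothetical newtype and data constructor: the two macro parameters.
newtype Wrapped a = MkWrapped a
  deriving Show

-- The instance skeleton: head, constraint and one method binding.
-- Only the type name and the constructor name vary between uses.
instance Semigroup a => Semigroup (Wrapped a) where
  MkWrapped x <> MkWrapped y = MkWrapped (x <> y)

main :: IO ()
main = print (MkWrapped "ab" <> MkWrapped "cd")
```

Note that the method binding must stay indented under the instance head, which is exactly the layout a preprocessor has to preserve.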

CodePudding user response:

cpphs seems to be just about enough for my (limited) purposes. I'm adding this answer for the record; an answer suggesting cpphs (and some sensible advice to prefer Template Haskell) was here and then gone.

But there are some gotchas, which meant that at first sight I overlooked how it helped.

Without setting any options, it behaves too much like cpp to be helpful. At least:

  • It doesn't complain about unpaired '. Indeed you can #define dit ' and that will expand happily.
  • More generally, it doesn't complain about any nonsense input: it grimly carries on and produces some sort of output file without warning you about ill-formed macro calls.
  • It doesn't insert space after \.
  • By default, it smashes together multiline macro expansions, so tramples on whitespace just as much.
  • Its tokenisation seems to get easily confused between Haskell and C. Specifically, using C-style comments /* ... */ seems to upset not only those lines, but also a few lines below. (I had a #define I wanted to comment out; I should have used Haskell-style comments {- ... -} -- but then the comment appears in the output.)
  • The calling convention for macros is C style, not Haskell. myMacro(someArg) -- or myMacro (someArg) seems to work; but not myMacro someArg. So to embed a macro call inside a Haskell expression probably needs surrounding the lot in extra parens. Looks like (LISP).
  • A bare macro call on a line by itself myInstance(MyType, MyConstr) would not be valid Haskell. The dear beastie seems to get easily confused, and fails to recognise that's a macro call.
  • I'm nervous about # and ## -- because in cpp they're for stringisation and catenation. I did manage to define (##) = ( ) and it seemed to work; magicHash# identifiers seemed ok; but I didn't try those inside macro expansion.
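
A minimal sketch of the C-style calling convention mentioned in the list above. It assumes GHC's built-in CPP extension rather than cpphs itself, so it illustrates only the call syntax, not cpphs's behaviour; the macro name MK and the type Age are made up for illustration.

```haskell
{-# LANGUAGE CPP #-}
module Main where

-- Hypothetical macro: apply a constructor to an argument.
-- The expansion carries its own parentheses so that it can be
-- dropped into a larger Haskell expression.
#define MK(con, arg) (con (arg))

newtype Age = Age Int
  deriving (Show, Eq)

-- The call uses C syntax: name, then parenthesised arguments
-- separated by commas. Haskell-style juxtaposition is not expanded.
older :: Age -> Age
older (Age n) = MK(Age, n + 1)

main :: IO ()
main = print (older (Age 41))
```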

Remedies

(The docos don't make this at all obvious.)

  • To get multi-line output from a multi-line macro def'n, and preserving spaces/indentation (yay!) needs option --layout. So I have my instance definition validly expanded and indented.
  • If your tokenisation is getting confused, maybe --text will help: this will "treat input as plain text, not Haskell code". Even in this mode it still tolerates ' and \ better than cpp does. (I didn't encounter any downsides from using --text -- the Haskell code seemed to get through unscathed, and the macros expanded.)
  • If you have a C-style comment that you don't want to appear in output, use --strip.
  • There's an option --hashes, which I imagine might interact badly with magicHash#.
  • The output file starts with a header line #line .... The compiler won't like that; suppress with --noline.

CodePudding user response:

I would say that Template Haskell is the best-suited tool for this purpose. It is the standard set of combinators for constructing correct Haskell source code. After that there is GHC.Generics, which might allow you to write a single instance that covers any type which is an instance of Generic.
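
As a rough sketch of the GHC.Generics route, the following defines one generic default that works for any type deriving Generic. The class ConName and the types Age and Shape are hypothetical, invented for this example; a real use would replace the method with whatever the instance is meant to compute.

```haskell
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric     #-}
{-# LANGUAGE FlexibleContexts  #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE TypeOperators     #-}
module Main where

import GHC.Generics

-- Hypothetical class: report the name of the constructor of a value.
class ConName a where
  conNameOf :: a -> String
  -- Generic default: used by any instance that gives no explicit method.
  default conNameOf :: (Generic a, GConName (Rep a)) => a -> String
  conNameOf = gConName . from

-- Worker class over the generic representation.
class GConName f where
  gConName :: f p -> String

-- Datatype metadata wrapper: look inside.
instance GConName f => GConName (D1 c f) where
  gConName (M1 x) = gConName x

-- Sum of constructors: follow whichever branch the value took.
instance (GConName f, GConName g) => GConName (f :+: g) where
  gConName (L1 x) = gConName x
  gConName (R1 x) = gConName x

-- Constructor metadata wrapper: read the name from the metadata.
instance Constructor c => GConName (C1 c f) where
  gConName m = conName m

newtype Age = Age Int deriving Generic
data Shape = Circle Double | Square Double deriving Generic

-- Empty instance bodies: the generic default fills in the method.
instance ConName Age
instance ConName Shape

main :: IO ()
main = do
  putStrLn (conNameOf (Age 3))
  putStrLn (conNameOf (Square 2.0))
```

The per-type cost drops to a deriving clause and a one-line instance declaration, with no preprocessor involved.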
