How to simulate groups/named groups with the parser?


I am using the regex-applicative library and want to extract the file name from a Content-Disposition HTTP header that looks like:

  • attachment; filename="this is file name .ext"
  • attachment;filename=fname.ext
  • and similar...

The function getFile below seems to match such fragments:

{-# LANGUAGE OverloadedStrings #-}  -- lets string literals be used as regexes
import Text.Regex.Applicative

getFile :: String -> Maybe (String, String, String)  -- prefix, RESULT, suffix
getFile hdr = parse
  where
    unquotedName = many $ psym (/= ' ')
    quotedName = "\"" <> many (psym (/= '"')) <> "\""
    name = "filename" <> "=" <> (quotedName <|> unquotedName)
    parse = findFirstInfix name hdr

But how do I extract just the name of the file? With standard regular expressions we can use groups or named groups, as in filename=([^ ]+), so that the name ends up in the first group. How can I do the same with the code above? I tried adding something like:

newtype FN = FN String deriving Show

... (FN <$> many (psym (/= '"'))) ...

but it seems I am doing it wrong.


I came up with the following, but I am not sure whether it is the most convenient way to do it:

-- N marks a part whose result is ignored; FN carries the captured name.
data FN = FN String | N deriving Show
instance Semigroup FN where
  N <> a = a
  a <> _ = a

getFilename1 :: String -> Maybe (String, FN, String)
getFilename1 hdr = parse
  where
    unquotedName = FN <$> (many $ psym (/= ' '))
    quotedName = (N <$ "\"") <> (FN <$> many (psym (/= '"'))) <> (N <$ "\"")
    name = (N <$ ("filename" <> "=")) <> (quotedName <|> unquotedName)
    parse = findFirstInfix name hdr


P.S. Instead of FN, Maybe (First a) could of course be used.
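The Maybe (First a) idea from the P.S. might look roughly like the sketch below. It assumes First is the one from Data.Semigroup; the names getFilenameFirst, ignore and capture are made up for illustration and are not part of regex-applicative. To avoid relying on OverloadedStrings, it uses the library's string and sym instead of string literals.

```haskell
import Data.Semigroup (First (..))
import Text.Regex.Applicative

-- Every sub-regex yields Maybe (First String):
--   Nothing        for parts whose text we do not care about,
--   Just (First s) for the part we want to capture.
-- Combining the pieces with <> keeps the captured value.
getFilenameFirst :: String -> Maybe String
getFilenameFirst hdr =
  case findFirstInfix name hdr of
    Just (_, Just (First fn), _) -> Just fn
    _                            -> Nothing
  where
    -- hypothetical helpers, not library functions
    ignore r  = Nothing <$ r
    capture r = Just . First <$> r

    unquotedName = capture (many (psym (/= ' ')))
    quotedName   = ignore (sym '"')
                <> capture (many (psym (/= '"')))
                <> ignore (sym '"')
    name         = ignore (string "filename=") <> (quotedName <|> unquotedName)
```

This avoids the custom FN type, but every piece of the regex still has to be wrapped by hand.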

Answer:

Use *> and <* instead of <> to drop the results of the irrelevant parts. For multiple groups, you can also use <$> and <*>. Read about parser combinators to learn more about this.

getFilename1 hdr = findFirstInfix name hdr
  where
    unquotedName = many $ psym (/= ' ')
    -- *> keeps the result on its right, <* keeps the result on its left
    quotedName = "\"" *> many (psym (/= '"')) <* "\""
    name = "filename" *> "=" *> (quotedName <|> unquotedName)
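For the "multiple groups" case mentioned above, a minimal sketch could look like this. It assumes the whole header has the shape type;filename=name and captures two "groups" as a pair; the names disposition and parseDisposition are invented for the example, while match, sym, psym, some and many come from Text.Regex.Applicative.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Text.Regex.Applicative

-- Two "groups": the disposition type and the file name.
-- Parts combined with <* are matched but their results are dropped;
-- parts combined with <$> / <*> are fed to the pair constructor.
disposition :: RE Char (String, String)
disposition =
  (,) <$> some (psym (/= ';'))          -- group 1: e.g. "attachment"
      <*  sym ';' <* many (sym ' ')     -- separator, ignored
      <*  "filename" <* "="             -- literal text, ignored
      <*> (quotedName <|> unquotedName) -- group 2: the file name
  where
    unquotedName = some (psym (/= ' '))
    quotedName   = sym '"' *> many (psym (/= '"')) <* sym '"'

-- match succeeds only if the regex matches the whole input string.
parseDisposition :: String -> Maybe (String, String)
parseDisposition = match disposition
```

In GHCi, parseDisposition "attachment;filename=fname.ext" should give Just ("attachment","fname.ext"). Replacing (,) with a record constructor would give something close to named groups.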