I am using the regex-applicative library. I want to extract the file name from a Content-Disposition HTTP header that looks like:
- attachment; filename="this is file name .ext"
- attachment;filename=fname.ext
- and similar...
It seems that the following function getFile matches such fragments:
{-# LANGUAGE OverloadedStrings #-} -- needed so that string literals can be used as regexes
import Text.Regex.Applicative
...
getFile :: String -> Maybe (String, String, String) -- prefix, RESULT, suffix
getFile hdr = parse
  where
    unquotedName = many $ psym (/= ' ')
    quotedName = "\"" <> many (psym (/= '"')) <> "\""
    name = "filename" <> "=" <> (quotedName <|> unquotedName)
    parse = findFirstInfix name hdr
But how do I extract the file name itself? In a standard regexp we can use groups or named groups, like filename=([^ ]+), so the name ends up in the first group. How can I do the same with my code above?
I tried to add something like:
newtype FN = FN String deriving Show
...
... (FN <$> many (psym (/='"')) ...
but it seems I am doing it wrong.
EDIT:
I am not sure whether this is the most convenient way to do it:
data FN = FN String | N deriving Show
instance Semigroup FN where
  N <> a = a
  a <> _ = a
getFilename1 hdr = parse
  where
    unquotedName = FN <$> (many $ psym (/= ' '))
    quotedName = (N <$ "\"") <> (FN <$> many (psym (/= '"'))) <> (N <$ "\"")
    name = (N <$ ("filename" <> "=")) <> (quotedName <|> unquotedName)
    parse = findFirstInfix name hdr
EDIT:
PS. Instead of FN, Maybe (First a) can of course be used.
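A minimal sketch of that Maybe (First a) variant, assuming First is taken from Data.Semigroup and relying on the same <> on regexes used in the code above. The names Capture, ignored, captured, fileNameRe and getFilename2 are hypothetical helpers introduced only for this illustration:

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Semigroup (First (..))
import Text.Regex.Applicative

-- Nothing means "nothing captured here"; Just (First s) holds a captured string.
-- Combining with <> keeps the leftmost captured value.
type Capture = Maybe (First String)

-- Match a regex but contribute no captured value (hypothetical helper).
ignored :: RE Char String -> RE Char Capture
ignored r = Nothing <$ r

-- Match a regex and record what it matched (hypothetical helper).
captured :: RE Char String -> RE Char Capture
captured r = Just . First <$> r

fileNameRe :: RE Char Capture
fileNameRe = ignored ("filename" <> "=") <> (quotedName <|> unquotedName)
  where
    unquotedName = captured (many (psym (/= ' ')))
    quotedName   = ignored "\"" <> captured (many (psym (/= '"'))) <> ignored "\""

-- Unwrap the capture from the (prefix, result, suffix) triple.
getFilename2 :: String -> Maybe String
getFilename2 hdr = do
  (_, capture, _) <- findFirstInfix fileNameRe hdr
  getFirst <$> capture

main :: IO ()
main = do
  print (getFilename2 "attachment; filename=\"this is file name .ext\"")
  print (getFilename2 "attachment;filename=fname.ext")
```

This removes the hand-written Semigroup instance, at the cost of unwrapping two layers (Maybe and First) at the end.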
Answer:
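Use *> and <* instead of <> to drop the results of the irrelevant parts: a *> b matches both and keeps only the result of b, while a <* b keeps only the result of a. For multiple groups, you can also combine the results with <$> and <*>. Reading about parser combinators will explain this in more detail.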
getFilename1 hdr = findFirstInfix name hdr
  where
    unquotedName = many $ psym (/= ' ')
    quotedName = "\"" *> many (psym (/= '"')) <* "\""
    name = "filename" *> "=" *> (quotedName <|> unquotedName)
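As a self-contained sketch of both ideas, the program below unwraps the file name from the triple returned by findFirstInfix and also shows the multiple-groups case with <$> and <*>. The names fileName, getFilename and disposition are hypothetical, and the header layout assumed by disposition (a type, a semicolon, optional spaces, then the filename parameter) is a simplification rather than a full Content-Disposition parser:

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Text.Regex.Applicative

-- One "group": the file name. The literals are matched but their results are dropped.
fileName :: RE Char String
fileName = "filename" *> "=" *> (quotedName <|> unquotedName)
  where
    unquotedName = many (psym (/= ' '))
    quotedName   = "\"" *> many (psym (/= '"')) <* "\""

-- findFirstInfix returns (prefix, result, suffix); keep only the result.
getFilename :: String -> Maybe String
getFilename hdr = (\(_, name, _) -> name) <$> findFirstInfix fileName hdr

-- Two "groups": the disposition type and the file name, paired with <$> and <*>.
-- The separator and the optional spaces are matched with <* and thrown away.
disposition :: RE Char (String, String)
disposition =
  (,) <$> some (psym (\c -> c /= ';' && c /= ' '))
      <*  ";"
      <*  many (sym ' ')
      <*> fileName

main :: IO ()
main = do
  print (getFilename "attachment; filename=\"this is file name .ext\"")
  print (getFilename "attachment;filename=fname.ext")
  -- match requires the whole string to match the regex.
  print (match disposition "attachment; filename=\"report.pdf\"")
```

Because quotedName is the left alternative of <|>, it is preferred whenever the value starts with a double quote, so spaces inside a quoted name are kept.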