Assuming I have a program like so:
let words = ["foo", "bar", "baz", "qux"]
for (var i = 0; i < words.length; i ) {
console.log(words[i])
}
I want to filter the AST and only keep the literals. How can I traverse the AST to keep only "foo", "bar", etc?
literals :: JSAST -> [String]
literals (JSAstProgram statements) = undefined
literals _ = []
The undefined line should filter the [JSStatement]
, and then look for possible expressions.
If an expression is found, check if there's a literal.
Lastly, map the list from a JSStatement
-> String
.
Looking for the possible expressions is the problem for me right now. I could pattern match:
f :: JSStatement -> a
f (JSLet _ expression _) = undefined
f (JSConstant _ expression _) = undefined
f (JsDoWhile _ _ _ _ expression _ _) = undefined
... etc
And then same thing for looking for the literal:
g :: JSExpression -> a
g (JSLiteral _ literal) = undefined
g (JSStringLiteral _ literal) = undefined
g (JSArrayLiteral _ literal _) = undefined
... etc
However, this is very verbose and I feel like there's something better out there. How do I traverse through the AST and filter out only the literals in a clean and concise manner?
CodePudding user response:
All the types in Language.JavaScript.Parser
are instances of Data.Data
, a class for folding generically over data constructors. You can use gmapQ
, a function from this class, to traverse the structure, and cast
to determine when you've found a node of interest. Below is an implementation for a simpler example AST type.
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, gmapQ)
import Data.Typeable (cast)
data Lit = S String | I Int deriving Data
data Expr = Sum Expr Expr | LitExpr Lit deriving Data
data Statement = Stmt [Expr] deriving Data
data AST = AST [Statement] deriving Data
strings :: AST -> [String]
strings = concat . gmapQ go
where go :: Data d => d -> [String]
go x = case cast x of
Nothing -> concat $ gmapQ go x
Just (I _) -> []
Just (S s) -> [s]
λ> strings (AST [Stmt [], Stmt [LitExpr (S "x"), Sum (LitExpr (I 5)) (LitExpr (S "y")), LitExpr (S "z")]])
["x","y","z"]
Note by the way that I needed to give go
an explicit type signature. I think this is because otherwise GHC was inferring that the recursive call to go
must use the same d
constraint as the enclosing call, but of course we need them to differ.
CodePudding user response:
You're right that these sorts of things are usually boilerplate; and Haskell has excellent libraries to let you code such functions easily. One of my favorites is uniplate: https://hackage.haskell.org/package/uniplate
Following @amalloy's example, here's how you'd code the same using Uniplate:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data
import qualified Data.Generics.Uniplate.Data as G
data Lit = S String | I Int deriving Data
data Expr = Sum Expr Expr | LitExpr Lit deriving Data
data Statement = Stmt [Expr] deriving Data
data AST = AST [Statement] deriving Data
strings :: AST -> [String]
strings a = [s | S s <- G.universeBi a]
And then:
*Main> strings (AST [Stmt [], Stmt [LitExpr (S "x"), Sum (LitExpr (I 5)) (LitExpr (S "y")), LitExpr (S "z")]])
["x","y","z"]
Note that the definition of strings is literally a one-liner. So, all you need is one line of import, and one line for extracting all literals using uniplate's universeBi
function, which is literally the expression:
[s | S s <- G.universeBi ast]
which pretty much reads exactly what you intended: Traverse the entire AST, and find me all literal strings appearing anywhere. Doesn't get any more declarative than that!