Home > OS >  How to traverse the AST using the language-javascript module?
How to traverse the AST using the language-javascript module?

Time:11-24

Assuming I have a program like so:

let words = ["foo", "bar", "baz", "qux"]

for (var i = 0; i < words.length; i  ) {
  console.log(words[i])
}

I want to filter the AST and only keep the literals. How can I traverse the AST to keep only "foo", "bar", etc?

literals :: JSAST -> [String]
literals (JSAstProgram statements) = undefined
literals _ = []

The undefined line should filter the [JSStatement], and then look for possible expressions. If an expression is found, check if there's a literal. Lastly, map the list from a JSStatement -> String.

Looking for the possible expressions is the problem for me right now. I could pattern match:

f :: JSStatement -> a
f (JSLet _ expression _) = undefined
f (JSConstant _ expression _) = undefined
f (JsDoWhile _ _ _ _ expression _ _) = undefined
... etc

And then same thing for looking for the literal:

g :: JSExpression -> a
g (JSLiteral _ literal) = undefined
g (JSStringLiteral _ literal) = undefined
g (JSArrayLiteral _ literal _) = undefined
... etc

However, this is very verbose and I feel like there's something better out there. How do I traverse through the AST and filter out only the literals in a clean and concise manner?

CodePudding user response:

All the types in Language.JavaScript.Parser are instances of Data.Data, a class for folding generically over data constructors. You can use gmapQ, a function from this class, to traverse the structure, and cast to determine when you've found a node of interest. Below is an implementation for a simpler example AST type.

{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, gmapQ)
import Data.Typeable (cast)

data Lit = S String | I Int deriving Data
data Expr = Sum Expr Expr | LitExpr Lit deriving Data
data Statement = Stmt [Expr] deriving Data
data AST = AST [Statement] deriving Data

strings :: AST -> [String]
strings = concat . gmapQ go
  where go :: Data d => d -> [String]
        go x = case cast x of
          Nothing -> concat $ gmapQ go x
          Just (I _) -> []
          Just (S s) -> [s]

λ> strings (AST [Stmt [], Stmt [LitExpr (S "x"), Sum (LitExpr (I 5)) (LitExpr (S "y")), LitExpr (S "z")]])
["x","y","z"]

Note by the way that I needed to give go an explicit type signature. I think this is because otherwise GHC was inferring that the recursive call to go must use the same d constraint as the enclosing call, but of course we need them to differ.

CodePudding user response:

You're right that these sorts of things are usually boilerplate; and Haskell has excellent libraries to let you code such functions easily. One of my favorites is uniplate: https://hackage.haskell.org/package/uniplate

Following @amalloy's example, here's how you'd code the same using Uniplate:

{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data
import qualified Data.Generics.Uniplate.Data as G

data Lit       = S String | I Int            deriving Data
data Expr      = Sum Expr Expr | LitExpr Lit deriving Data
data Statement = Stmt [Expr]                 deriving Data
data AST       = AST [Statement]             deriving Data

strings :: AST -> [String]
strings a = [s | S s <- G.universeBi a]

And then:

*Main> strings (AST [Stmt [], Stmt [LitExpr (S "x"), Sum (LitExpr (I 5)) (LitExpr (S "y")), LitExpr (S "z")]])
["x","y","z"]

Note that the definition of strings is literally a one-liner. So, all you need is one line of import, and one line for extracting all literals using uniplate's universeBi function, which is literally the expression:

   [s | S s <- G.universeBi ast]

which pretty much reads exactly what you intended: Traverse the entire AST, and find me all literal strings appearing anywhere. Doesn't get any more declarative than that!

  • Related