Home > Software engineering >  How to handle different nodes by type in a parse tree in TypeScript?
How to handle different nodes by type in a parse tree in TypeScript?

Time:12-19

I have a lexer and parser which I am trying to convert to TypeScript. You can see the full code (JavaScript-only now) here. I tried to boil it down to a simpler example, and I think I might demonstrate the point with this example pseudocode:

type X = {
  type: string
}

type A = X & {
  list: Array<A | B | C>
}

type B = X & {
  value: A | B | number
}

type C = X & {
  value: string
}

const a1: A = { type: 'a', list: [] }
const a2: A = { type: 'a', list: [] }
const a3: A = { type: 'a', list: [] }

const b1: B = { type: 'b', value: a1 }
const b2: B = { type: 'b', value: b1 }
const b3: B = { type: 'b', value: 200 }

const c1: C = { type: 'c', value: 'foo' }
const c2: C = { type: 'c', value: 'bar' }
const c3: C = { type: 'c', value: 'baz' }

a1.list.push(b1, a2, b2, b3)
a2.list.push(a1, a3, b3, c1)
a3.list.push(b2, c2, c3)

const x = { type: 'a', list: [a1, a2, a3] }

handle(x)

function handle(x: A | B) {
  if (x.type === 'a') {
    x.list.forEach(handle)
  } else if (x.type === 'b') {
    if (typeof x.value === 'number') {
      console.log(x.value)
    } else {
      handle(x.value)
    }
  } else { // c
    console.log(x.value)
  }
}

Notice there is no typing used at all in that handle function, I don't know how to do it there....

You'll notice in the parser code (also in there is a lexer in a separate module), there is stuff like this:

while (i < list.length) {
  const token = list[i  ]
  // console.log(token.form, stack)
  switch (token.form) {
    case `term-open`: {
      const node = stack[stack.length - 1]
      const term = {
        form: 'term',
        link: []
      }
      node.leaf.push(term)
      stack.push(term)
      break
    }
    case `term-close`: {
      stack.pop()
      break
    }
    case `open-parenthesis`: {
      const node = stack[stack.length - 1]
      const site = {
        form: 'site',
        leaf: [],
        site: []
      }
      node.site.push(site)
      stack.push(site)
      break
    }
    case `close-parenthesis`: {
      stack.pop()
      break
    }
  }

Key thing being const node = stack[stack.length - 1]. This could be any node depending on the context, similar how in my pseudocode we don't know what node we are going to expect.

How do you handle this sort of situation in TypeScript? I can easily do const node = stack[stack.length - 1] as SomeType, but that is hacking it and doesn't seem safe. Or is that the correct approach, I'm not too sure?

How would you correctly iterate through the pseudocode (TypeScript above) and log the leaf values, so everything is properly typed?

CodePudding user response:

The "correct" way to do this, or at least the way best supported by TypeScript, is to make your base node type X a discriminated union in which the type property has a string literal type that can be used to distinguish the union member from the other ones. For your example code that could look like:

interface A {
    type: 'a'; // <-- literal type
    list: Array<A | B | C>
}

interface B {
    type: 'b'; // <-- literal type
    value: A | B | number
}

interface C {
    type: 'c'; // <-- literal type
    value: string
}

type X = A | B | C; // <-- discriminated union

const x: X = { type: 'a', list: [a1, a2, a3] } // okay

And then your handle() function compiles as-is without error:

function handle(x: X) {
    if (x.type === 'a') {
        x.list.forEach(handle)
    } else if (x.type === 'b') {
        if (typeof x.value === 'number') {
            console.log(x.value)
        } else {
            handle(x.value)
        }
    } else { // c
        console.log(x.value)
    }
}

That all works because your tests like x.type === "a" allow the compiler to narrow the apparent type of x. So in a code block where x.type === "a" is known to be true, then x is known to be of type A.

Playground link to code

  • Related