Issue with recursive function converting dom element to json

Time:12-15

I have a recursive function that takes a DOM tree and converts it to JSON.

However, I want to exclude any nodes that have a specific data attribute, data-exclude.

const htmlToJSON = node => {
  const exclude = node.attributes?.getNamedItem('data-exclude');
  if (!exclude) {
    let obj = {
      nodeType: node.nodeType
    };
    if (node.tagName) {
      obj.tagName = node.tagName.toLowerCase();
    } else if (node.nodeName) {
      obj.nodeName = node.nodeName;
    }
    if (node.nodeValue) {
      obj.nodeValue = node.nodeValue;
    }
    let attrs = node.attributes;
    if (attrs) {
      const length = attrs.length;
      const arr = (obj.attributes = new Array(length));
      for (let i = 0; i < length; i++) {
        const attr = attrs[i];
        arr[i] = [attr.nodeName, attr.nodeValue];
      }
    }

    let childNodes = node.childNodes;
    if (childNodes && childNodes.length) {
      let arr = (obj.childNodes = []);
      for (let i = 0; i < childNodes.length; i++) {
        arr[i] = htmlToJSON(childNodes[i]);
      }
    }
    return obj;
  }
};

const parser = new DOMParser();
const { body } = parser.parseFromString(page, 'text/html');

let jsonOutput = htmlToJSON(body);
console.log(jsonOutput);

I am clearly missing something in the way I am excluding, because when I log the results an excluded node shows up as undefined instead of just being left out.


CodePudding user response:

I did not execute the code. As far as I can see, htmlToJSON returns either obj or undefined. If exclude is truthy, the function returns undefined, and that's what you are seeing.

Change your for loop:

for (let i = 0, temp; i < childNodes.length; i++) {
   temp = htmlToJSON(childNodes[i]);
   temp && arr.push(temp);
}

That way a child is added only if temp is defined. Using push rather than arr[i] = temp avoids leaving an empty slot at the skipped index. Another option is to use Array.prototype.filter on the resulting array.
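The filter option could look roughly like the sketch below. The names convertChildren and convert are invented for illustration, and plain objects stand in for real DOM nodes so the snippet runs outside a browser.

```javascript
// Hypothetical helper: convert every child, then drop the entries
// for which the converter returned undefined (or null).
const convertChildren = (childNodes, convert) =>
  Array.from(childNodes, child => convert(child)).filter(Boolean);

// Stand-in converter that mimics htmlToJSON returning undefined
// for excluded nodes. Real code would pass htmlToJSON itself.
const convert = node =>
  node.excluded ? undefined : { tagName: node.tagName };

const children = [
  { tagName: 'p' },
  { tagName: 'div', excluded: true },
  { tagName: 'span' }
];

const result = convertChildren(children, convert);
console.log(result.length); // 2
console.log(result.map(r => r.tagName)); // [ 'p', 'span' ]
```

Array.from accepts any array-like or iterable value, so the same helper should also work when childNodes is a real NodeList.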

CodePudding user response:

It looks like you check whether the data-exclude attribute exists on the node but never handle the case where it does. When the attribute is present, the code inside the if (!exclude) block is skipped, the function reaches its end without a return statement, and the caller stores the resulting undefined in the child array.

To fix this, you can move the code inside the if (!exclude) block into a separate function and call that function only if the data-exclude attribute does not exist, checking for the attribute in the loop before recursing. This way, the conversion runs only for nodes that do not have the data-exclude attribute, and nothing is stored for the others.
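One way to read this suggestion is to test for the attribute before recursing, so the converter is never called on an excluded node. A minimal sketch follows, assuming a hypothetical isExcluded helper and a simplified toJSON; plain objects stand in for DOM nodes, with attributes reduced to a plain object rather than a NamedNodeMap.

```javascript
// Hypothetical helper: true when the node carries data-exclude.
// Here `attributes` is a plain object; with real DOM nodes you
// would use node.attributes?.getNamedItem('data-exclude') instead.
const isExcluded = node =>
  Boolean(node.attributes && 'data-exclude' in node.attributes);

// Simplified converter: only tagName and childNodes are copied.
const toJSON = node => {
  const obj = { tagName: node.tagName };
  // Decide which children to keep before recursing into them.
  const kept = (node.childNodes || []).filter(child => !isExcluded(child));
  if (kept.length) {
    obj.childNodes = kept.map(child => toJSON(child));
  }
  return obj;
};

const tree = {
  tagName: 'body',
  childNodes: [
    { tagName: 'p' },
    { tagName: 'aside', attributes: { 'data-exclude': '' } }
  ]
};

console.log(JSON.stringify(toJSON(tree)));
// {"tagName":"body","childNodes":[{"tagName":"p"}]}
```

With this arrangement the converter always returns an object, so the caller has no undefined value to guard against.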

CodePudding user response:

It's most likely because you're not returning anything from htmlToJSON when exclude is truthy. Notice that your arrow function has no return statement in that case, so it returns undefined by default.

And when you assign that undefined to arr[i], the array still gets an element at that index, so console.log() prints undefined in place of each excluded node when it shows the contents of the array.
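The difference between an element set to undefined and a genuinely empty slot can be checked directly. A small sketch, with illustrative variable names:

```javascript
// An element explicitly set to undefined still occupies its index.
const assigned = [];
assigned[0] = 'a';
assigned[1] = undefined;
assigned[2] = 'c';

// Skipping an index leaves a hole, which makes the array sparse.
const skipped = [];
skipped[0] = 'a';
skipped[2] = 'c';

console.log(1 in assigned); // true
console.log(1 in skipped);  // false
console.log(assigned.length, skipped.length); // 3 3

// JSON.stringify writes null for both, so neither form removes
// the excluded node from serialized output.
console.log(JSON.stringify(assigned)); // ["a",null,"c"]
console.log(JSON.stringify(skipped));  // ["a",null,"c"]
```

Either way the array keeps its original length, which is why appending with push or filtering afterwards is needed to drop the entry entirely.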

Update: I tried your code and, yup, my explanation above is correct. However, if you don't care about implicitly returning undefined from your htmlToJSON(), then you can just modify your inner for loop:

    for (let i = 0; i < childNodes.length; i++) {
      let json = htmlToJSON(childNodes[i]);
      json && arr.push(json);
    }

This way, an element is added to the arr array only if json is truthy.

I tried this code with your original function, and also with a modified version that returns null if exclude == true, and both ways worked.
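For reference, a version that returns null explicitly and appends only non-null children might look like the sketch below. It is not the exact code that was tested; fakeElement is a hypothetical stand-in for element nodes so the example runs without a browser, and in a page you would pass the parsed body instead.

```javascript
// Revised converter: returns null for excluded nodes and appends
// only non-null children, so the output has no undefined entries.
const htmlToJSON = node => {
  if (node.attributes?.getNamedItem('data-exclude')) {
    return null;
  }
  const obj = { nodeType: node.nodeType };
  if (node.tagName) {
    obj.tagName = node.tagName.toLowerCase();
  } else if (node.nodeName) {
    obj.nodeName = node.nodeName;
  }
  if (node.nodeValue) {
    obj.nodeValue = node.nodeValue;
  }
  const attrs = node.attributes;
  if (attrs) {
    obj.attributes = Array.from(attrs, attr => [attr.nodeName, attr.nodeValue]);
  }
  const childNodes = node.childNodes;
  if (childNodes && childNodes.length) {
    obj.childNodes = [];
    for (let i = 0; i < childNodes.length; i++) {
      const json = htmlToJSON(childNodes[i]);
      if (json) obj.childNodes.push(json);
    }
  }
  return obj;
};

// Hypothetical stand-in for an element node (nodeType 1). The
// attributes array gets a getNamedItem method to mimic NamedNodeMap.
const fakeElement = (tagName, attrPairs = [], childNodes = []) => {
  const attributes = attrPairs.map(([nodeName, nodeValue]) => ({ nodeName, nodeValue }));
  attributes.getNamedItem = name =>
    attributes.find(a => a.nodeName === name) ?? null;
  return { nodeType: 1, tagName, attributes, childNodes };
};

const body = fakeElement('BODY', [], [
  fakeElement('P', [['class', 'intro']]),
  fakeElement('DIV', [['data-exclude', '']]),
  fakeElement('SPAN')
]);

const output = htmlToJSON(body);
console.log(output.childNodes.map(c => c.tagName)); // [ 'p', 'span' ]
```

Returning null rather than falling off the end makes the excluded case explicit, and the truthiness check in the loop handles both null and undefined.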
