Why is using .filter() together with .match() only returning the first element matching the condition?

Time:07-03

I have some HTML code where at the most nested level there is some text I'm interested in:

<div >
  <div >
    
    <div >
      <pre>WHITE 34</pre>
    </div>
    <div >
      <pre>RED 05</pre>
    </div>

    <div >
      <pre>WHITE 16</pre>
    </div>
    <div >
      <pre>BLACK</pre>
    </div>
  
  </div>
</div>

What I need to do is return the output_area elements only when their nested <PRE> element contains a word followed by a number (for example WHITE 05, and not just BLACK).

So this is what I did:

I made an array from all output_area elements:

output_areas = Array.from(document.getElementsByClassName('output_area'));

I filtered the output_areas array to only return those output_area elements whose nested <PRE> satisfies my condition of a word followed by a number, using a regexp, like so:

output_areas.filter(el => el.textContent.match(/^WHITE \d+$/g));

Now, what happens is that this call only returns the first matching result, so I get an array of length 1 containing just:

<div >
      <pre>WHITE 34</pre>
</div>

and the output_area element containing <PRE> with "WHITE 16" is not returned.

As you can see at the end of the regular expression I put a "g" to request a global search and not just stop at the first result.

Not understanding why this did not work, I tried to verify what would happen if I used includes() to perform the search:

output_areas.filter(el => el.textContent.includes('WHITE'))

(let's just forget about the numbers now, it's not important)

And what happens? This will also return only the first output_area...

But why??? What am I doing wrong? I am not ashamed to say I've been banging my head on this for the last couple of hours... and at this point I just want to understand what is not working.

The only clue I think I got is that if I simplify my search using just a == or !=, for example:

output_areas.filter(el => el.textContent != "") // return all non-empty elements

I get back all output_area elements and not just the first one!

So I suspect there must be some kind of problem when using together filter() & match(), or filter() & includes(), but with relation to that my google searches did not take me anywhere useful...

So I hope you can help!
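
To check the suspicion about filter() combined with match() or includes(), here is a minimal sketch that runs outside the browser. Plain objects with a textContent property stand in for the output_area elements (an assumption made only so the code needs no page), and the strings are the ones from the HTML above. It suggests that filter() calls the callback once per element, so the "g" flag only affects how many matches are reported inside one string, not how many elements are kept:

```javascript
// Stand-ins for the output_area elements: only textContent matters here.
const fakeAreas = [
  { textContent: "WHITE 34" },
  { textContent: "RED 05" },
  { textContent: "WHITE 16" },
  { textContent: "BLACK" },
];

// match() returns an array (truthy) or null (falsy), so it works as a predicate.
const withMatch = fakeAreas.filter(el => el.textContent.match(/^WHITE \d+$/g));

// includes() returns a boolean directly.
const withIncludes = fakeAreas.filter(el => el.textContent.includes("WHITE"));

// test() without the g flag is the plainest way to ask "does it match?".
const withTest = fakeAreas.filter(el => /^WHITE \d+$/.test(el.textContent));

console.log(withMatch.map(el => el.textContent)); // WHITE 34 and WHITE 16
```

All three variants keep both WHITE entries on this data, which points away from filter() itself and towards the actual content of the elements on the page.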

CodePudding user response:

You should use trim() here to remove the whitespace before and after the text:

output_areas.filter( el => el.textContent.trim().match( /^WHITE \d+$/g ))

const output_areas = Array.from(document.getElementsByClassName('output_area'));

const result = output_areas.filter(el => el.textContent.trim().match(/^WHITE \d+$/g));
console.log(result);
<div >
  <div >

    <div >
      <pre> WHITE 34 </pre>
    </div>
    <div >
      <pre> RED 05 </pre>
    </div>

    <div >
      <pre> WHITE 16 </pre>
    </div>
    <div >
      <pre> BLACK </pre>
    </div>

  </div>
</div>
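
As a rough illustration of why trim() can matter: the textContent of a div includes the line breaks and indentation around the nested pre, so a pattern anchored with ^ and $ can fail on the raw string. The strings below are hypothetical values that mimic that padding, not output captured from the real page:

```javascript
// Hypothetical textContent values, padded the way indented HTML would pad them.
const rawTexts = [
  "\n      WHITE 34 \n    ",
  "\n      RED 05 \n    ",
  "\n      WHITE 16 \n    ",
  "\n      BLACK \n    ",
];
const pattern = /^WHITE \d+$/;

// Without trim(), the leading newline and spaces stop ^ from matching at "WHITE".
const untrimmed = rawTexts.filter(t => pattern.test(t));

// With trim(), the anchors line up with the visible text.
const trimmed = rawTexts.filter(t => pattern.test(t.trim())).map(t => t.trim());

console.log(untrimmed.length); // 0
console.log(trimmed);          // WHITE 34 and WHITE 16
```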

CodePudding user response:

Answering myself, as for some reason it then began to work without any changes from my side... Yes, just one of those typical IT cases we all know... :)

Jokes aside, I think for some reason the webpage (the DOM) got stuck... Probably the Jupyter Runtime (which was serving the page) had crashed without me noticing, and this caused somehow the kind of inconsistency I was looking at.

Moral of the story: if you see weird behaviour when interacting with a Python Notebook, always check the Jupyter Runtime status before going crazy trying to fix impossible errors.

CodePudding user response:

I'm not sure what the issue with the Jupyter notebooks is, but generally speaking, and based only on the HTML in the question, what I believe you are trying to do can be achieved using XPath instead of CSS selectors:

const html = `[your html above]
`;
const domdoc = new DOMParser().parseFromString(html, "text/html");

// Select every div whose direct <pre> child contains a space
const areas = domdoc.evaluate('//div[contains(./pre," ")]', domdoc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
for (let i = 0; i < areas.snapshotLength; i++) {
  console.log(areas.snapshotItem(i).outerHTML);
}

The output should be the 3 divs meeting the condition.
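
For comparison, a regular-expression sketch that accepts any uppercase word followed by a number selects the same three texts. The pattern /^[A-Z]+ \d+$/ is only one assumed reading of "a word followed by a number", and the strings are the pre contents from the question rather than live DOM nodes:

```javascript
// The four <pre> texts from the question's HTML.
const preTexts = ["WHITE 34", "RED 05", "WHITE 16", "BLACK"];

// Assumed definition: one uppercase word, one space, one or more digits.
const wordAndNumber = /^[A-Z]+ \d+$/;

const matching = preTexts.filter(t => wordAndNumber.test(t.trim()));

console.log(matching); // WHITE 34, RED 05 and WHITE 16
```

Unlike the XPath test for a space, this variant also rejects texts such as two words with no number.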
