How can Javascript's regular expressions filter strings with multiple rules at the same time?

Time:07-05

I need to process a piece of text into an array of words.

Delimiters between words are newlines, spaces, various punctuation marks, and &nbsp;.

The code I wrote handles the other cases, but not the &nbsp; case.

Note: I need to handle all cases within the same regex and cannot replace &nbsp; with spaces beforehand.


The code doesn't throw any errors; it runs in Chrome, but the result is not what I expect.

In the generated word array, "break up test the words" comes out as a single value (wrong); I need it to be five values: [break,up,test,the,words] (right).


My code:

<!DOCTYPE html><html><head>
<script>
window.onload = function(){
  var text = document.getElementById('text').textContent
  // the &nbsp; alternative in the regex below doesn't work
  var word_array = text.split(/[ \t\n\r.?,"';:!(){}<>\/]|&nbsp;/)
  console.log(text)
  console.log(word_array)
}
</script>
</head><body>
<div id="text">this   is text,break&nbsp;up&nbsp;test&nbsp;the&nbsp;&nbsp;words!ok</div>
</body></html>

CodePudding user response:

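The issue is that the regex matches &nbsp; as those six literal characters. By the time you read textContent, the browser has already decoded each &nbsp; entity into a single non-breaking space character (U+00A0), so the literal text "&nbsp;" never appears in the string. You want to use '\xa0' in the regex instead.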
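
As a minimal sketch of that fix, the snippet below adds \xa0 to the question's character class. The string literal stands in for what textContent would return for the div in the question (an assumption made so the snippet runs outside a browser). The trailing + and the filter(Boolean) call are additions not present in the original code; they are there only to avoid empty strings between consecutive delimiters and can be dropped if empty entries are acceptable.

```javascript
// Stand-in for document.getElementById('text').textContent:
// each &nbsp; entity has already been decoded to U+00A0 by the browser.
const text = 'this   is text,break\xa0up\xa0test\xa0the\xa0\xa0words!ok';

// Same delimiter set as the question, with \xa0 inside the character class.
// The trailing + (an addition) treats a run of delimiters as one separator.
const delimiters = /[ \t\n\r\xa0.?,"';:!(){}<>\/]+/;

// filter(Boolean) (also an addition) drops any empty string that a leading
// or trailing delimiter would leave behind.
const wordArray = text.split(delimiters).filter(Boolean);

console.log(wordArray);
```

In the page itself, the only change needed would be swapping the regex in the split call; the rest of the onload handler can stay as it is.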
