How to get the number of unique words from a <textarea> in JavaScript?

Time:11-13

How can I get the number of unique words from a <textarea> in JavaScript?

I'm not sure how to approach this problem. I can get all the distinct letters used, but I don't know how to get all the unique words.

function uniqueWordCount(text) {
  var distinctWords = [];
  for (var i = 0; i < text.length; i++) {
    if (distinctWords.indexOf(text[i]) === -1) {
      distinctWords.push(text[i]);
    }
  }

  alert(distinctWords);
  return distinctWords.length;
} 

CodePudding user response:

I would split the text on whitespace and use a Set to get the unique words:

const textarea = document.querySelector('textarea')
const count = document.querySelector('p')

handleInput()

textarea.addEventListener('input', handleInput)

function handleInput() {
  const words = textarea.value.split(/\s+/g)
  const uniqueWords = new Set(words)
  count.textContent = uniqueWords.size
}
<textarea>count unique words</textarea>
<p></p>
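
One caveat with the snippet above: splitting an empty or whitespace-padded string leaves an empty string in the result, so an empty textarea reports 1 instead of 0. A minimal sketch of a helper that trims first and drops empty tokens could look like this (the name countUniqueBySplit is made up for illustration and is not part of the answer above):

```javascript
// Hypothetical helper: counts unique whitespace-separated words.
function countUniqueBySplit(text) {
  const words = text
    .trim()                           // drop leading/trailing whitespace
    .split(/\s+/)                     // split on runs of whitespace
    .filter(word => word.length > 0)  // ''.split(...) yields [''], so remove it
  return new Set(words).size
}

console.log(countUniqueBySplit(''))                              // 0
console.log(countUniqueBySplit('  count unique  count words ')) // 3
```

Inside handleInput() you would then assign countUniqueBySplit(textarea.value) to count.textContent.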

CodePudding user response:

In plain JavaScript, split the string into an array and take the length of the result. Note that this gives the total number of words, not the number of unique ones.

The ternary operator is needed to handle the empty string: splitting an empty string returns an array containing one empty string, so the length would be 1 rather than 0.

// Return the number of tokens in an input string, split on a separator
function CountToken(Input = "", Separator = ' ') {
  return Input === "" ? 0 : Input.split(Separator).length;
}

console.log(CountToken());                        // Return 0
console.log(CountToken("test"));                  // Return 1
console.log(CountToken("test now"));              // Return 2
console.log(CountToken("06-33-545-322-33", '-')); // Return 5

CodePudding user response:

You can read the content of a <textarea> via its value property:

const textArea = document.querySelector('textarea')
const content = textArea.value

You can find all properties of the HTMLTextAreaElement at MDN.

The number of unique words depends on your definition of "word". A simple definition would be "a group of contiguous letters". This can be achieved via the regular expression /\p{L}+/gu:

const words = content.match(/\p{L}+/gu)

String.prototype.match() returns all matches in an array if the global flag is set, or null if there are none. \p{L} is a Unicode property escape that matches any letter; it is enabled via the unicode (u) flag.

Now you need to remove the duplicates. You don't need to do this yourself: by definition, a Set contains only unique values:

const uniqueWords = new Set(words)
const numberOfUniqueWords = uniqueWords.size
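
Putting these steps together, a small sketch might look like the following. It assumes you want "The" and "the" counted as the same word, so it lowercases each match first; that step and the function name countUniqueWords are additions for illustration, not part of the steps above. Because match() returns null when nothing matches, the sketch falls back to an empty array before calling map():

```javascript
// Hypothetical helper: counts unique words, ignoring case.
function countUniqueWords(content) {
  // A "word" here is a run of Unicode letters; null becomes [] when no match
  const words = content.match(/\p{L}+/gu) || []
  // Assumption: case-insensitive comparison is wanted
  const normalized = words.map(word => word.toLowerCase())
  return new Set(normalized).size
}

console.log(countUniqueWords('The cat saw the other cat.')) // 4
console.log(countUniqueWords('123 !?'))                     // 0
```

If case should matter, drop the map() step and pass words straight to the Set.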