I'm writing a function to find the 10 most common words in a string. However, when I sort my array, some of the words are repeated, each with its own count entry.
paragraph = `I love teaching. If you do not love teaching what else can you love. I love Python if you do not love something which can give you all the capabilities to develop an application what else can you love.`;
const tenMostFrequentWords = (str) => {
const regex = /\b[a-z]+\b/gi;
const arr = str.match(regex);
const set = new Set();
for (word of arr) {
const filteredArr = arr.filter(item => item == word);
set.add({word: word, count: filteredArr.length});
}
const newArr = Array.from(set);
newArr.sort((a,b) => b.count - a.count);
return newArr;
}
console.log(tenMostFrequentWords(paragraph));
Why is this happening?
CodePudding user response:
You're adding a new object to the set in every iteration of your loop. A Set compares objects by reference (object identity), not by structural equality, so each word is added once per occurrence. Instead, use a Map for the counts by word (and don't use filter for counting, since that amounts to quadratic complexity):
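const tenMostFrequentWords = (str) => {
  // Match runs of letters; fall back to an empty array when nothing matches.
  const regex = /\b[a-z]+\b/gi;
  const words = str.match(regex) ?? [];
  // Map keys are the word strings, which are compared by value.
  const counts = new Map();
  for (const word of words) {
    counts.set(word, (counts.get(word) ?? 0) + 1);
  }
  // Convert to {word, count} objects, sort by descending count, keep the top 10.
  const newArr = Array.from(counts, ([word, count]) => ({word, count}));
  newArr.sort((a,b) => b.count - a.count);
  return newArr.slice(0, 10);
}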
const paragraph = `I love teaching. If you do not love teaching what else can you love. I love Python if you do not love something which can give you all the capabilities to develop an application what else can you love.`;
console.log(tenMostFrequentWords(paragraph));
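The reference-versus-value distinction described in this answer can be checked directly. The snippet below is only a small illustration; the variable names `objects` and `strings` are made up for the example and are not part of the original code.

```javascript
// Two object literals with identical contents are still distinct references,
// so the Set keeps both of them.
const objects = new Set();
objects.add({ word: "love", count: 6 });
objects.add({ word: "love", count: 6 });
console.log(objects.size); // 2

// Primitive strings are compared by value, so the duplicate is ignored.
const strings = new Set();
strings.add("love");
strings.add("love");
console.log(strings.size); // 1
```

This is why keying a Map (or Set) on the word string itself removes the duplicates, while a Set of `{word, count}` objects does not.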
CodePudding user response:
You iterate through arr, and the word "love" appears in arr 6 times, so it is added to the set 6 times. Create a second array and, on each iteration, check whether the word has already been processed before adding it.
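A minimal sketch of that idea, under a few assumptions: the function name `countUniqueWords` and the variable `seen` are hypothetical, and a Set of strings stands in for the suggested second array because a `has` lookup is simpler than searching an array. It keeps the original filter-based counting, so it is still quadratic in the number of words.

```javascript
// Hypothetical helper: count each word only the first time it is encountered.
const countUniqueWords = (str) => {
  const words = str.match(/\b[a-z]+\b/gi) ?? [];
  const seen = new Set(); // holds plain strings, which are compared by value
  const result = [];
  for (const word of words) {
    if (seen.has(word)) continue; // already counted this word, skip it
    seen.add(word);
    // Same counting approach as the question: filter the full word list.
    const count = words.filter((item) => item === word).length;
    result.push({ word, count });
  }
  // Sort by descending count and keep at most ten entries.
  return result.sort((a, b) => b.count - a.count).slice(0, 10);
};

const sample = countUniqueWords("I love teaching. You love Python. I love both.");
console.log(sample[0]); // { word: 'love', count: 3 }
```

Note that matching is case-sensitive when comparing words, so "If" and "if" would be counted separately, as in the original code.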