Is there any way to find a partial string in an array more quickly than I have below?
Here is my example data:
const products = [
{
"product": "test1",
"price": 11
},
{
"product": "test2",
"price": 31
},
{
"product": "xxxx",
"price": 21
},
{
"product": "ssss",
"price": 22
},
]
Here are my keywords (but in reality, there are a lot more):
const keywords = [ "test", "xx" ]
I want to filter products with keywords then sum all product price my output should be
63
Here is what I tried to do: I filter my products first, I'm using indexOf
because it faster than includes
:
const fil = _.filter(products, (product) => {
return _.some(keywords, (v) => product.name.indexOf(v) >= 0 );
});
then I sum them using reduce
:
const sum = fil.reduce(function (sum, data) {
return sum data.price;
}, 0);
Everything works well but if I have to work with around 300k elements and 100k keywords, it takes around 3 min to find this query. Is there any way to decrease that time? (The product
values are very unique, there aren't a lot of duplicates.)
CodePudding user response:
You've said you only need the sum, not the list of filtered products, so we can reduce that time slightly, but probably not by a lot, by:
- Reducing the amount of memory churn by not producing an array we don't need
- Not making two passes through
products
when we only need one. - Avoiding function calls (although function calls are fast in JavaScript engines)
So:
let sum = 0;
for (const {product, price} of products) {
for (const keyword of keywords) {
if (product.includes(keyword)) {
sum = price;
break;
}
}
}
That uses the new(ish) for-of
loop. You might want to try it against a classic for
loop to see which gives you the best speed:
let sum = 0;
let productsLength = products.length;
let keywordsLength = keywords.length;
let pIndex, kIndex;
for (pIndex = 0; pIndex < productsLength; pIndex) {
const {product, price} = products[pIndex];
for (kIndex = 0; kIndex < keywordsLength; kIndex) {
if (product.includes(keywords[kIndex])) {
sum = price;
break;
}
}
}
Similarly, experiment with putting the loop variables inside the for (...)
part vs. having them outside as above, but I'd tend to think outside would be just that tiny bit faster (because of how let
works in for
loops).
And finally, you might compare that against using the native (not lodash) forEach
and some
methods rather than for
loops.