Home > Software engineering >  Comparing very large arrays of objects, looking for a performance boost
Comparing very large arrays of objects, looking for a performance boost

Time:03-16

I'm using this method to compare two very large arrays of objects to create a result array.

These arrays can have from 5 to 250,000 objects, each object has 8 properties. The properties are all strings (up to 256 chars) and integers.

It takes over 15 minutes to create the result array with 250,000 objects.

Realizing that the computer, memory, phase of the moon, etc, are all factors, I was looking for a faster/better way.

Any advice appreciated :)

const arr1 = [{prop1: "foo", prop2: 100},{prop1: "bar", prop2: 101}] // 250,000 objects
const arr2 = [{prop1: "foo", prop2: 50},{prop1: "bar", prop2: 51}] // 250,000 objects
let final = []

for(let left of arr1) {
        for(let right of arr2) {
            if(left.prop1 === right.prop1 && left.prop2 < right.prop2) {
                final.push(left)
                break
            }
        }
    }
}

CodePudding user response:

An approach that'll consume more memory but take less time to run will be to turn one of the arrays into an object or Map indexed by prop1 instead. That way, rather than iterating over each element to see if it matches, just look up the elements element for that prop1.

const arr2ByProp1 = new Map();
for (const item of arr2) {
    arr2ByProp1.set(item.prop1, item);
}
for (const item1 of arr1) {
    const item2 = arr2ByProp1.get(item1.prop1);
    if (item2 && item1.prop2 < item2.prop2) {
        final.push(item1);
    }
}

You could also tweak the above to use for(let i = 0, length = arr1.length; i < length; i ) { instead (slightly faster, but slightly harder to read and understand at a glance). Other minor tweaks might be done, but they shouldn't have much effect in comparison to the time complexity improvement.

Having 250k items in memory to begin with is suspect, though. If they were in a database instead, it'd probably be a lot faster faster to query the database (which can find by .prop1 and .prop2) to find these matches than to do the manipulations in JavaScript (which requires restructuring one of the whole data structures first, which isn't so efficient). A database may also be able to take advantage of parallel execution, unlike this particular situation of JavaScript.

  • Related