Home > database >  How to efficiently update items of a huge (100.000 items) array from a smaller (50 items) array?
How to efficiently update items of a huge (100.000 items) array from a smaller (50 items) array?

Time:12-22

I have two arrays like this:

let a = [{id: 1, content: 10},{id: 2, content: 20},{id: 3, content: 30}]
let b = [{id: 1, content: 11},{id: 2, content: 21}]

where a is actually huge (~100k objects) and b is about 50 objects, where all the ids in the objects in b are found in a. I want to merge these arrays together such that if an object in a has a given id and an object in b has that same id, the object in a gets replaced by the object in b. So {id: 1, content: 11} from b would replace {id: 1, content: 10} from a. The output in this example would be

[{id: 1, content: 11},{id: 2, content: 21},{id: 3, content: 30}]

where the first two objects got replaced, but the third didn't because it wasn't in b.

What I Tried:

let func = (a,b) => {
    let aa = a.reduce((acc,cur) => ({
        ...acc,
        [cur.id]: cur
    }), {})
    let bb = b.reduce((acc,cur) => ({
        ...acc,
        [cur.id]: cur
    }), {})
    return Object.keys({...aa,...bb}).map(key => mergedObj[key])
}

So I convert each array (a and b) into an object (aa and bb), indexed by its id, then merge the two objects, then convert them back to an array.

The Problem:

The line let aa = a.reduce ... takes a very long time.

Is there a more efficient approach to this problem?

CodePudding user response:

From my above comment ...

"... The performance killer is ... making excessive usage of rest parameter, spread syntax and destructuring assignement for each single iteration step. I suggest keeping it as simple as possible, thus a lookup based approach with references only since from the OP's example code there is no need for merging. It is more an updating replacement of existing item references."

And because of that I would choose a lookup based approach where one would ...

  1. create an object based lookup from the smaller source data-structure based on an item's id and

  2. and finally map the larger target structure by looking up whether to update/replace the current item based on such an item's id.

Thus for creating the lookup one would fully iterate the smaller array exactly once, and for creating the updated structure one would fully iterate the larger array, also exactly once, with no other additional costs than the object based lookup which contributes close to nothing.

And everything would be based on references instead of shallow copies or structured clones.

In addition one would implement both tasks as function statements in order to take a little bit more advantage of the JIT compiler's runtime optimization.

Further code based optimization could be done but should depend on the performance of the hereby provided implementation of the just described approach.

function aggregateIdBasedLookup(lookup, item) {
  lookup[item.id] = item;
  return lookup;
}
function updateItemFromBoundLookup(item) {
  return this[item.id] ?? item;
}

const largeTargetStructure = [
  { id: 1, content: 10 },
  { id: 2, content: 20 },
  { id: 3, content: 30 },
];
const smallerSourceStructure = [
  { id: 1, content: 11 },
  { id: 2, content: 21 },
];

const sourceLookup = smallerSourceStructure
  .reduce(aggregateIdBasedLookup, Object.create(null));

const largeUpdatedStructure = largeTargetStructure
  .map(updateItemFromBoundLookup, sourceLookup);

console.log({
  // largeTargetStructure,
  // smallerSourceStructure,
  // sourceLookup,
  largeUpdatedStructure,
});
.as-console-wrapper { min-height: 100%!important; top: 0; }

CodePudding user response:

You're going to have to iterate over the smaller array no matter what method you use. findIndex() returns the index of that element and stops iterating through the larger array. So, it seems like the simplest approach is:

let a = [{id: 1, content: 10},{id: 2, content: 20},{id: 3, content: 30}]
let b = [{id: 1, content: 11},{id: 2, content: 21}]

for (let i = 0; i < b.length; i  ) {
  let thisID = b[i].id;
  let thisContent = b[i].content;
  a[a.findIndex(x => x.id === thisID)].content = thisContent;
}

console.log(a)

  • Related