Trying to process a snapshot from a database: the array holds over 2.5 million objects, and each object has an index property that increments from an arbitrary starting number, in this case 45000.
Sorting with myArray.sort((a, b) => a.index - b.index) leaves the array fragmented.
example:
- the dataset index starts at 45000
- myArray[0] correctly logs 45000
- myArray[myArray.length - 1] correctly logs 2545000
- myArray[1] is incorrectly 45007
I thought the data was missing from the snapshot, but confirmed otherwise: myArray.findIndex(e => e.index == 45001) reports index 12532, and the value is present in the serialized JSON file, which is 1.4 GB in size.
I serialized it with a read/write stream, with each line containing one JSON.stringify'd object.
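For reference, a minimal sketch of reading that line-delimited format back with Node's fs and readline modules (the path argument is illustrative, not the actual file name):

const fs = require("fs");
const readline = require("readline");

async function loadSnapshot(path) {
  const rl = readline.createInterface({
    input: fs.createReadStream(path),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  const objects = [];
  // Each line holds one JSON.stringify'd object.
  for await (const line of rl) {
    if (line.trim() !== "") objects.push(JSON.parse(line));
  }
  return objects;
}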
Should I move to a keyed collection such as a Map instead of an array? Would .get() be efficient?
I am currently using a for loop, iterating by index and then finding the object carrying that index to confirm the values are incremental, but it is by far the slowest method:
for (let i = 0; i < maxIndex - minIndex; i++) {
  // Linear search for each expected index: O(n) per lookup, O(n^2) overall.
  let obj = myArray.find(e => e.index == i + minIndex);
  if (!obj) {
    console.log("missing index", i + minIndex);
    continue;
  }
  // process object
}
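As a point of comparison, one common way to make that scan linear is to collect the present indexes into a Set first; a sketch, assuming minIndex and maxIndex are already known:

// One pass to collect every index that actually exists.
const present = new Set(myArray.map(e => e.index));

// One pass over the expected range; Set.prototype.has is O(1).
for (let i = minIndex; i <= maxIndex; i++) {
  if (!present.has(i)) {
    console.log("missing index", i);
  }
}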
CodePudding user response:
There appear to be practical limitations with .sort() on arrays this large. If you know the data set is large in volume, it's easier to cast it into a new array indexed by offset:
let newArray = new Array(maxIndex - minIndex + 1).fill(null);
// Place each object at its offset from the starting index;
// the result is sorted by construction, and gaps stay null.
myArray.forEach(e => newArray[e.index - minIndex] = e);
for (let i = 0; i < newArray.length; i++) {
  // process newArray[i]
}
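If minIndex and maxIndex aren't known ahead of time, a sketch for deriving them in a single pass (a reduce is used rather than Math.min(...myArray), since spreading 2.5 million elements would exceed the argument limit):

// One pass to find the smallest and largest index values.
const { minIndex, maxIndex } = myArray.reduce(
  (acc, e) => ({
    minIndex: Math.min(acc.minIndex, e.index),
    maxIndex: Math.max(acc.maxIndex, e.index),
  }),
  { minIndex: Infinity, maxIndex: -Infinity }
);

Any slot still null in newArray afterwards corresponds directly to a missing index, namely i + minIndex.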