How to handle a million documents in MongoDB using expressJS?


I have an "annotations" collection in MongoDB, which contains 5 million documents. The collection size is almost 2.5 GB, and its index size is 55 MB.

I was trying to store the whole collection in a variable:

const Annotation = await database.collection("annotations").find().toArray();

But whenever I tried to run the application, it crashed with this error:

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

 1: 00007FF6351E7C4F v8::internal::CodeObjectRegistry::~CodeObjectRegistry 114207
 2: 00007FF635175EC6 DSA_meth_get_flags 65542
 3: 00007FF635176D7D node::OnFatalError 301
 4: 00007FF635AAB6CE v8::Isolate::ReportExternalAllocationLimitReached 94
 5: 00007FF635A95CAD v8::SharedArrayBuffer::Externalize 781
 6: 00007FF63593907C v8::internal::Heap::EphemeronKeyWriteBarrierFromCode 1468
 7: 00007FF635945D29 v8::internal::Heap::PublishPendingAllocations 1129
 8: 00007FF635942CFA v8::internal::Heap::PageFlagsAreConsistent 2842
 9: 00007FF635935959 v8::internal::Heap::CollectGarbage 2137
10: 00007FF63593E21B v8::internal::Heap::GlobalSizeOfObjects 331
11: 00007FF63598498B v8::internal::StackGuard::HandleInterrupts 891
12: 00007FF63568C3C6 v8::internal::DateCache::Weekday 8038
13: 00007FF635B393C1 v8::internal::SetupIsolateDelegate::SetupHeap 494417
14: 000001F1E83C5EC9

I tried to fix it by raising the heap limit with this command, but it still gives the same error:

$env:NODE_OPTIONS="--max-old-space-size=8192"

Can someone suggest how I can manage such a large dataset in my application?

CodePudding user response:

In your place, I would ask myself whether I really need to fetch all this data and hold it in memory at once.

If you are going to process it, often only a few fields are needed at a time, so you can project just those. Better still, do the processing or analytics inside the database itself: it is usually much faster, because the data never has to travel from the server into your variable.
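As a sketch of the second approach, the grouping below is pushed into MongoDB via an aggregation pipeline, so only the small summary result crosses the wire. This assumes a collection handle from the official Node.js driver; the `label` field name is purely illustrative:

```javascript
// Count annotations per label entirely on the server.
// `collection` is a MongoDB driver Collection handle; the `label`
// field is an assumed example field on each annotation document.
async function annotationCountsByLabel(collection) {
  const pipeline = [
    { $group: { _id: "$label", count: { $sum: 1 } } }, // group and count server-side
    { $sort: { count: -1 } },                          // most frequent labels first
  ];
  return collection.aggregate(pipeline).toArray();     // result set is tiny
}
```

Only the per-label counts are returned, no matter how many millions of documents the collection holds.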

CodePudding user response:

The best way to deal with a large amount of data is to handle it as a stream, so you never need to load the entire dataset into memory at once. You can try something like this:

const cursor = database.collection("annotations").find();
for await (const annotation of cursor) {
    // process one document at a time; the driver fetches
    // the next batch from the server lazily as you iterate
}