How to reduce execution time in this mongo db find query?


A sample document from the collection looks like this:

{
"_id" : ObjectId("62317ae9d007af22f984c0b5"),
"productCategoryName" : "Product category 1",
"productCategoryDescription" : "Description about product category 1",
"productCategoryIcon" : "abcd.svg",
"status" : true,
"productCategoryUnits" : [ 
    {
        "unitId" : ObjectId("61fa5c1273a4aae8d89e13c9"),
        "unitName" : "kilogram",
        "unitSymbol" : "kg",
        "_id" : ObjectId("622715a33c8239255df084e4")
    }
],
"productCategorySizes" : [ 
    {
        "unitId" : ObjectId("61fa5c1273a4aae8d89e13c9"),
        "unitName" : "kilogram",
        "unitSize" : 10,
        "unitSymbol" : "kg",
        "_id" : ObjectId("622715a33c8239255df084e3")
    }
],
"attributes" : [ 
    {
        "attributeId" : ObjectId("62136ed38a35a8b4e195ccf4"),
        "attributeName" : "Country of Origin",            
        "attributeOptions" : [],
        "isRequired" : true,
        "_id" : ObjectId("622715ba3c8239255df084f8")
    }
]
}

The collection is indexed on "_id". Leaving the sub-document arrays out of the results reduces the execution time, but all of the document fields are required.

db.getCollection('product_categories').find({})

The collection contains 30,000 records, and this query takes more than 30 seconds to execute. How can I solve this issue? Can anybody suggest a better solution? Thanks.

CodePudding user response:

A suitable index (or compound index) lets MongoDB walk the index instead of scanning every document each time you run the query. 30,000 documents is nothing to MongoDB; it can handle millions in a second. If these embedded fields are being populated as part of the process, that is another heavy operation added to the query.
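As a minimal sketch, assuming queries filter on status and sort by productCategoryName (the actual query shape isn't shown in the question):

// Hypothetical compound index; adjust the fields to your real query shape.
db.product_categories.createIndex({ status: 1, productCategoryName: 1 })

// Verify the index is used and compare documents examined vs. returned.
db.product_categories.find({ status: true }).sort({ productCategoryName: 1 }).explain("executionStats")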

Check whether your schema is efficiently structured and whether you are throttling your connection to the server. Another thing to consider is projecting only the fields that you require, either directly in find() or through the aggregation pipeline, as shown below.
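For example, a projection that skips the embedded arrays might look like this (the selected fields are illustrative; keep whichever ones you actually need):

// Projection with find(): return only a few top-level fields, skipping the
// large embedded arrays (productCategoryUnits, productCategorySizes, attributes).
db.getCollection('product_categories').find(
    {},
    { productCategoryName: 1, productCategoryIcon: 1, status: 1 }
)

// The same projection through the aggregation pipeline.
db.getCollection('product_categories').aggregate([
    { $project: { productCategoryName: 1, productCategoryIcon: 1, status: 1 } }
])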

CodePudding user response:

Although the question is not very clear, you can follow this article for some best practices.

CodePudding user response:

You can parallelize this query for better results: take the max and min of the _id range, split it into chunks, and execute, say, 20 parallel requests, as follows:

Imagine _id = [1..30000]

Query1: find({ _id: { $gte: 1, $lt: 1500 } })

Query2: find({ _id: { $gte: 1500, $lt: 3000 } })

...

Query20: find({ _id: { $gte: 28500, $lte: 30000 } })

This will be much faster than fetching everything through a single cursor, because each range query walks the _id index and the requests run in parallel.
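A minimal Node.js sketch of the idea, assuming the numeric _id values from the example above (real documents use ObjectId _ids, so the chunk boundaries would first have to be derived from the actual min and max _id):

const { MongoClient } = require('mongodb');

async function fetchInParallel(uri) {
    const client = new MongoClient(uri);
    await client.connect();
    const coll = client.db('test').collection('product_categories');

    // 20 half-open ranges: (0,1500], (1500,3000], ..., (28500,30000]
    const ranges = Array.from({ length: 20 }, (_, i) => [i * 1500, (i + 1) * 1500]);

    // Each range query walks the _id index independently, and the
    // requests run concurrently instead of through one cursor.
    const chunks = await Promise.all(
        ranges.map(([lo, hi]) =>
            coll.find({ _id: { $gt: lo, $lte: hi } }).toArray()
        )
    );

    await client.close();
    return chunks.flat();
}

fetchInParallel('mongodb://localhost:27017')
    .then(docs => console.log(docs.length));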
