Is there a way to make MongoDB use an index that may not quite fit, but lets it sort results blockwise?

Time:09-23

I have a collection test with a compound index on two fields:

db.test.createIndex({ i: 1, j: 1 })

When I execute the following pipeline

db.test.aggregate([{ $sort: { i: 1, j: 1 } }], { allowDiskUse: false })

it works fine. But this pipeline

db.test.aggregate([{ $sort: { i: 1, j: -1 } }], { allowDiskUse: false })

fails with the error "Sort exceeded memory limit". The reason is more or less clear: the sort order in the pipeline does not match the order of the index, so MongoDB decides not to use the index and instead sorts the whole collection, which does not fit in memory.

However, I suspect that MongoDB could be slightly smarter. Instead of sorting the whole collection, it could use the index to delimit blocks of documents that share the same value of i, and then sort documents only within each block. The documents of a single block are more likely to fit in memory, so the pipeline could run more efficiently. Can I make the MongoDB server do this, and how? If not, what prevents it?
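The blockwise idea can be sketched in plain JavaScript, independent of the server. This is only an illustration of the technique, assuming the documents arrive already ordered by i (as a scan of the { i: 1, j: 1 } index would deliver them) and that j is numeric. The function name sortBlockwise is hypothetical, and nothing here claims to reflect how MongoDB works internally.

```javascript
// Sketch: documents arrive ordered by i (e.g. from an index scan on { i: 1, j: 1 }).
// Only one block of equal i values is held in memory and sorted at a time.
function* sortBlockwise(docsOrderedByI) {
  let block = [];
  for (const doc of docsOrderedByI) {
    // A change in i closes the current block: sort it by j descending and emit it.
    if (block.length > 0 && block[0].i !== doc.i) {
      block.sort((a, b) => b.j - a.j);
      yield* block;
      block = [];
    }
    block.push(doc);
  }
  // Emit the final block (a no-op for empty input).
  block.sort((a, b) => b.j - a.j);
  yield* block;
}

// Sample input, already ordered by i ascending.
const input = [
  { i: 1, j: 1 }, { i: 1, j: 2 }, { i: 1, j: 3 },
  { i: 2, j: 5 }, { i: 2, j: 7 },
];
const result = [...sortBlockwise(input)];
```

Memory use in this sketch is bounded by the largest block rather than by the whole collection, which is the benefit the question is after.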

CodePudding user response:

It seems mongod does not recognize that it can use the index, but you can try to hint it as follows:

db.test.aggregate([ {$sort:{i:1,j:-1}} ],{hint:"i_1_j_1"})
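Whether the hint actually removes the in-memory sort can be checked by looking for a SORT stage in the explain output. The following is a minimal sketch: collectStages is a hypothetical helper that walks an explain document generically, since the exact shape of explain output varies between server versions. The samplePlan object is a hand-written, simplified stand-in, not real server output.

```javascript
// Hypothetical helper: recursively collect every "stage" name found in an explain document.
function collectStages(node, found = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectStages(child, found));
  } else if (node !== null && typeof node === "object") {
    if (typeof node.stage === "string") found.push(node.stage);
    Object.values(node).forEach((child) => collectStages(child, found));
  }
  return found;
}

// Hand-written, simplified stand-in for a plan (an assumption, not real server output).
const samplePlan = {
  stage: "SORT",
  inputStage: { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "i_1_j_1" } },
};
const stages = collectStages(samplePlan);
// If a SORT stage is present, the index scan did not provide the requested order.
const hasBlockingSort = stages.includes("SORT");

// In mongosh, a real plan could be passed in instead, for example:
// collectStages(db.test.explain().aggregate([{ $sort: { i: 1, j: -1 } }], { hint: "i_1_j_1" }))
```

If a SORT stage still appears alongside the IXSCAN, the hinted index is being scanned but the documents are still sorted in memory afterwards.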

CodePudding user response:

A similar question was asked a few days later here. As @Tom Slabbaert mentioned in the comments, the answer is that no, at the time of writing, MongoDB does not appear to support using the index in the situation described to provide an incremental sort. There is no (non-hacky) way to force the system to do this, especially in a way that would be flexible and deliver performance benefits.

Some additional things to consider with respect to the presumed goal of improved performance:

  • What's the end result of what you're trying to achieve here? Is there a particular reason that the compound sort is necessary and/or that the index couldn't be adjusted (to have j in descending order so that it supports the sort)?
  • The sample pipelines explicitly have allowDiskUse set to false. Is there a reason for that? Setting it to true should allow the operation to complete successfully.
  • Relatedly, allowDiskUse now defaults to true beginning in version 6.0.
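The index adjustment suggested in the first point could look roughly like this. It is a sketch, assuming an additional index on the same test collection is acceptable. The wrapper function sortWithMatchingIndex is hypothetical and takes the mongosh db handle as a parameter only to keep the example self-contained.

```javascript
// Sketch: build an index whose key directions match the sort, then run the same pipeline.
// `db` is assumed to be the mongosh database handle.
function sortWithMatchingIndex(db) {
  // j is descending here, matching { $sort: { i: 1, j: -1 } }.
  db.test.createIndex({ i: 1, j: -1 });
  // With a matching index available, the sort can be served in index order.
  return db.test.aggregate([{ $sort: { i: 1, j: -1 } }], { allowDiskUse: false });
}
```

In mongosh this would be called as sortWithMatchingIndex(db). If adding an index is not an option, passing { allowDiskUse: true } to the original pipeline is the other route mentioned above.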