Home > Net >  MongoDB weird performance in Lookup
MongoDB weird performance in Lookup

Time:11-30

I want to make two lookups, every one for a different collection

the first lookup

{ $lookup: { from: 'collectionOne',let: { id: '$_id' }, pipeline: [
          { $match: { $expr: {  $and: [ {$in: ['$$id', '$users.id']}, {$eq: ['$status', 'success']}]}}},
          { $project: {_id: 1}},
], as: 'ones'}},

the second lookup

{ $lookup: { from: 'collectionTwo',let: { id: '$_id' }, pipeline: [
          { $match: { $expr: {  $and: [ {$in: ['$$id', '$users.id']}, {$eq: ['$status', 'success']}]}}},
          { $project: {_id: 1}},
], as: 'twos'}},
  • if I add the first lookup only, the query time will be 128ms
  • if I add the second lookup only, the query time will be 134ms
  • if I add the both lookups in same aggregate pipeline, the query time will be 28.6s which is horrible performance.

Note: all the field status and ids are indexed well and that is why the performance very well when running only one lookup, but running the two lookup are very slow like it is ignoring the indexes.

Thanks

CodePudding user response:

so a couple things you can try of the top of my head:

  • leaving localfield, foreignfield empty results in $lookup trying to match for null. (depending on your mongoDB version, which you did not mention)
  • changing up the order of match and lookup should be preferred, because you decrease the amount of items you have to lookup.
  • the collections should be in the same collation, or the indexes will not be used.

CodePudding user response:

Some information about $lookup
(if something isn't true comment below if you can)

  1. lookup needs index in the from collection
  2. that index can be used in whatever place in the pipeline we are
  3. lookup kills the index for the next stages, only for the collection that is on the pipeline.
  4. $expr cant use the index expect in $eq, $lt, $lte, $gt, and $gte

For your case if you have indexes on the from collection they can be used fine, but the problem is the $expr with $in

Solution can be to do a normal lookup(not pipeline) join on field with an array (its like the query $in) and index can be used.

I mean something like

{
  "$lookup": {
    "from": "collectionOne",
    "localField": "_id",
    "foreignField": "users._id",
    "as": "joinOne"
  }
}
{
  "$lookup": {
    "from": "collectionTwo",
    "localField": "_id",
    "foreignField": "users._id",
    "as": "joinTwo"
  }
}
match ...
project ...

This way if you have indexes on users._id it will be used in both lookups. If you test it send some feedback if you can, if it worked faster.

  • Related