Home > Mobile >  Mongodb Index behavior different in PROD and Testing environment
Mongodb Index behavior different in PROD and Testing environment

Time:10-20

I have 2 dedicated Mongo clusters which have the same exact Model, Indexes and we query both envs the same way but the result is different.

user.model.js

const schema = mongoose.Schema({
  _id: ObjectId,
  role: {
    type: String,
    enum: ['user', 'admin'],
    required: true,
  },
  score: { type: Number, default: 0 },
  deactivated: { type: Date },
});

schema.index(
  { deactivated: 1, role: 1, score: -1 },
  { name: 'search_index', collation: { locale: 'en', strength: 2 } }
);

I noticed that one of our common queries was causing issues on the PROD environment. The query looks like this:

db.getCollection('users')
    .find({deactivated: null, role: 'user'})
    .sort({score: -1})
    .limit(10)
    .collation({locale: 'en', strength: 2})

On the Testing Environment the query runs as expected fully utilizing the index. (has ~80K records total, 1300 deactivated) enter image description here

But in our PROD env the query, seems to be using only the first part of the compound index. (has ~50K records total, ~20K records deactivated) The executionStats looks like: enter image description here

As we can see it is using at least the first part of the index to only search in non-deactivated records, but the SORT is in memory.

This is a legacy application so the first thing I did was ensure that the types of the indexed fields are following the schema in all the records. I wonder if it could be the "role" collation somehow?

Any hint or clue will be greatly appreciated. Thanks in advance.

CodePudding user response:

Thanks for providing the plans. It is a combination of a few things (including the multikeyness of the production index) that is causing the problem.

There are a few ways to potentially solve this, let's start with the obvious question. Is score supposed to be an array?

The schema suggests not. With MongoDB, an index becomes multikey once a single document is inserted that has an array (even empty) for a key in the index. There is no way to way to "undo" this change apart from rebuilding the index. If the field is not supposed to contain an array, then I would suggest fixing any documents that contain the incorrect data and then rebuilding the index. As this is production, you may want to build a temporary index to reduce the impact to the application while the original index is dropped and recreated. You may also want to look into schema validation to help prevent incorrect data from getting inserted in the future.

If score can be an array, then we'll need to take a different approach. We can see in the UAT plan that a SORT_MERGE is used. The only reason that stage is required is because {"deactivated" : null} seems to have an additional index bound looking for undefined. That may be some internal implementation quirk as that BSON type appears to be deprecated. So updating the data to have an explicit false value for this field and using that check in the query predicate (rather than a check for null) will remove the need to split the plan out with a SORT_MERGE and will probably allow the multikey index to provide the sort:

    winningPlan: {
      stage: 'LIMIT',
      limitAmount: 10,
      inputStage: {
        stage: 'FETCH',
        inputStage: {
          stage: 'IXSCAN',
          keyPattern: { deactivated: 1, role: 1, score: -1 },
          indexName: 'search_index',
          collation: {
            locale: 'en',
            caseLevel: false,
            caseFirst: 'off',
            strength: 2,
            numericOrdering: false,
            alternate: 'non-ignorable',
            maxVariable: 'punct',
            normalization: false,
            backwards: false,
            version: '57.1'
          },
          isMultiKey: true,
          multiKeyPaths: { deactivated: [], role: [], score: [ 'score' ] },
          isUnique: false,
          isSparse: false,
          isPartial: false,
          indexVersion: 2,
          direction: 'forward',
          indexBounds: {
            deactivated: [ '[false, false]' ],
            role: [
              '[CollationKey(0x514d314b0108), CollationKey(0x514d314b0108)]'
            ],
            score: [ '[MaxKey, MinKey]' ]
          }
        }
      }
    }
  • Related