Use MongoDB field value to fill in separate blank field-CodePudding

Following the guidance here, I am using Mongo 4.4 to try and find all documents where field_A is null and fill field_A with the value of field_B.

I tried:

db.collection.updateMany({field_A:{$nin:[null,""]}},[{"$set": {field_A : "$field_B" }}])

This returns:

{ "acknowledged" : true, "matchedCount" : 176394, "modifiedCount" : 9952 }

I don't understand why most of them aren't modifying . . .?

Despite the warnings against it, I also tried running this without the square brackets in the second half of the query:

db.collection.updateMany({field_A:{$nin:[null,""]}},{"$set": {field_A : "$field_B" }})

But that just replaces field_A with the literal phrase "$field_B".

Can anyone explain what I'm doing wrong? Thanks!

UPDATE: Per @prasad_'s suggestion, I tried:

db.collection.updateMany({field_A:null},[{"$set": {field_A : "$field_B" }}])

This worked fine. All matched documents updated.

However, I still have documents where field_A is not truly null but rather "". I tried:

db.collection.updateMany({field_A:""},[{"$set": {field_A : "$field_B" }}])

And I got:

{ "acknowledged" : true, "matchedCount" : 2815, "modifiedCount" : 0 }

I just can't figure out why the "" won't update to the value of field_B.

CodePudding user response：

There was a lot of discussion in the comments that helped clear up the confusion between the $nin and $in operators, the latter seemingly being what is desired in this situation. There's a separate piece of confusion about matching versus modifying that I mentioned here and will expand on in this answer.

I just can't figure out why the "" won't update to the value of field_B.

In short, the value of field_B appears to be "" for those 2815 matching documents. Since that is also already the value of field_A there is nothing for the database to change.

The output you got from the database is:

{ "acknowledged" : true, "matchedCount" : 2815, "modifiedCount" : 0 }

Referring to the documentation, the matchedCount field represents the following:

The number of documents selected for update. If the update operation results in no change to the document, e.g. $set expression updates the value to the current value, nMatched can be greater than nModified.

And the modifiedCount field is:

The number of existing documents updated. If the update/replacement operation results in no change to the document, such as setting the value of the field to its current value, nModified can be less than nMatched.

Both of those definitions already state that the database will avoid modifying the document (and incrementing the counter) if the update would not change the document from its current state.

Consider a collection with the following three documents:

> db.collection.find()
[
  { _id: 1, field_A: 123, field_B: 456 },
  { _id: 2, field_A: '', field_B: 'ABC' },
  { _id: 3, field_A: '', field_B: '' }
]

When we execute your update (in the mongosh shell), we receive the following response:

> db.collection.updateMany({field_A:""},[{"$set": {field_A : "$field_B" }}])
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 2,
  modifiedCount: 1,
  upsertedCount: 0
}

Here we have matchedCount: 2. There were 2 documents that matched the query predicates of {field_A: ""}. That would be the documents with _id:2 and _id:3.

But we only have modifiedCount: 1. Only one of the documents (id:2) needed to change. The other document (id:3) already had the same value for field_A and field_B (an empty string) so applying the update wouldn't change the document. After the operation completes, the documents now look as follows with the second one having been modified:

> db.collection.find()
[
  { _id: 1, field_A: 123, field_B: 456 },
  { _id: 2, field_A: 'ABC', field_B: 'ABC' },
  { _id: 3, field_A: '', field_B: '' }
]

If we run the update again, we get:

> db.collection.updateMany({field_A:""},[{"$set": {field_A : "$field_B" }}])
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 1,
  modifiedCount: 0,
  upsertedCount: 0
}

This time the second document no longer matched (since we updated field_A to no longer be an empty string). However, the third document still matches but still does not get modified because field_A and field_B are both empty strings. Here are some related commands that help confirm that:

> //Count the number of documents where both `field_A` and `field_B` are empty strings:
> db.collection.countDocuments({field_A:'', field_B:''})
1
> //Change the filter of the update to exclude documents where `field_B` is an empty string:
> db.collection.updateMany({field_A:"", field_B:{$ne:""}},[{"$set": {field_A : "$field_B" }}])
{
  acknowledged: true,
  insertedId: null,
  matchedCount: 0,
  modifiedCount: 0,
  upsertedCount: 0
}

I would expect the count above to return 2815 in your environment. If you do not expect field_B to contain an empty string for those documents then you may need to take additional actions to correct that.