I have some data in a mongodb database collection that looks like this
{
"_id": {
"$oid": "63737b4b654d9b6a0c3a2006"
},
"tag": {
"tagName": 0.10534846782684326
}
}
and I want to check if a dictionary with a specific tagName
exists. To do so, we can apply this
mycollection.find({f"tag.{'tagName'}": {"$exists": True}})
However, some tagNames
have a dot .
in front, e.g.,
{
"_id": {
"$oid": "63737b4b654d9b6a0c3a2006"
},
"tag": {
".tagName": 0.10534846782684326
}
}
So when I run the query
mycollection.find({f"tag.{'.tagName'}": {"$exists": True}})
returns that the dictionary whose key name is .tagName
is not found. This is because of the double dot in f"tag.{'.tagName'}"
. Can we write the query in such a way in order to avoid this situation?
Mongodb version:
db version v4.4.13
Build Info: {
"version": "4.4.13",
"gitVersion": "df25c71b8674a78e17468f48bcda5285decb9246",
"openSSLVersion": "OpenSSL 1.1.1f 31 Mar 2020",
"modules": [],
"allocator": "tcmalloc",
"environment": {
"distmod": "ubuntu2004",
"distarch": "x86_64",
"target_arch": "x86_64"
}
}
CodePudding user response:
The first syntax looks a little odd to me. I don't think it should have the curly brackets. You can see in this playground example that it doesn't find the first document. So you may be looking to remove the curly brackets from the query in both situations, and here is an example where doing so correctly returns the first document.
Now regarding the .
character in the name, one approach would be to use the $getField
operator. That operator helps retrieve field names that are otherwise ambiguous or contain special characters. An example (that would only retrieve the second document) might look like this:
db.collection.find({
$expr: {
$ifNull: [
{
$getField: {
field: ".tagName",
input: "$tag"
}
},
false
]
}
})
You may combine the two conditions with a $or
to return both documents, playground example here.
I would recommend updating your data to remove the extra .
character. Its presence is going to make working with the data more difficult and probably cause some performance issues since many of the operations won't be able to effectively use indexes.
Version 4.4 and earlier
As noted in the comments, the $getField
operator is new in version 5.0
. To accomplish something similar prior to that you could use the $objectToArray
operator.
Effectively what you will do here is convert $tag
to an array of k
, v
pairs where k
contains the field name. You can then filter directly against that name (k
) looking for the value(s) of interest.
The verbose, but arguably more readable, approach to doing so looks like this:
db.collection.aggregate([
{
"$addFields": {
"tagNames": {
"$objectToArray": "$tag"
}
}
},
{
$match: {
"tagNames.k": {
$in: [
"tagName",
".tagName"
]
}
}
},
{
$project: {
tagNames: 0
}
}
])
You could probably collapse it down and do it directly in find()
(via $expr
usage), as demonstrated here. But doing so requires a little more knowledge about your schema and the structure of the tag
field. Overall though, working with field names that contain dots is even more difficult prior to 5.0
, which further strengthens the suggestion to correct the underlying data.