Home > Blockchain >  Good approach with MongoDB, comparing two fields to determine if value exists in both?
Good approach with MongoDB, comparing two fields to determine if value exists in both?

Time:01-06

example:

  • _id = 001

    • field 'location' = PARIS FRANCE

    • field 'country' = FRANCE

  • _id = 002

    • field 'location' = TORONTO

    • field 'country' = CANADA

desired result:

ability to recognize that for _id 001, "france" is also in the value for location field;

whereas, _id 002 does not have a value from country that also is in location

Instead of relying on pandas, would like to see if there are more efficient options using pymongo, for example?

CodePudding user response:

This is sensitive to case, and possible abbreviations, etc., but here's one way to identify if one string is contained within the other.

Given an example collection like this:

[
  {
    "_id": "001",
    "location": "PARIS FRANCE",
    "country": "FRANCE"
  },
  {
    "_id": "002",
    "location": "TORONTO",
    "country": "CANADA"
  }
]

This will set "isIn" if "country" is contained within "location" or vice-versa.

db.collection.aggregate([
  {
    "$set": {
      "isIn": {
        "$gte": [
          {
            "$sum": [
              { // returns pos or -1 if not found
                "$indexOfCP": ["$location", "$country"]
              },
              {"$indexOfCP": ["$country", "$location"]}
            ]
          },
          -1
        ]
      }
    }
  }
])

Example output:

[
  {
    "_id": "001",
    "country": "FRANCE",
    "isIn": true,
    "location": "PARIS FRANCE"
  },
  {
    "_id": "002",
    "country": "CANADA",
    "isIn": false,
    "location": "TORONTO"
  }
]

Try it on mongoplayground.net.

  • Related