Home > Blockchain >  Perform a query removing special characters to detect duplicate documents
Perform a query removing special characters to detect duplicate documents

Time:12-13

I have a document collection that has a Business Model, it has a tin field, this tin is a brazilian type of document, and i have in the DB documents with different formats that include special characters

ex: {tin : '34.545.646/4245-24'} or {tin : '34,545,646/424524'} or {tin : '34545646/4245-24'}

i want to perform a query like that Business.findOne({tin: tin})

i want to remove all special characters to tin input. and no matter what format are in the database i can query with only the numbers.

how can i use query to see if a document exists no matter the format in db, only number matters\

how to remove special characters in query to look for a document

i want to know if a document exists no matter the format, only numbers

CodePudding user response:

This query is going to be slow, but here's one way to do it.

db.collection.find({
  "$expr": {
    "$eq": [
      // "filtered" number string goes here
      "34545646424524",
      { // strip everything except digits from tin
        "$reduce": {
          "input": {"$range": [0, {"$strLenCP": "$tin"}]},
          "initialValue": "",
          "in": {
            "$let": {
              "vars": {
                "char": {"$substrCP": ["$tin", "$$this", 1]}
              },
              "in": {
                "$cond": [
                  {
                    "$regexMatch": {
                      "input": "$$char",
                      "regex": "\\d"
                    }
                  },
                  {"$concat": ["$$value", "$$char"]},
                  "$$value"
                ]
              }
            }
          }
        }
      }
    ]
  }
})

Try it on mongoplayground.net.

  • Related