How can I convert an integer to a string in a document nested in an array in Pymongo?

Time:09-21

I have a document nested in an array in MongoDB. One of the document's fields is an integer, but I need to convert it to a string. I have tried many different approaches, using both update_one() and update_many(). I have also tried $toString, $set and $convert in different ways, and I even found a post that used aggregate(). None of my attempts yield any results. Some produce errors, but for the most part the update appears to work until I check the data and find that nothing has changed.

Here is a sample of what the data looks like:

[
  {
    "_id": ObjectId("62d7e96b3348d2ed4d3f6c37"),
    "index": 2394,
    "hashDec": "17795682514039271424",
    "hashHex": "0xf6f6f6f600f0f000",
    "postList": [
      {
        "timestamp": 1659646945.456782,
        "uploadTime": 1659646903.0,
        "author": "Osinttechnical",
        "id": 1555297956483305472,
        "platform": "twitter",
        "text": ""
      },
      {
        "platform": "twitter",
        "id": 1567987802234851328,
        "author": "UAWeapons",
        "text": "#Ukraine: A Russian Ural-4320 transport truck had a a slight accident and was abandoned in #Kharkiv Oblast.",
        "timestamp": 1662672491.5861459,
        "uploadTime": 1662672398.0
      }
    ]
  }
]

And here is the approach that seems most logical to me:

self.allDocs = list(self.video.find({}).sort("index"))

for index, doc in enumerate(self.allDocs):
    print(f"[{datetime.datetime.now()}] Updating doc: {doc['index']}")

    for post in doc["postList"]:
        if isinstance(post["id"], int):
            print(f"[{datetime.datetime.now()}] Updating post: {post}")
            self.video.update_one(
                {"index": doc["index"], "postList.$[].id": post["id"]},
                {"$set": {"postList.$[].id": str(post["id"])}},
                upsert=False)

Does anyone know how best to do this?

Answer:

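I recommend using update_many with an aggregation pipeline update (available since MongoDB 4.2). This lets you perform the conversion in a single update, without the overhead of reading every document into memory as your code currently does. The pipeline uses $map to rebuild each element of postList and $mergeObjects to overwrite that element's id with its string form.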

It would look like this:

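self.video.update_many({},
[
  {
    "$set": {
      # Rebuild postList element by element
      "postList": {
        "$map": {
          "input": "$postList",
          "in": {
            # Keep every field of the post, overriding only "id"
            "$mergeObjects": [
              "$$this",
              {
                "id": {
                  # Normalise to a 64-bit integer, then convert to a string
                  "$toString": {
                    "$toLong": "$$this.id"
                  }
                }
              }
            ]
          }
        }
      }
    }
  }
])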
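
If pipeline-style updates are not available, the loop from the question can probably be made to work with a filtered positional operator instead. The all-positional operator `$[]` is only meaningful in the update document, not in the query filter, which is most likely why the original update matched nothing without raising an error. Even in the update document, `$[]` would write the same string to every post. The sketch below uses `$[post]` together with PyMongo's `array_filters` argument so that only the matching post is changed. The function name `convert_post_ids` and the identifier `post` are made up for illustration, and `collection` is assumed to be a PyMongo collection such as `self.video`.

```python
def convert_post_ids(collection):
    """Convert integer postList.id values to strings, one post at a time.

    `collection` is assumed to be a PyMongo Collection (hypothetical helper,
    not part of the original answer). Returns the number of modified documents
    summed over all update calls.
    """
    modified = 0
    # Only fetch documents that still contain a numeric id somewhere in postList.
    query = {"postList.id": {"$type": "number"}}
    for doc in collection.find(query, {"postList.id": 1}):
        for post in doc.get("postList", []):
            post_id = post.get("id")
            # bool is a subclass of int in Python, so exclude it explicitly.
            if isinstance(post_id, int) and not isinstance(post_id, bool):
                result = collection.update_one(
                    {"_id": doc["_id"]},
                    # $[post] targets only elements matched by array_filters.
                    {"$set": {"postList.$[post].id": str(post_id)}},
                    array_filters=[{"post.id": post_id}],
                )
                modified += result.modified_count
    return modified
```

Calling `convert_post_ids(self.video)` issues one round trip per post, so the single `update_many` above remains the better choice when the server supports it.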

