Best way to speed up PyMongo loop

Time:09-14

I'm using a MongoDB database to store product data. I loop over about 50 IDs, and on each iteration I search for the ID. If the ID doesn't exist, I add it; if it exists and another column has a specific value, I run a function.

for id in ids:
  value = db.find_one({"value": id})
  if value:
    # It checks for some other columns here using both the ID and the return value
    pass
  else:
    # It adds the ID and some other information to the database
    pass

The problem here is that this is incredibly inefficient. When searching around for other ways to do this, all results show how to get a list of the results, but I'm not sure how this would be implemented in my scenario since I'm running functions and checks with each result and ID.

Thank you!

CodePudding user response:

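You can improve this by making a single find request for all the IDs instead of one find_one per ID, since each call is a separate round trip to the server. Then, in a second step, add all the missing documents to the database, possibly with insert_many.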

values = db.find({"value": {"$in": ids}})
for value in values:
    # It checks for some other columns here using both the ID and the returned document
    # Drop the IDs that already exist, so only the missing ones remain in ids
    ids.remove(value["value"])

# Do all your inserts for the IDs that were not found,
# either with a loop
for id in ids:
    db.insert_one(...)
# or with a single insert_many
db.insert_many(...)
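
Putting the two steps together, here is a minimal sketch of how this could look as one function. It assumes `collection` is a PyMongo collection and that the ID is stored in the "value" field, as in the question. `sync_ids`, `handle_existing` and `build_document` are hypothetical names standing in for your own checks and insert logic, not part of PyMongo.

```python
def sync_ids(collection, ids, handle_existing, build_document):
    """Check `ids` with one query, then insert the missing ones in one call.

    `collection` is assumed to be a PyMongo collection. `handle_existing`
    and `build_document` are hypothetical callbacks for the asker's logic.
    """
    # One round trip for all IDs instead of one find_one() per ID.
    cursor = collection.find({"value": {"$in": list(ids)}})
    existing = {doc["value"]: doc for doc in cursor}

    new_docs = []
    for id_ in ids:
        doc = existing.get(id_)
        if doc is not None:
            # ID already stored: run the per-document checks here.
            handle_existing(id_, doc)
        else:
            # ID not stored yet: queue a document for insertion.
            new_docs.append(build_document(id_))

    # insert_many() rejects an empty list, so only call it when needed.
    if new_docs:
        collection.insert_many(new_docs)
    return len(new_docs)
```

Building a dict keyed by ID keeps the lookup inside the loop cheap and avoids modifying `ids` while working through the results. If `ids` can contain duplicates, deduplicate it first so the same document is not queued twice.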