Django Loop Through Objects After Bulk .update()

Time:10-13

This code runs on cron, so I want to update the status of the objects immediately. That way they don't get picked up again if a second cron run starts before the current one finishes (which will eventually start to happen with my app).

    # Grab all pending emails.
    emails = delivery_que.objects.filter(status='PENDING')
    emails.update(status='SENDING')

    # Loop through the pending emails.
    for email in emails:

The current code doesn't work, as I seem to no longer have access to the objects after I .update() them.

This is the workaround I implemented:

    # Grab all pending emails.
    emails = delivery_que.objects.filter(status='PENDING')
    emails.update(status='SENDING')
    emails = delivery_que.objects.filter(status='SENDING')

    # Loop through the pending emails.
    for email in emails:

Is there another better solution I'm missing? I'd prefer not to query the database again to reselect the objects that I should already have access to from the first query.

CodePudding user response:

The issue here is that you are running overlapping processes and neither knows what the other is doing at any point in time.

Simple solution:

  1. When the job starts, check for a lock record. If one exists, exit the job.
  2. Add a lock record (a row in a DB model, a touched file, or something similar).
  3. Process the job.
  4. Remove the lock record as the last thing the job does.
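
A minimal sketch of the lock steps above, using a lock file and only the standard library. The path and the function names (`acquire_lock`, `run_job`) are hypothetical. One deliberate difference from the list: the check and the creation are merged into a single atomic `os.open` call with `O_EXCL`, so two jobs starting at the same moment cannot both pass the check.

```python
import os
import tempfile

# Hypothetical lock location; any path both cron runs can see would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "send_emails_demo.lock")


def acquire_lock(path):
    """Create the lock file atomically; return False if it already exists."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def run_job(path, process):
    """Run process() only if no other run holds the lock.

    Returns True if the job ran, False if it exited because of the lock.
    """
    if not acquire_lock(path):
        return False
    try:
        process()
    finally:
        # Remove the lock as the last thing the job does, even on errors.
        os.remove(path)
    return True


if __name__ == "__main__":
    ran = run_job(LOCK_PATH, lambda: print("sending emails..."))
    print("job ran" if ran else "another run holds the lock, exiting")
```

Note that if the process is killed before the `finally` block runs, the lock file stays behind and has to be cleared by hand, which is the usual weakness of this approach.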

Slightly more complex solution:

  1. At the start of the job, update all records from PENDING to a value that is unique to that process (e.g. PROCESSING_<uuid>).
  2. Run your update for records with that unique value.

You could also do this by adding another field to the model, e.g. processing_id, and checking that it is empty in addition to the status being PENDING.
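
The claim-by-unique-value idea can be sketched at the SQL level with the standard-library `sqlite3` module, so the example runs on its own. The table layout, the `recipient` column and the `claim_pending` helper are assumptions for illustration. In Django the same two steps would presumably be a `filter(...).update(...)` that sets the token, followed by a `filter(...)` on that token.

```python
import sqlite3
import uuid


def claim_pending(conn):
    """Tag every unclaimed PENDING row with a fresh token.

    Returns (token, rows), where rows are only the rows this call claimed.
    """
    token = uuid.uuid4().hex
    with conn:  # run the UPDATE in a transaction and commit it
        conn.execute(
            "UPDATE delivery_que SET status = 'SENDING', processing_id = ? "
            "WHERE status = 'PENDING' AND processing_id IS NULL",
            (token,),
        )
    rows = conn.execute(
        "SELECT id, recipient FROM delivery_que "
        "WHERE processing_id = ? ORDER BY id",
        (token,),
    ).fetchall()
    return token, rows


# Demo data in an in-memory database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE delivery_que ("
    "id INTEGER PRIMARY KEY, recipient TEXT, status TEXT, processing_id TEXT)"
)
conn.executemany(
    "INSERT INTO delivery_que (recipient, status) VALUES (?, 'PENDING')",
    [("a@example.com",), ("b@example.com",)],
)
conn.commit()

# The first run claims both rows; an overlapping second run finds nothing.
first_token, first_batch = claim_pending(conn)
second_token, second_batch = claim_pending(conn)
```

Selecting by the token still costs a second query, but unlike re-selecting on status='SENDING' it cannot pick up rows that belong to another run.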

Probably the most complex solution:

  • Make the process idempotent ...

By this I mean that it doesn't matter if it runs twice on the same record. From the code you posted above this is actually true, as all you are doing is changing the status to SENDING, but I guess there is code you don't show and users get double emails.

If you have code to ensure that no double mail is sent, this would not be an issue - the object might get processed twice, but it would only send an email the first time.

In an ideal world, all processes like this should be idempotent anyway, just to ensure you never have any problems should somebody manage to run it when they shouldn't.
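
One way to get that behaviour, sketched with the standard-library `sqlite3` module: flip the row to a final status with a conditional UPDATE and send only if that UPDATE actually changed the row. The `SENT` status, the schema and the `send_once` helper are assumptions, not something from the question.

```python
import sqlite3


def send_once(conn, email_id, send):
    """Call send(email_id) only if this call flips the row to SENT.

    Returns True if the email was sent by this call, False otherwise.
    """
    with conn:  # commit the status change before sending
        cur = conn.execute(
            "UPDATE delivery_que SET status = 'SENT' "
            "WHERE id = ? AND status != 'SENT'",
            (email_id,),
        )
    if cur.rowcount == 1:
        send(email_id)
        return True
    return False


# Demo: processing the same record twice sends only one email.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE delivery_que (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO delivery_que (id, status) VALUES (1, 'SENDING')")
conn.commit()

sent = []
first = send_once(conn, 1, sent.append)
second = send_once(conn, 1, sent.append)
```

Because the row is marked before the send happens, a failed send is never retried in this sketch. Whether a lost email or a duplicate email is the lesser evil depends on the application.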

CodePudding user response:

The same issue is discussed in "how-to-bulk-update-with-django"; see also "Bulk Update" in the Django documentation.
