Modify instance before it's saved on import

Time:03-30

I'm setting up an import process for a model which has a ForeignKey relationship, and I've got an extra field on the model where I'd like to store the provided value if it isn't a valid ID in the related table. So if a Passport object doesn't exist for the UUID in the imported file, the value in the file gets added to the invalid_passport field. This way it can be viewed in the Django admin and potentially used after the import process is complete.

The IDs in question here are UUIDs, so there's a good chance that someone along the line will provide some invalid data.

import uuid

from django.db import models
from django.utils.translation import gettext_lazy as _


class Result(models.Model):

    id = models.UUIDField(
        primary_key=True,
        default=uuid.uuid4,
        editable=False
    )
    passport = models.ForeignKey(
        to='results.Passport',
        null=True,
        blank=True,
        on_delete=models.SET_NULL
    )
    invalid_passport = models.CharField(
        max_length=255,
        blank=True,
        null=True,
        help_text=_("An invalid passport number")
    )

The import file will contain a passport column for the UUID of the related object, but I'd like to save the provided value to invalid_passport if it doesn't match any Passport instances.

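My resource looks like this:

from import_export import fields, resources
from import_export.instance_loaders import CachedInstanceLoader
from import_export.widgets import ForeignKeyWidget


class ResultResource(resources.ModelResource):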
    """ Integrate django-import-export with the Result model """
    passport = fields.Field(
        column_name='passport',
        attribute='passport',
        widget=ForeignKeyWidget(Passport, 'id')
    )

    class Meta:
        instance_loader_class = CachedInstanceLoader
        model = Result
        fields = (
            'id',
            'passport',
            'invalid_passport'
        )

    def init_instance(self, row=None):
        """ Initializes a new Django model. """
        instance = self._meta.model()
        if not instance.passport:
            instance.invalid_passport = row['passport']
        return instance

    def before_save_instance(self, instance, using_transactions, dry_run):
        """ Sets the verified flags where there is a passport match. """
        if instance.passport:
            instance.verified = True
            instance.verified_by = self.import_job.author
            instance.invalid_passport = None
        else:
            pass

So when the instance is created, the passport ID from the row is stored on invalid_passport. My theory was that in before_save_instance I could detect whether the related Passport exists and clear invalid_passport if it does. However, both branches of the if in before_save_instance are being executed.

Is this the wrong approach?

CodePudding user response:

One issue is that if the passport doesn't exist, then ForeignKeyWidget will raise a DoesNotExist exception and your row will not be imported and will be marked as an error. However, this might be ok for you, because you might only be interested in recording the rows with no errors.

So unless I am missing something, I can't see how before_save_instance() would be called, because the DoesNotExist exception is raised first, which means none of the save logic is called.

If you look at this line you can see that the exception is being added to an Error object, and then stored in the results for the import. This seems to be a good option for recording the invalid passport ids. In fact, the existing Error object would record the exception (DoesNotExist) and the row, so you would already have a record of the failing rows without any customisation. You could override either the Result or the Error object to have more control over this.

Another idea might be to override before_import() to do some pre-processing. In before_import(), you could load all existing passport UUIDs into memory and then scan the dataset to remove any rows whose passport UUID doesn't match one of them. This might not be practical if you have large tables.
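
As a rough illustration of that pre-scan, here is a minimal sketch in plain Python. It assumes the rows are available as a list of dicts and that the existing passport UUIDs have already been loaded into a set of strings; the helper names (normalise_uuid, split_rows) are made up for this example and are not part of django-import-export.

```python
import uuid


def normalise_uuid(value):
    """Return the canonical string form of a UUID, or None if malformed."""
    try:
        return str(uuid.UUID(str(value).strip()))
    except ValueError:
        return None


def split_rows(rows, known_ids, column="passport"):
    """Partition rows by whether their passport UUID is already known.

    rows: list of dicts, one per import row (a stand-in for the dataset).
    known_ids: set of canonical UUID strings loaded up front.
    """
    matched, unmatched = [], []
    for row in rows:
        key = normalise_uuid(row.get(column, ""))
        if key is not None and key in known_ids:
            matched.append(row)
        else:
            unmatched.append(row)
    return matched, unmatched


# Made-up sample data for illustration only.
known = {"12345678-1234-5678-1234-567812345678"}
rows = [
    {"id": "1", "passport": "12345678-1234-5678-1234-567812345678"},
    {"id": "2", "passport": "not-a-uuid"},
    {"id": "3", "passport": "87654321-4321-8765-4321-876543218765"},
]
matched, unmatched = split_rows(rows, known)
```

Here the unmatched list holds both malformed values and well-formed UUIDs with no matching passport, so those rows could be reported or handled separately instead of failing during the import.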

However, you do say that you want to record the missing UUID on the model, so you could introduce a custom ForeignKeyWidget subclass which doesn't raise a DoesNotExist exception, and maybe raises a ValidationError instead. In this case, the error will be stored in import_validation_errors. Then you could override validate_instance() to read the error back and store it on the model.
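
The core of such a lenient lookup could look something like the sketch below. It is a variation on the idea rather than the exact approach described: instead of raising ValidationError, it returns the raw value so the caller can copy it to invalid_passport. It is plain Python, with PassportNotFound and the in-memory PASSPORTS dict as hypothetical stand-ins for Passport.DoesNotExist and the real queryset; in an actual ForeignKeyWidget subclass, logic like this would presumably live in its clean() method.

```python
import uuid


class PassportNotFound(LookupError):
    """Hypothetical stand-in for Passport.DoesNotExist."""


def resolve_passport(raw_value, lookup):
    """Return (passport, invalid_value) for one imported cell.

    lookup: callable taking a uuid.UUID and returning the related object,
    or raising PassportNotFound. At most one element of the returned
    pair is non-None; both are None for an empty cell.
    """
    text = "" if raw_value is None else str(raw_value).strip()
    if not text:
        return None, None
    try:
        passport = lookup(uuid.UUID(text))
    except (ValueError, PassportNotFound):
        # Malformed UUID or no matching passport: keep the raw text
        # instead of failing the whole import row.
        return None, text
    return passport, None


# Hypothetical in-memory table standing in for the Passport queryset.
PASSPORTS = {uuid.UUID("12345678-1234-5678-1234-567812345678"): "passport-A"}


def lookup(key):
    try:
        return PASSPORTS[key]
    except KeyError:
        raise PassportNotFound(key)


found = resolve_passport("12345678-1234-5678-1234-567812345678", lookup)
missing = resolve_passport("87654321-4321-8765-4321-876543218765", lookup)
malformed = resolve_passport("abc", lookup)
```

With a result shaped like this, the first element would be assigned to passport and the second to invalid_passport, so the row is saved either way.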

CodePudding user response:

OK, I think I have a solution. It may not be perfectly organised, but I'll describe the tool that has helped me a lot with this kind of task: I think signals are the best way to achieve this.

Here are the steps in pseudo-code; you can adapt them to your own case:

  • Use pre_save to get the data that was posted in the request.
  • In the function you connect to the signal, open your CSV file.
  • Take the UUID and search for it in the CSV file.
  • Compare the data from the CSV file with the UUID.
  • If the data doesn't match any UUID (or whichever field you are comparing against), save that UUID in invalid_passport.
  • Otherwise, don't save it, and return a success response from your view.

You can read more about pre_save and signals here: https://docs.djangoproject.com/en/4.0/ref/signals/

Hint:
The quickest way to open the file and read its data is probably a pandas DataFrame. If you are familiar with pandas, you will get this done faster than by opening the file manually in Django, and lookups should also perform better, because pandas doesn't need an explicit Python loop to search through a lot of data.
