I'm setting up an import process for a model which has a ForeignKey relationship, and I've got an extra field on the model where I'd like to store the provided value if it isn't a valid ID in the related table. So if a Passport object doesn't exist for the UUID in the imported file, the value from the file gets stored in the invalid_passport field. This way it can be viewed via the Django admin and potentially used after the import process is complete.
The IDs in question here are UUIDs, so there's a good chance that someone along the line will provide some invalid data.
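import uuid

from django.db import models
from django.utils.translation import gettext_lazy as _


class Result(models.Model):
    id = models.UUIDField(
        primary_key=True,
        default=uuid.uuid4,
        editable=False
    )
    # Nullable link to the matching Passport, if one exists
    passport = models.ForeignKey(
        to='results.Passport',
        null=True,
        blank=True,
        on_delete=models.SET_NULL
    )
    # Raw value from the import file when no Passport matches
    invalid_passport = models.CharField(
        max_length=255,
        blank=True,
        null=True,
        help_text=_("An invalid passport number")
    )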
The import file will contain a passport column for the UUID of the related object, but I'd like to save the provided value to invalid_passport if it doesn't match any Passport instances.
My resource looks like this:
from import_export import fields, resources
from import_export.instance_loaders import CachedInstanceLoader
from import_export.widgets import ForeignKeyWidget

from .models import Passport, Result  # module path assumed


class ResultResource(resources.ModelResource):
    """ Integrate django-import-export with the Result model """
    passport = fields.Field(
        column_name='passport',
        attribute='passport',
        widget=ForeignKeyWidget(Passport, 'id')
    )

    class Meta:
        instance_loader_class = CachedInstanceLoader
        model = Result
        fields = (
            'id',
            'passport',
            'invalid_passport'
        )

    def init_instance(self, row=None):
        """ Initializes a new Django model. """
        instance = self._meta.model()
        # A freshly created instance never has a passport yet
        if not instance.passport:
            instance.invalid_passport = row['passport']
        return instance

    def before_save_instance(self, instance, using_transactions, dry_run):
        """ Sets the verified flags where there is a passport match. """
        if instance.passport:
            instance.verified = True
            instance.verified_by = self.import_job.author
            instance.invalid_passport = None
        else:
            pass
So when the instance is created, the value of the Passport ID is stored on invalid_passport. My theory was that in before_save_instance I could detect whether the related Passport exists, except both sides of the if are being executed in before_save_instance.
Is this the wrong approach for this?
CodePudding user response:
One issue is that if the passport doesn't exist, then ForeignKeyWidget will raise a DoesNotExist exception, and your row will not be imported and will be marked as an error. However, this might be OK for you, because you might only be interested in recording the rows with no errors.
So unless I am missing something, I can't see how before_save_instance() would be called, because the DoesNotExist exception is raised first, which means none of the save logic is called.
If you look at this line you can see that the exception is being added to an Error object, and then stored in the results for the import. This seems to be a good option for recording the invalid passport IDs. In fact, the existing Error object would record the exception (DoesNotExist) and the row, so you would already have a record of the failing rows without any customisation. You could override either import-export's Result class (not your Result model) or its Error class to have more control over this.
Another idea might be to override before_import() to do some pre-processing. In before_import(), you could load all existing passport UUIDs into memory and then scan the dataset to remove any rows whose passport doesn't exist. This might not be practical if you have large tables.
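The scan itself can be sketched in plain Python, independent of Django. This is only an illustration: partition_rows is a hypothetical helper, rows are assumed to be dictionaries keyed by column name, and known_ids is assumed to be a set of uuid.UUID values (for example built from Passport.objects.values_list('id', flat=True)). Empty cells are treated as fine here because the passport field is nullable.

```python
import uuid


def partition_rows(rows, known_ids, column="passport"):
    """Split rows into (matched, unmatched) by whether row[column] is a known UUID.

    Hypothetical helper: `rows` is a list of dicts, `known_ids` a set of uuid.UUID.
    """
    matched, unmatched = [], []
    for row in rows:
        raw = str(row.get(column) or "").strip()
        if raw == "":
            # No passport supplied; nothing to validate for a nullable FK.
            matched.append(row)
            continue
        try:
            key = uuid.UUID(raw)
        except ValueError:
            # Not even a well-formed UUID.
            key = None
        if key is not None and key in known_ids:
            matched.append(row)
        else:
            unmatched.append(row)
    return matched, unmatched


known = {uuid.UUID("12345678-1234-5678-1234-567812345678")}
rows = [
    {"passport": "12345678-1234-5678-1234-567812345678"},
    {"passport": "not-a-uuid"},
]
kept, rejected = partition_rows(rows, known)
print(len(kept), len(rejected))
```

Inside before_import() the kept rows would replace the dataset contents, and the rejected rows could be logged or have their values copied to invalid_passport rather than being dropped.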
However, you do say that you want to record the missing UUID on the model, so you could introduce a custom ForeignKeyWidget subclass which doesn't raise a DoesNotExist exception, and maybe raises a ValidationError instead. In this case, the error will be stored in import_validation_errors. Then you could override validate_instance() to read the error back and store it on the model.
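The control flow of that idea can be sketched without Django. Everything here is a stand-in, not the import-export API: PassportNotFound plays the role of ValidationError, clean_passport is roughly what a custom widget's clean() might do, assign_passport is roughly what the overridden validation step might do, and find_passport is a hypothetical callable that returns the object or None.

```python
import uuid
from types import SimpleNamespace


class PassportNotFound(ValueError):
    """Stand-in for ValidationError; keeps the offending raw value."""

    def __init__(self, value):
        super().__init__(f"No passport matches {value!r}")
        self.value = value


def clean_passport(value, find_passport):
    """Return the passport, None for an empty cell, or raise PassportNotFound."""
    raw = str(value or "").strip()
    if raw == "":
        return None
    try:
        key = uuid.UUID(raw)
    except ValueError:
        raise PassportNotFound(raw) from None
    passport = find_passport(key)  # hypothetical lookup, returns None if missing
    if passport is None:
        raise PassportNotFound(raw)
    return passport


def assign_passport(instance, value, find_passport):
    """Set instance.passport, or record the raw value on invalid_passport."""
    try:
        instance.passport = clean_passport(value, find_passport)
        instance.invalid_passport = None
    except PassportNotFound as exc:
        instance.passport = None
        instance.invalid_passport = exc.value


# Usage with an in-memory lookup standing in for the database.
known = {uuid.UUID("12345678-1234-5678-1234-567812345678"): "passport-A"}
result = SimpleNamespace(passport=None, invalid_passport=None)
assign_passport(result, "garbage", known.get)
print(result.invalid_passport)
```

Carrying the raw value on the exception is a design choice made for this sketch so the later step does not need to re-read the row.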
CodePudding user response:
OK, I think I have a solution. It may not be perfectly ordered, but signals are the tool that has helped me most with this kind of task, so I think they are the best way to achieve it.
Here are the steps in pseudo-code; you can adapt them to your own approach:
- use pre_save to get the data that has been posted in the request
- in the function that handles the signal, open your CSV file
- take the UUID and search for it in the CSV file
- compare the data that comes from the CSV file with the UUID
- if the data does not match any UUID (or whichever field you want to compare with), save that UUID in invalid_passport
- otherwise, don't save it, and return a success response from your view
You can read more about pre_save and signals here: https://docs.djangoproject.com/en/4.0/ref/signals/
Hint:
The quickest way to open the file and read its data is probably a pandas DataFrame. If you are familiar with pandas, you will get this done faster than by opening the file manually in Django, and searching should also perform better, because pandas does not need an explicit Python loop to search through a lot of data.