UUIDField vs CharField with a UUID in it?

Time:07-16

I've been using UUIDs stored in a CharField as primary keys for many models in a project I'm working on, and it works fine with no issues. If I switch to a UUIDField, some things in the backend have trouble with it (usually functions that expect the UUID to be a string).

Is there any advantage of using a UUIDField vs just having 'default=uuid.uuid4' in a CharField?

CodePudding user response:

Is there any advantage of using a UUIDField vs just having default=uuid.uuid4 in a CharField?

Yes.

On a database that has a native UUID type, such as PostgreSQL [postgresql-doc], a UUIDField stores the value as a 128-bit quantity (16 bytes). That is more compact than storing the hex digits as a string, which takes at least 32 characters, or 256 bits, so twice the space.

If you use CharField(max_length=36, default=uuid.uuid4), you even need 36 characters, because the default string form of a UUID puts four dashes between the groups of hex digits. The dashes sit in the same positions in every value and carry no information, so they waste additional storage.
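The size difference can be checked with the standard library alone. This is a minimal sketch that only measures the Python-side representations; the actual on-disk size depends on the database and its encoding, and the example UUID value is arbitrary.

```python
import uuid

# A fixed example value so the sizes are reproducible.
u = uuid.UUID("12345678-1234-5678-1234-567812345678")

raw_size = len(u.bytes)   # the 128-bit value itself, as raw bytes
hex_size = len(u.hex)     # hex digits only, no dashes
str_size = len(str(u))    # default string form, with four dashes

print(raw_size, hex_size, str_size)  # 16 32 36
```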

Apart from storage there are other advantages. On databases without a native UUID type the value is stored as exactly 32 hex characters, and a value passed as a string must actually be in a valid UUID format. The field thus validates the format properly.

It can also parse several string formats into the same UUID. For example, {12345678-1234-5678-1234-567812345678}, 12345678123456781234567812345678 and urn:uuid:12345678-1234-5678-1234-567812345678 are all parsed to the same value. This is not the case with a plain CharField, which treats them as three different strings. So if you query with:

MyModel.objects.filter(pk='urn:uuid:12345678-1234-5678-1234-567812345678')

It will retrieve the record with that UUID, even though the value is stored in the database in a different format.
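The parsing behaviour can be tried outside Django with the standard library's uuid.UUID, which, as far as I know, is what the field relies on when it converts string input. The sketch below only shows that the three spellings compare equal as UUID objects while remaining distinct as strings; it does not touch a database.

```python
import uuid

candidates = [
    "{12345678-1234-5678-1234-567812345678}",
    "12345678123456781234567812345678",
    "urn:uuid:12345678-1234-5678-1234-567812345678",
]

# uuid.UUID accepts braces, a urn:uuid: prefix, and optional dashes.
parsed = [uuid.UUID(text) for text in candidates]

all_equal = len(set(parsed)) == 1        # one distinct UUID value
distinct_strings = len(set(candidates))  # but three distinct strings

print(all_equal, distinct_strings, parsed[0])
```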

The values that it retrieves from the database are wrapped in a UUID type. This is "richer" than a string, and will prevent making mistakes. For example adding two UUIDs makes no sense. Indeed:

>>> uuid4() + uuid4()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'UUID' and 'UUID'

whereas for strings, it would result in concatenating the two strings.
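A small sketch of that difference, and of how to deal with backend functions that expect a string: convert explicitly with str() at the call site. The function legacy_lookup is a hypothetical stand-in for such a function, not something from Django or from the question.

```python
import uuid

a = uuid.UUID("12345678-1234-5678-1234-567812345678")
b = uuid.UUID("87654321-4321-8765-4321-876543218765")

# Adding two UUID objects is rejected.
try:
    a + b
    add_failed = False
except TypeError:
    add_failed = True

# Adding their string forms silently produces a meaningless 72-character string.
joined = str(a) + str(b)


def legacy_lookup(key: str) -> str:
    """Hypothetical helper that only works with str keys."""
    return key.upper()


# Explicit conversion at the boundary keeps such helpers working.
result = legacy_lookup(str(a))
```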

Furthermore, it maps to a UUIDField [Django-doc] as form field, which makes it more convenient to hook a different widget to this type of field and thus render a form differently.

It thus provides more context about what type of data is expected, and better context usually means that modules that introspect a model can do a better job.

CodePudding user response:

The UUIDField is essentially a 32-character CharField that validates that a valid UUID has been provided. According to the documentation, it is stored as a uuid type in Postgres and as a char(32) in any other DB.

https://docs.djangoproject.com/en/4.0/ref/models/fields/#uuidfield

I don't see why you couldn't just use a CharField with default=uuid.uuid4 (which you need on a UUIDField anyway). You just lose the automatic validation.
