Django - difference in time between big serializer and small

I'm creating a music rating app and I'm using Django REST Framework to build the API. It's super easy, but I'm wondering whether there is any big difference in loading time between a big serializer and a small one. By big and small I mean how much data the serializer returns. For instance, I have an album page that uses this serializer:

"id": 2,
"title": "OK Computer",
"slug": "ok-computer",
"created_at": "2022-02-22T21:51:52.528148Z",
"artist": {
    "id": 13,
    "name": "Radiohead",
    "slug": "radiohead",
    "image": "http://127.0.0.1:8000/media/artist/images/radiohead.jpg",
    "background_image": "http://127.0.0.1:8000/media/artist/bg_images/radiohead.jpg",
    "created_at": "2022-02-22T00:00:00Z"
},
"art_cover": "http://127.0.0.1:8000/media/album/art_covers/ok-computer_cd5Vv6U.jpg",
"genres": [
    "Alternative Rock",
    "Art Rock"
],
"overall_score": null,
"number_of_ratings": 0,
"release_date": "1997-05-28",
"release_type": "LP",
"tracks": [
    {
        "position": 1,
        "title": "Airbag",
        "duration": "00:04:47"
    },
    {
        "position": 2,
        "title": "Paranoid Android",
        "duration": "00:06:27"
    }
],
"links": [
    {
        "service_name": "spotify",
        "url": "https://open.spotify.com/album/6dVIqQ8qmQ5GBnJ9shOYGE?si=L_VNH3HeSMmGBqfiqKiGWA"
    }
],
"aoty": null

This serializer is rather massive, and I only need all of this data on the album details page. I also pull the same data on the albums list page, where I list all my albums and almost none of it is used. If I make another, less complex serializer and use it on the albums list page, will there be a drastic difference in load speed?

And if so, can I make a ViewSet where the less complex serializer is used when I access my /albums API URL and the more complex serializer is used when I access a more specific URL like /albums/1?

CodePudding user response：

Since you are concerned about how fast the objects load, there is another way to improve performance: the kind of serializer you use. There are several options to choose from:

ModelSerializer (Slower)
Read Only ModelSerializer (A little bit faster than ModelSerializer)
Regular Serializer Read Only
Regular Serializer (almost 60% faster than ModelSerializer)

This is because a writable ModelSerializer spends a lot of time on validation, so you can make it faster by marking all fields as read-only.

Many articles have been written about serialization performance in Python. As expected, most of them focus on improving DB access with techniques like select_related and prefetch_related. While both are valid ways to improve the overall response time of an API request, they don't address the serialization itself.

And yes, you should go for multiple serializers instead of one big nested one.

CodePudding user response：

It depends. Usually, limiting the response to the data that the frontend/users actually need is good practice. Of course, those needs will vary across your views and pages on the frontend. One way to handle that is to provide a different serializer for different views or query params, for instance by using the get_serializer_class method on your viewset, or inside the serializer itself using the request object. Also, if I remember correctly, there is an extension that lets you choose which fields you want returned.
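As a rough sketch of the get_serializer_class idea, a small mixin can pick the serializer from the viewset's current action. The mixin name, the serializer_action_classes attribute, and the two serializer classes are all hypothetical names made up for this example, and a stub base class stands in for DRF's ModelViewSet so the snippet runs without Django installed:

```python
class SerializerByActionMixin:
    """Pick a serializer per viewset action, falling back to serializer_class."""

    # Hypothetical attribute (not part of DRF): maps action name -> serializer class.
    serializer_action_classes = {}

    def get_serializer_class(self):
        # On DRF viewsets, self.action is "list", "retrieve", "create", etc.
        try:
            return self.serializer_action_classes[self.action]
        except KeyError:
            # No override for this action: use the normal serializer_class.
            return super().get_serializer_class()


# Stand-ins so the sketch runs on its own; in a project these would be
# real DRF serializers and rest_framework.viewsets.ModelViewSet.
class AlbumListSerializer:
    """Placeholder for a small serializer (id, title, slug, art_cover, ...)."""


class AlbumDetailSerializer:
    """Placeholder for the full nested serializer from the question."""


class StubViewSet:
    serializer_class = None
    action = None

    def get_serializer_class(self):
        return self.serializer_class


class AlbumViewSet(SerializerByActionMixin, StubViewSet):
    serializer_class = AlbumDetailSerializer                   # /albums/1
    serializer_action_classes = {"list": AlbumListSerializer}  # /albums


viewset = AlbumViewSet()
viewset.action = "list"
list_choice = viewset.get_serializer_class()
viewset.action = "retrieve"
detail_choice = viewset.get_serializer_class()
```

In a real project you would presumably drop the stub and write class AlbumViewSet(SerializerByActionMixin, viewsets.ModelViewSet), so that /albums gets the small serializer and /albums/1 gets the detailed one.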

A DRF serializer isn't only doing serialization, strictly speaking: you can redefine fields, you can have method fields and, obviously, relational fields.

Most of the time, fields like IntegerField, CharField, etc. will not cause issues even when serializing lots of data, because they are straightforward. But related fields (ForeignKey, ManyToMany ...) can cause problems if you don't prefetch them: each relationship and nested relationship will trigger a new database query for every item.

For instance, in your example you have albums and tracks. If you don't prefetch the tracks when fetching the albums, you will issue an extra query for each album in your queryset! This is because the serializer tries to serialize each field, and when it reaches the tracks field, Django fetches the related objects from the database. This will be noticeable and will not scale at all, even with a small dataset.
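To make the query count concrete, here is a framework-free simulation using Python's built-in sqlite3 module. It is not Django code: the table names, sample rows, and helper functions are assumptions for illustration. The "lazy" version runs one query per album, while the "prefetched" version fetches all tracks in a single extra query with an IN clause and groups them in Python, which is roughly what prefetch_related does:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE track (id INTEGER PRIMARY KEY, album_id INTEGER,
                        position INTEGER, title TEXT);
    INSERT INTO album VALUES (1, 'OK Computer'), (2, 'Kid A'), (3, 'In Rainbows');
    INSERT INTO track (album_id, position, title) VALUES
        (1, 1, 'Airbag'), (1, 2, 'Paranoid Android'),
        (2, 1, 'Everything in Its Right Place'),
        (3, 1, '15 Step');
""")

query_count = 0


def run(sql, params=()):
    """Execute a query and count it, so the two strategies can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def albums_lazy():
    # One query for the albums, then one more query per album (N+1).
    result = []
    for album_id, title in run("SELECT id, title FROM album ORDER BY id"):
        tracks = run(
            "SELECT title FROM track WHERE album_id = ? ORDER BY position",
            (album_id,),
        )
        result.append({"title": title, "tracks": [t for (t,) in tracks]})
    return result


def albums_prefetched():
    # One query for the albums, one query for all their tracks.
    albums = run("SELECT id, title FROM album ORDER BY id")
    ids = [album_id for album_id, _ in albums]
    placeholders = ",".join("?" * len(ids))
    rows = run(
        f"SELECT album_id, title FROM track "
        f"WHERE album_id IN ({placeholders}) ORDER BY position",
        ids,
    )
    by_album = {}
    for album_id, title in rows:
        by_album.setdefault(album_id, []).append(title)
    return [{"title": t, "tracks": by_album.get(i, [])} for i, t in albums]


query_count = 0
lazy = albums_lazy()
lazy_queries = query_count          # 1 album query + 3 track queries

query_count = 0
prefetched = albums_prefetched()
prefetched_queries = query_count    # 1 album query + 1 track query
```

With three albums the lazy version needs four queries and the prefetched one needs two; with a thousand albums the gap becomes 1001 versus 2. In Django the equivalent would be something like Album.objects.select_related("artist").prefetch_related("tracks") in the viewset's queryset, assuming the relations are named artist and tracks as the JSON in the question suggests.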

Another way to deal with these problems is pagination. While it can still sometimes be slow, it lets you return only a subset of your database, so there is only a limited number of items to serialize.
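In DRF, pagination can be switched on globally from settings.py. A minimal fragment might look like this; the page size of 20 is an arbitrary value picked for the example, not something from the question:

```python
# settings.py (fragment)
REST_FRAMEWORK = {
    # Built-in DRF paginator that reads the page number from ?page=N.
    "DEFAULT_PAGINATION_CLASS": "rest_framework.pagination.PageNumberPagination",
    # Number of albums returned per page (assumed value, tune to your UI).
    "PAGE_SIZE": 20,
}
```

With this in place, list endpoints such as /albums should return one page of results at a time instead of the whole table.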

To summarize: it depends on what you serialize. Slowness usually comes from not prefetching data or from method fields. Try to use pagination as much as you can, and yes, you can use a different serializer if you know it will return far less data (giving a faster response and less data on the wire).

Note: when dealing with really complex objects where you don't want to maintain a large number of serializers, you could use GraphQL, where the client knows what it wants and asks the backend for just those fields.