Django: converting model objects to dictionaries in large volumes results in a server timeout

I have been having a problem where a Django server takes forever to return a response. When running with gunicorn on Heroku I get a timeout, so I never receive the response. If I run it locally, it takes a while, but eventually the site is displayed correctly.

Basically I have two models:

class Entry(models.Model):
    #Some stuff charFields, foreignKey, TextField and ManyToMany fields
    #Eg of m2m field:
    tags = models.ManyToManyField(Tag, blank=True)

    def getTags(self):
        # Collect the name of every tag attached to this entry
        return [tag.name for tag in self.tags.all()]

    def convertToDict(self):
        #for the many to many field I run the getter
        return {'id': self.id, 'tags' : self.getTags(), ... all the other fields ... }

class EntryState(models.Model):
    entry = models.ForeignKey(Entry, on_delete=models.CASCADE)
    def convertToDict(self):
        temp = self.entry.convertToDict()
        temp['field1'] = self.field1
        temp['field2'] = self.field1 + self.field3  # operator was lost in the original post; '+' is assumed
        return temp

Then I have a view:

def myView(request):
    entries = list(EntryState.objects.filter(field1__lt=10))

    dictData = {'entries' : []}
    for entry in entries:
        dictData['entries'].append(entry.convertToDict())

    context = {'dictData': dictData}
    return render(request, 'my_template.html', context)

The way it is done, dictData contains the information for all the entries that must be shown on the site. This variable is then loaded into JavaScript code that decides how to display the information.

The view gets stuck in the for loop when there are a lot of entries with field1 < 10.

What can be done to improve this approach? With 600 entries it already takes long enough for gunicorn to time out.

In case it is relevant, I am using Django 4.0.3 and Python3. The data is used in the frontend by Alpine.js to render the site.

CodePudding user response：

At first glance, your main problem is most likely that you're hitting the database twice for each EntryState instance.

The convertToDict method follows the entry FK (one query), and for each entry it also fetches the M2M tags (a second query). The solution is to optimize the query.

First, let's identify the problem. If you put this code just before the end of the view, the console will show how many times you're hitting the database. Note that connection.queries is only populated when DEBUG is True.

def myView(request):
    ....
    from django.db import connection
    print(f'QUERY: {len(connection.queries)}')  # number of queries run so far in this request
    return render(request, 'my_template.html', context)

Now, try to improve the query in the view.

query = EntryState.objects.filter(field1__lt=10)  # Assuming the filter in the question stands for a valid lookup like this

You can use select_related for the Entry FK so that it is retrieved in the same query via a JOIN (saves N hits to the DB).

query = EntryState.objects.filter(field1__lt=10).select_related('entry')

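You can also add prefetch_related to retrieve the tags for ALL entries in one extra query. The lookup follows the field names: entry, then tags. getTags benefits automatically, because self.tags.all() is then served from the prefetch cache.

query = EntryState.objects.filter(field1__lt=10).select_related('entry').prefetch_related('entry__tags')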

After each change, check the number of DB hits again; you should see it drop as the problem gets fixed.
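To make the difference concrete, here is a rough sketch of the query pattern at the SQL level. It uses Python's built-in sqlite3 module rather than Django, and the table and column names (entry, entrystate, entry_tags, title) are hypothetical stand-ins for the question's models. The naive function mimics what happens without any optimization; the optimized one mimics the effect of select_related (a JOIN) plus prefetch_related (one extra IN query).

```python
import sqlite3


def build_db(n_entries=5):
    """Create an in-memory database shaped roughly like the question's models."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE entry (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE entry_tags (entry_id INTEGER, tag_id INTEGER);
        CREATE TABLE entrystate (id INTEGER PRIMARY KEY, entry_id INTEGER, field1 INTEGER);
    """)
    conn.executemany("INSERT INTO tag VALUES (?, ?)", [(1, "a"), (2, "b")])
    for i in range(1, n_entries + 1):
        conn.execute("INSERT INTO entry VALUES (?, ?)", (i, f"entry {i}"))
        conn.execute("INSERT INTO entrystate VALUES (?, ?, ?)", (i, i, 1))
        conn.executemany("INSERT INTO entry_tags VALUES (?, ?)", [(i, 1), (i, 2)])
    conn.commit()
    return conn


def naive(conn):
    """One query for the states, then two more per row (entry + tags): 1 + 2N."""
    queries = 0
    states = conn.execute(
        "SELECT entry_id, field1 FROM entrystate WHERE field1 < 10 ORDER BY id"
    ).fetchall()
    queries += 1
    result = []
    for entry_id, field1 in states:
        title = conn.execute(
            "SELECT title FROM entry WHERE id = ?", (entry_id,)
        ).fetchone()[0]
        queries += 1
        tags = [row[0] for row in conn.execute(
            "SELECT t.name FROM tag t JOIN entry_tags et ON et.tag_id = t.id "
            "WHERE et.entry_id = ? ORDER BY t.id", (entry_id,))]
        queries += 1
        result.append({"id": entry_id, "title": title, "tags": tags, "field1": field1})
    return result, queries


def optimized(conn):
    """A JOIN for the FK plus a single IN query for all tags: 2 queries in total."""
    queries = 0
    rows = conn.execute(
        "SELECT s.entry_id, e.title, s.field1 FROM entrystate s "
        "JOIN entry e ON e.id = s.entry_id WHERE s.field1 < 10 ORDER BY s.id"
    ).fetchall()
    queries += 1
    entry_ids = [row[0] for row in rows]
    placeholders = ",".join("?" * len(entry_ids))
    tags_by_entry = {}
    for entry_id, name in conn.execute(
            "SELECT et.entry_id, t.name FROM entry_tags et "
            "JOIN tag t ON t.id = et.tag_id "
            f"WHERE et.entry_id IN ({placeholders}) ORDER BY t.id", entry_ids):
        tags_by_entry.setdefault(entry_id, []).append(name)
    queries += 1
    result = [
        {"id": entry_id, "title": title, "tags": tags_by_entry.get(entry_id, []), "field1": field1}
        for entry_id, title, field1 in rows
    ]
    return result, queries
```

Both functions build the same list of dictionaries, but the naive one needs 1 + 2N queries while the optimized one needs 2 regardless of N. With the 600 entries from the question, that would be 1201 queries versus 2.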

For more info, please read about query optimization, select_related and prefetch_related.

You can also use some apps to monitor the performance of your views: django-silk, django-debug-toolbar

Source: https://docs.djangoproject.com/en/4.0/ref/models/querysets/#select-related

CodePudding user response：

You are attempting to render a template containing huge amounts of data.

In this case, I think you should consider splitting the template and the entries into two different request types: you send the template first, and then your Alpine.js code loads the entries in batches. (Edit: I mean in batches of 50 or so; please don't make a request for each entry...)

For this you can use either a ready-made REST framework application for Django or deploy your own solution.

Of course, that way the user will not see any data right after the template loads; they have to wait for the second request. A more advanced solution is to render only the first 100 entries or so, which the user sees immediately, and to use your new REST API for the remaining data. You could also use the Alpine.js Intersect plugin for lazy loading; it will help your overall performance as well!
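If you go for your own solution, the batching logic itself can be quite small. Below is a minimal sketch in plain Python; the function names split_initial and get_batch, the next_offset key, and the default sizes are all hypothetical choices, not part of any framework. In a real project, something like get_batch would sit inside a view that returns JSON, and the frontend would keep requesting batches until next_offset is None.

```python
def split_initial(entries, first=100):
    """Return (entries rendered with the template, entries left for later requests)."""
    return entries[:first], entries[first:]


def get_batch(entries, offset=0, limit=50):
    """Return one batch plus the offset of the next batch (None when nothing is left)."""
    batch = entries[offset:offset + limit]
    has_more = offset + limit < len(entries)
    return {"entries": batch, "next_offset": offset + limit if has_more else None}
```

The same slicing works on a Django queryset, where it is translated into LIMIT/OFFSET in SQL, so only the requested batch is converted to dictionaries on each request.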