Alternative to using `regroup` in Django template on `ListView` using a lot of memory

Time:09-30

I have tried using the following to regroup a ListView queryset in my template so that objects are grouped by a related field's value:

<div >
    {% regroup object_list by related_field.sort_value as grouped_list %}
    {% for group in grouped_list %}
        <div >  
        <span>... {{group.grouper}} ...</span>
        {% for item in group.list %}
            <p>{{item.related_field.name}}</p>
            <p>{{item.description}}</p>
        {% endfor %}
        </div>
    {% endfor %}
</div>

The model has about a dozen fields, two of them relational (the first related model has a couple dozen fields, the other just a few), and there are several hundred objects in the ListView queryset.

View:

    class BigListView(ListView):
        model = MyModel
        template_name = 'my_model/big_list_view.html'
        def get_queryset(self):
            return MyModel.objects.order_by('related_model_b__sort_value')

Models:

class MyModel(models.Model):

    code = models.CharField(max_length=30, unique=True)
    price = models.DecimalField(max_digits=8, decimal_places=2)
    sale_price = models.DecimalField(max_digits=8, decimal_places=2)
    sale_price_quantity_3_5 = models.DecimalField(max_digits=8, decimal_places=2, blank=True, null=True)
    sale_price_quantity_6 = models.DecimalField(max_digits=8, decimal_places=2, blank=True, null=True)
    description = models.CharField(max_length=40)
    location = models.CharField(max_length=20)
    quantity_on_hand = models.IntegerField()
    size = models.CharField(max_length=20)
    tag_description = models.CharField(max_length=100)

    color = ColorField(help_text='Theme Color')

    image = models.ImageField(upload_to=assign_product_sku, blank=True, null=True)

    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    related_model_a = models.ForeignKey(RelatedModelA, related_name='mymodels', on_delete=models.CASCADE)

    related_model_b = models.ForeignKey('RelatedModelB', related_name='mymodels', on_delete=models.CASCADE, blank=True, null=True)

class RelatedModelA(index.Indexed, models.Model):

    sku = models.CharField(max_length=20, unique=True)
    name = models.CharField(max_length=100)
    alt_name = models.CharField(max_length=100, blank=True)
    description = models.TextField(blank=True)
    age = models.CharField(max_length=3, blank=True)
    size = models.CharField(max_length=100, blank=True)
    weight = models.CharField(max_length=100, blank=True)
    shape = models.CharField(max_length=100, blank=True)
    style = models.CharField(max_length=100, blank=True)
    rate = models.CharField(max_length=100, blank=True)
    attribute_a = models.CharField(max_length=100, blank=True)
    attribute_b = models.CharField(max_length=100, blank=True)
    attribute_c = models.CharField(max_length=100, blank=True)
    attribute_d = models.CharField(max_length=100, blank=True)
    attribute_e = models.CharField(max_length=100, blank=True)
    attribute_f = models.CharField(max_length=100, blank=True)
    attribute_g = models.CharField(max_length=100, blank=True)
    attribute_h = models.CharField(max_length=100, blank=True)
    attribute_i = models.CharField(max_length=100, blank=True)
    attribute_j = models.CharField(max_length=100, blank=True)
    attribute_k = models.CharField(max_length=100, blank=True)
    attribute_l = models.CharField(max_length=100, blank=True)
    attribute_m = models.CharField(max_length=100, blank=True)
    attribute_n = models.CharField(max_length=100, blank=True)
    attribute_o = models.CharField(max_length=100, blank=True)
    attribute_p = models.CharField(max_length=100, blank=True)
    attribute_q = models.CharField(max_length=40, blank=True)
    attribute_r = models.CharField(max_length=100, blank=True)
    attribute_s = models.CharField(max_length=100, blank=True)
    comment = models.CharField(max_length=100, blank=True)

    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    image_url = models.URLField(blank=True)

    slug = models.SlugField(max_length=100, blank=True)
    tags = TaggableManager(blank=True)

class RelatedModelB(models.Model):
    code = models.CharField(max_length=30, unique=True)
    sort_value = models.IntegerField()
    last_retrieved = models.DateTimeField(auto_now=True)
    json_data = models.JSONField(blank=True, null=True)

Viewing this on my local dev machine causes no problems, but hitting the view on production (Heroku) crashes the app because of excess memory usage (~2GB). The app currently runs on a Hobby dyno (512MB), and while I will likely bump this up to Standard 1x/2x, that is still well short on memory unless I move to a higher-performance dyno, which is overkill for everything except this single view.

What can I do to reduce this view's memory usage?

CodePudding user response:

For starters, you access related_field (related_model_b/related_model_a?) for each item in your queryset, which performs a DB query each time. Use select_related() to fetch the related data in a single query:

MyModel.objects.select_related('related_model_b', 'related_model_a').order_by(...)
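
The difference between per-item lookups and a single joined query can be sketched outside Django with the standard-library sqlite3 module. This is only an illustration of the query pattern, not Django's ORM: the table names `my_model` and `related_b`, the `run` helper, and the sample rows are assumptions made up for the example.

```python
import sqlite3

# Hypothetical two-table schema standing in for MyModel and RelatedModelB.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE related_b (id INTEGER PRIMARY KEY, sort_value INTEGER);
    CREATE TABLE my_model (
        id INTEGER PRIMARY KEY,
        description TEXT,
        related_b_id INTEGER REFERENCES related_b (id)
    );
""")
conn.executemany("INSERT INTO related_b VALUES (?, ?)", [(1, 10), (2, 20)])
conn.executemany(
    "INSERT INTO my_model VALUES (?, ?, ?)",
    [(i, f"item {i}", 1 + i % 2) for i in range(1, 7)],
)

counter = {"queries": 0}


def run(sql, params=()):
    """Execute one statement and count it as one database round trip."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


# Lazy access: one query for the list, then one more query per row.
lazy_rows = []
for _id, description, b_id in run(
    "SELECT id, description, related_b_id FROM my_model ORDER BY id"
):
    sort_value = run("SELECT sort_value FROM related_b WHERE id = ?", (b_id,))[0][0]
    lazy_rows.append((description, sort_value))
lazy_queries = counter["queries"]

# Joined access: the related column arrives in the same single query.
counter["queries"] = 0
joined_rows = run(
    "SELECT m.description, b.sort_value FROM my_model AS m "
    "JOIN related_b AS b ON b.id = m.related_b_id ORDER BY m.id"
)
joined_queries = counter["queries"]

print(lazy_queries, joined_queries)
```

With six rows the lazy version issues seven statements and the joined version one, while both return the same data.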

Because you already sort by related_field.sort_value, you can get away with using the ifchanged tag instead of regroup if regroup is a massive problem:

    {% for item in object_list %}
        {% ifchanged item.related_field.sort_value %}<span>{{ item.related_field.sort_value }}</span>{% endifchanged %}
        <p>{{item.related_field.name}}</p>
        <p>{{item.description}}</p>
    {% endfor %}
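
To make the contrast concrete, here is a rough plain-Python analogy of the two template approaches. It is a sketch, not Django's implementation: `regroup_like` and `ifchanged_like` are hypothetical helper names, and the sample `items` are invented. The first builds a full list of groups up front; the second walks the already-sorted rows once and emits a header whenever the key changes.

```python
from itertools import groupby

# Invented sample rows, already sorted by the grouping key.
items = [
    {"sort_value": 1, "description": "a"},
    {"sort_value": 1, "description": "b"},
    {"sort_value": 2, "description": "c"},
    {"sort_value": 3, "description": "d"},
    {"sort_value": 3, "description": "e"},
]


def by_sort_value(row):
    return row["sort_value"]


def regroup_like(rows, key):
    """Materialize every group as (grouper, list_of_rows) before rendering."""
    return [(grouper, list(group)) for grouper, group in groupby(rows, key=key)]


def ifchanged_like(rows, key):
    """Stream rows, yielding a header only when the key differs from the last one."""
    last = object()  # sentinel that never equals a real key
    for row in rows:
        value = key(row)
        if value != last:
            yield ("header", value)
            last = value
        yield ("item", row["description"])


grouped = regroup_like(items, by_sort_value)
streamed = list(ifchanged_like(items, by_sort_value))
print([(g, [r["description"] for r in rows]) for g, rows in grouped])
print(streamed)
```

Both approaches depend on the input being sorted by the grouping key, and both still need the rows themselves in memory, so the saving here is limited to the intermediate grouped structure.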

CodePudding user response:

I ended up with a combination of Iain's suggestion to use select_related and also using defer() to exclude some of the fields in my related models that aren't necessary for this view.

Specifically, I believe (but haven't tested) that the JSONField attached to a related model is the culprit, as this field normally holds just a couple kB of data, but in some cases it's holding ~400kB.

I changed the queryset in my view to the following:

class BigListView(ListView):
    model = MyModel

    template_name = 'my_model/big_list.html'

    def get_queryset(self):
        return MyModel.objects.select_related('related_model_a','related_model_b').order_by('related_model_b__sort_value').defer(
            'related_model_b__json_data',
            ... additional fields ...
        )

  1. Original memory usage was nearly 2GB.
  2. Using select_related only reduced that to ~800MB.
  3. Using select_related and defer() reduced it to ~200MB, which is no different from what the app typically uses for any given operation.
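
Why deferring one large column matters can be approximated with plain sqlite3 by leaving that column out of the SELECT list. This is a sketch of the idea rather than a measurement of the real app: the `related_b` table, the `fetched_chars` helper, and the 400,000-character payload are assumptions chosen to mirror the ~400kB JSON values mentioned above.

```python
import json
import sqlite3

# Hypothetical table standing in for RelatedModelB, with JSON stored as text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE related_b ("
    "id INTEGER PRIMARY KEY, code TEXT, sort_value INTEGER, json_data TEXT)"
)
big_payload = json.dumps({"payload": "x" * 400_000})  # roughly 400 kB of text
small_payload = json.dumps({"payload": "small"})
conn.execute("INSERT INTO related_b VALUES (1, 'B-1', 10, ?)", (big_payload,))
conn.execute("INSERT INTO related_b VALUES (2, 'B-2', 20, ?)", (small_payload,))


def fetched_chars(sql):
    """Total characters across all column values returned by the query."""
    return sum(len(str(value)) for row in conn.execute(sql) for value in row)


# Loading every column pulls the large JSON text into Python.
full = fetched_chars("SELECT id, code, sort_value, json_data FROM related_b")

# Omitting json_data is the SQL-level effect of deferring that field.
deferred = fetched_chars("SELECT id, code, sort_value FROM related_b")

print(full, deferred)
```

In the real view the gap is likely larger than the raw text size suggests, since each loaded JSON value is also parsed into Python objects for every row that references it.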