Efficiently get first and last model instances in Django Model with timestamp, by day


Suppose you have this model:

from django.db import models
from django.contrib.postgres.indexes import BrinIndex

class MyModel(models.Model):
  device_id = models.IntegerField()
  timestamp = models.DateTimeField(auto_now_add=True)
  my_value = models.FloatField()

  class Meta:
    indexes = (BrinIndex(fields=['timestamp']),)

There is a periodic process that creates an instance of this model every 2 minutes or so. This process is supposed to run for years, with multiple devices, so this table will contain a great number of records.

My goal is, for each day when there are records, to get the first and last records in that day.

So far, what I could come up with is this:

from django.db.models import Min, Max

results = []
device_id = 1 # Could be other device id, of course, but 1 for illustration's sake

# This will get me a list of dictionaries that have first and last fields 
# with the desired timestamps, but not the field my_value for them.

first_last = MyModel.objects.filter(device_id=device_id).values('timestamp__date')\
    .annotate(first=Min('timestamp'), last=Max('timestamp'))

# So now I have to iterate over that list to get the instances/values
for f in first_last:

    first = f['first']
    last = f['last']

    first_value = MyModel.objects.get(device_id=device_id, timestamp=first).my_value
    last_value = MyModel.objects.get(device_id=device_id, timestamp=last).my_value

    results.append({
        'first': first,
        'last': last,
        'first_value': first_value,
        'last_value': last_value,
    })

# Do something with results[]

This works, but takes a long time (about 50 seconds on my machine, retrieving first and last values for about 450 days).
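
A rough way to see where the time goes: the loop above issues two extra get() queries per day on top of the initial aggregate query. The sketch below only counts them. The function name is hypothetical, and the per-query figure is just the average implied by the numbers quoted above, not a measurement.

```python
def query_count(days: int) -> int:
    """Queries issued by the loop: one aggregate query plus two get() calls per day."""
    return 1 + 2 * days


days = 450            # number of days quoted above
total_seconds = 50.0  # total run time quoted above

queries = query_count(days)
avg_ms = total_seconds / queries * 1000  # average cost per query, in milliseconds

print(f"{queries} queries, roughly {avg_ms:.0f} ms each on average")
```

So the cost grows linearly with the number of days, which is why fetching the rows in one or two queries is likely to help.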

I have tried other combinations of annotate(), values(), values_list(), extra() etc, but this is the best I could come up with so far.

Any help or insight is appreciated!

CodePudding user response:

If you are using PostgreSQL, you can take advantage of .distinct() with field names, which Django only supports on that database.

first_models = MyModel.objects.order_by('timestamp__date', 'timestamp').distinct('timestamp__date')
last_models = MyModel.objects.order_by('timestamp__date', '-timestamp').distinct('timestamp__date')
first_last = first_models.union(last_models)

# do something with first_last

One more thing worth mentioning: union() removes duplicate rows, so a date with only one record will appear once in first_last rather than twice. That should not be a problem for you, but if it is, you can iterate over first_models and last_models separately.
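
Whichever way the rows are fetched, they still have to be paired up per day. Here is a minimal sketch in plain Python, assuming each row exposes timestamp and my_value attributes as in the model above. first_last_by_day is a hypothetical helper, and the SimpleNamespace rows only stand in for model instances (for a single device you would presumably also filter the querysets by device_id).

```python
from datetime import datetime
from types import SimpleNamespace


def first_last_by_day(rows):
    """Group rows by calendar day and keep the earliest and latest row of each day.

    Assumes row.timestamp.date() gives the same day the database used for
    timestamp__date (i.e. both sides work in the same time zone).
    """
    by_day = {}
    for row in rows:
        day = row.timestamp.date()
        first, last = by_day.get(day, (row, row))
        if row.timestamp < first.timestamp:
            first = row
        if row.timestamp > last.timestamp:
            last = row
        by_day[day] = (first, last)

    results = []
    for day in sorted(by_day):
        first, last = by_day[day]
        results.append({
            'first': first.timestamp,
            'last': last.timestamp,
            'first_value': first.my_value,
            'last_value': last.my_value,
        })
    return results


# Stand-in rows; in practice these would come from first_last or from
# first_models and last_models iterated one after the other.
rows = [
    SimpleNamespace(timestamp=datetime(2023, 1, 1, 23, 58), my_value=2.5),
    SimpleNamespace(timestamp=datetime(2023, 1, 1, 0, 0), my_value=1.0),
    SimpleNamespace(timestamp=datetime(2023, 1, 2, 12, 0), my_value=7.0),
]
summary = first_last_by_day(rows)
```

Because the helper keys on the date and compares timestamps, it gives the same result whether a single-record day shows up once (as in the union) or twice (when both querysets are iterated).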
