Suppose you have this model:
from django import models
from django.contrib.postgres.indexes import BrinIndex
class MyModel(model.Models):
device_id = models.IntegerField()
timestamp = models.DateTimeField(auto_now_add=True)
my_value = models.FloatField()
class Meta:
indexes = (BrinIndex(fields=['timestamp']),)
There is a periodic process that creates an instance of this model every 2 minutes or so. This process is supposed to run for years, with multiple devices, so this table will contain a great number of records.
My goal is, for each day when there are records, to get the first and last records in that day.
So far, what I could come up with is this:
from django.db.models import Min, Max
results = []
device_id = 1 # Could be other device id, of course, but 1 for illustration's sake
# This will get me a list of dictionaries that have first and last fields
# with the desired timestamps, but not the field my_value for them.
first_last = MyModel.objects.filter(device_id=device_id).values('timestamp__date')\
.annotate(first=Min('timestamp__date'),last=Max('timestamp__date'))
# So now I have to iterate over that list to get the instances/values
for f in first_last:
first = f['first']
last = f['last']
first_value = MyModel.objects.get(device=device, timestmap=first).my_value
last_value = MyModel.objects.get(device=device, timestamp=last).my_value
results.append({
'first': first,
'last': last,
'first_value': first_value,
'last_value': last_value,
})
# Do something with results[]
This works, but takes a long time (about 50 seconds on my machine, retrieving first and last values for about 450 days).
I have tried other combinations of annotate()
, values()
, values_list()
, extra()
etc, but this is the best I could come up with so far.
Any help or insight is appreciated!
CodePudding user response:
You can take advantage of .distinct()
if you are using PostgreSQL as DBMS.
first_models = MyModel.objects.order_by('timestamp__date', 'timestamp').distinct('timestamp__date')
last_models = MyModel.objects.order_by('timestamp__date', '-timestamp').distinct('timestamp__date')
first_last = first_models.union(last_models)
# do something with first_last
One more things need to be mentioned: first_last
might eliminate duplicate when there is only one record for a date. It should not be a problem for you, but if it does, you can iterate first_models
and last_models
separately.