Home > Net >  peewee - select only latest record for each foreign key
peewee - select only latest record for each foreign key

Time:04-29

I basically have the same problem as here. However, I want to implement it in Python 3.9 using the peewee framework.

Here's a simplified excerpt of the models in question:

from datetime import datetime
from peewee import DateTimeField, ForeignkeyField, Model


class System(Model):
    ...


class CheckResults(Model):
    system = ForeignKeyField(System, column_name='system', on_delete='CASCADE')
    timestamp = DateTimeField(default=datetime.now)

What I tried so far is:

from peewee import fn

CheckResults.select().join(t2 := CheckResults.select(CheckResults.system, max_timestamp := fn.MAX(CheckResults.timestamp)).group_by(CheckResults.system), on=((t2.system == CheckResults.system) & (t2.timestamp == CheckResults.timestamp)))

Which results in:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'ModelSelect' object has no attribute 'system'

How would I implement such a compound query in peewee?

CodePudding user response:

Check the docs here: http://docs.peewee-orm.com/en/latest/peewee/hacks.html#top-object-per-group

Here we'll get each user and the most recent tweet they made:

# When referencing a table multiple times, we'll call Model.alias() to create
# a secondary reference to the table.
TweetAlias = Tweet.alias()

# Create a subquery that will calculate the maximum Tweet created_date for each
# user.
subquery = (TweetAlias
            .select(
                TweetAlias.user,
                fn.MAX(TweetAlias.created_date).alias('max_ts'))
            .group_by(TweetAlias.user)
            .alias('tweet_max_subquery'))

# Query for tweets and join using the subquery to match the tweet's user
# and created_date.
query = (Tweet
         .select(Tweet, User)
         .join(User)
         .switch(Tweet)
         .join(subquery, on=(
             (Tweet.created_date == subquery.c.max_ts) &
             (Tweet.user == subquery.c.user_id))))

SQLite and MySQL are a bit more lax and permit grouping by a subset of the columns that are selected. This means we can do away with the subquery and express it quite concisely:

query = (Tweet
         .select(Tweet, User)
         .join(User)
         .group_by(Tweet.user)
         .having(Tweet.created_date == fn.MAX(Tweet.created_date)))
  • Related