Home > front end >  SQLAlchemy group by day in timestamp column
SQLAlchemy group by day in timestamp column

Time:12-14

I have an ORM (Model) defined in SQLAlchemy as below:

class StoreView(Base):
    __tablename__ = 'store_views'

    id = Column(Integer, primary_key=True)
    store_id = Column(Integer)
    started_from = Column(TIMESTAMP)
    end_to = Column(TIMESTAMP)
    average_watch_time = Column(Float)
    total_watch_time = Column(Float)
    total_views = Column(Float)

I'm planning to get the sum of all views in each day and tried to group the results based on their end_to. I have used the following query in sqlalchemy:

result = session.query(
                StoreView
                ).filter(
                    StoreView.started_from > from_date,
                    StoreView.end_to < to_date,
                    StoreView.store_id==5
                ).group_by( sa.func.year(StoreView.end_to), sa.func.month(StoreView.end_to)).all()

But this query throws this error:

(psycopg2.errors.UndefinedFunction) function year(timestamp without time zone) does not exist
HINT:  No function matches the given name and argument types. You may need to add explicit type casts.

I am using the timestamp in my model and I don't plan to change it because of some reason. The only thing I can do is modify the query. The SQLAlchemy is connected to AWS Redshift.

CodePudding user response:

If you wanted to group by the number of views per day in Postgresql the query would look like this (WHERE clause omitted):

SELECT end_to::date AS date, COUNT(*) AS views    
FROM store_views    
GROUP BY end_to::date    
ORDER BY date DESC;

The trick to dealing with the timestamp is to cast it to the date type, which truncates the value to the date part. In SQLAlchemy the equivalent code would be

with Session() as s:    
    result = s.query(    
        sa.cast(StoreView.end_to, sa.Date).label('date'),    
        sa.func.count().label('views'),    
    ).filter(    
    ).group_by(    
        sa.cast(StoreView.end_to, sa.Date),    
    ).order_by(    
        sa.text('date desc')    
    )    
    for row in result:    
        print(row) 

Resulting in values like

(datetime.date(2021, 5, 4), 1)
(datetime.date(2021, 5, 3), 1)
(datetime.date(2021, 5, 2), 2)
...

The equivalent SQLAlchemy 2.0 style query would be

with Session() as s:
    q = sa.select(
        sa.cast(StoreView.end_to, sa.Date).label('date'),
        sa.func.count().label('views'),
    ).where(
        StoreView.started_from > from_date,
        StoreView.end_to < to_date,
        StoreView.store_id == 5,
    ).group_by(
        sa.cast(StoreView.end_to, sa.Date),
    ).order_by(
        sa.text('date desc')
    )
    result = s.execute(q)
  • Related