Home > front end >  How to filter after two joins in flask SQL alchemy
How to filter after two joins in flask SQL alchemy

Time:03-26

I have the following 3 tables:

class Sample(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    ...
    all_prosampairs = db.relationship('Prosampair', order_by="Prosampair.order", backref='modifies_sample', lazy=True, cascade="all, delete", passive_deletes=True)

class Process(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    date = db.Column(db.DateTime)
    ...
    all_prosampairs = db.relationship('Prosampair', backref='in_process', lazy=True, cascade="all, delete", passive_deletes=True)

class Prosampair(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    order = db.Column(db.Integer, nullable=False)
    ...
    process_id = db.Column(db.Integer, db.ForeignKey('process.id', ondelete="CASCADE"), nullable=False)
    sample_id = db.Column(db.Integer, db.ForeignKey('sample.id', ondelete="CASCADE"), nullable=False)

Each sample has a chain of processes it goes through and I need to get a table which includes both Sample.id and the latest date it has encountered a process, so that I can filter by this date.

Something like this

What I tried so far was looking at this and similar solutions:

https://blog.miguelgrinberg.com/post/nested-queries-with-sqlalchemy-orm

But I can't quite comprehend how to deal with this third association table that I have yet.

 subqjoin1 = db.session.query(Sample).join(Prosampair).join(Process).subquery()
 subqjoin2 = db.session.query(subqjoin1, db.func.max(Process.date).label('latest_process')).group_by(Sample.id).subquery()
 result = db.session.query(subqjoin2).filter( subqjoin2.c.latest_process <= value ).all()

I am getting variations of

sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such column: sample.id

and I am afraid I don't quite grasp how to access the parameters that I want to. Two subqueries are probably excessive, but I struggle to figure out a better way

CodePudding user response:

Make sure to reference columns in the subquery off of .c, for example subq.c.latest_process.

with Session(engine) as session:
    latest_subq = session.query(
        Sample.id.label("sample_id"),
        func.max(Process.date).label("latest_process")
    ).outerjoin(
        Prosampair, Sample.id == Prosampair.sample_id
    ).outerjoin(
        Process, Prosampair.process_id == Process.id
    ).group_by(
        Sample.id
    ).subquery()

    q = session.query(Sample, latest_subq.c.latest_process).join(latest_subq, Sample.id == latest_subq.c.sample_id).order_by(latest_subq.c.latest_process)

    for sample, latest_process in q.all():
        print (sample.id, latest_process)

  • Related