Home > database >  How to loop through queries and add the results to a single dataframe?
How to loop through queries and add the results to a single dataframe?

Time:06-02

I have a Flask application that currently makes a connection to a remote server and then a particular DB. I have a function that takes the users date range input and then queries data based upon the given dates.

from flask import make_response, redirect, render_template, session, url_for, Blueprint,request
import pandas as pd
import pymssql
from core.database import ConnectionFactory
from core.query import *

dat_a = Blueprint('dat_a', __name__)

ButtonPressed = 0
@dat_a.route('/download', methods=['POST','GET'])
def download():
    if session.get('logged_in') != True:
        return redirect(url_for('views.index'))
    else:
        if request.method == 'POST':
            return render_template('data.html')
        else:
            Start = request.args.get('Start Date')
            End = request.args.get('End Date')
            connection = pymssql.connect(ConnectionFactory.SQL_HOST, ConnectionFactory.SQL_USER_xxx, ConnectionFactory.SQL_PW_xxx,ConnectionFactory.SQL_DB_xxx)
            cur=connection.cursor()
            query_1 = """
            select some column from table
            from xx
             where date BETWEEN (%s) AND (%s)
            """

            query_2 = """
            select some column from table     
            """
            results = []
            q_list = [query_1,query_2]
            for query in q_list:
                cur.execute(query,(Start,End))
                results.append(cur)
                print(results)
            columns = [d[0] for d in cur.description]
            data = pd.DataFrame(cur.fetchall(),columns=columns)
            print(data)
            resp = make_response(data.to_csv())
            resp.headers["Content-Disposition"] = "attachment; filename=Data.csv"
            resp.headers["Content-Type"] = "text/csv"
            return resp

The problem I'm having is getting my for loop to store the data from query_1 into a list and then move it to my dataframe. Currently data from query_2 will be stored into the list and then pushed to the dataframe, but that's inaccurate as it's just fetching random data at that point.

results = []
q_list = [query_1,query_2]
for query in q_list:
    cur.execute(query,(Start,End))
    results.append(cur)
    print(results)

columns = [d[0] for d in cur.description]

data = pd.DataFrame(cur.fetchall(),columns=columns)

I have tried a nested for loop that calls upon each query separately, has it's own connection, and dataframe but that didn't change the results either. Is there a more efficient way to go about this with pandas?

CodePudding user response:

You might be looking for something like this...

cur = connection.cursor()
results = []
for query, args in [
    ("select some_column from xx where date BETWEEN (%s) AND (%s)", (Start, End)),
    ("select some_column from yy", ()),
]:
    cur.execute(query, args)
    results.extend(cur.fetchall())
    # Will be overwritten on each iteration, but it's fine as long as the columns are the same for each query
    columns = [d[0] for d in cur.description]  
data = pd.DataFrame(results, columns=columns)
  • Related