How to execute SQL query with parameters in Pandas-CodePudding

I am using pyscopg2 and have a method that returns a list based on a SQL query. The below works fine...

def checkAnalysisStartDate(self):
    session = self.connection()
    cursor = session.cursor()
    ids = self.getAnalysisIds() # this is a list of integers
    cursor.execute("select start_date from analysis_run_history where analysis_id in %s", [tuple(ids)])
    final_result = [i[0] for i in cursor.fetchall()]

I want to pass the same list of integers but this time return a DataFrame from the results. When I try this though....

import pandas.io.sql as sqlio

def getAnalysisMetaStatsDF(self):
    session = self.connection()
    ids = self.getAnalysisIds() # this is a list of integers
    data = sqlio.read_sql_query("Select * from analysis_stats where analysis_id in %s", [tuple(ids)], session)
    print(data)

I get back...

AttributeError: 'list' object has no attribute 'cursor'

I think it's something with the way I am passing the parameters to read_sql_query() but I am not sure how to fix it

CodePudding user response：

Whenever you call a function make sure that the call matches its signature. You can look up the signature by running help(sqlio.read_sql_query), typing sqlio.read_sql_query? in IPython or in the documentation. You will find something like this:

Signature:
read_sql_query(
    sql,
    con,
    index_col=None,
    coerce_float: 'bool' = True,
    params=None,
    parse_dates=None,
    chunksize: 'int | None' = None,
    dtype: 'DtypeArg | None' = None,
) -> 'DataFrame | Iterator[DataFrame]'

Notice that the second parameter is the database connection object, not the parameters of the query. In your case a call like the following is more appropriate:

sqlio.read_sql_query(query_string, session, params = ids)

CodePudding user response：

import mysql.connector
from mysql.connector import Error
from sqlalchemy import create_engine
import pandas as pd

engine = create_engine("mysql://root:admin@localhost/pythondb", pool_size=10, max_overflow=20)

table = pd.read_sql("select *From emp", engine)