Home > Mobile >  How can I pass parameters for the IN clause of a SELECT query to retrieve a pandas DataFrame?
How can I pass parameters for the IN clause of a SELECT query to retrieve a pandas DataFrame?

Time:03-11

Trying to pass params to a simple SELECT query :

query = """SELECT * from tbl_tab where name in {};"""

I went through this code to manipulate my dataframe and the idea is to be able to pass 1 to n params into my query :

conn = pyodbc.connect(conx_string)
t = tuple(["N","M"])
crsr = conn.cursor()
data = crsr.execute(query.format(t))
rows = [list(x) for x in data]
columns = [column[0] for column in crsr.description]
df = pd.DataFrame(rows, columns=columns)

And I got the expected result. However, when passing a single param to t : having t = tuple(["N"])

I got the error ProgrammingError: ('42000', "[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]Incorrect syntax near ')'. (102) (SQLExecDirectW)")

Any idea please ?

CodePudding user response:

You are using the string representation of a tuple to inject into your SQL statement. It works for tuple(["N","M"]) because you get … IN ('N', 'M'). However, it fails for tuple(["N"]) because the result is … IN ('N',) and the trailing comma is not valid SQL syntax. Consider an approach that avoids SQL injection and uses a parameterized query.

First, if you are using pandas with any database other than SQLite you should be using SQLAlchemy. SQLAlchemy will automatically build the required SQL statement when you pass the select() object to pandas .read_sql_query() method:

import pandas as pd
import sqlalchemy as sa

engine = sa.create_engine(
    "mssql pyodbc://scott:tiger^5HHH@mssql_199",
)
tbl_tab = sa.Table("tbl_tab", sa.MetaData(), autoload_with=engine)

# full contents of table:
with engine.begin() as conn:
    print(conn.execute(sa.select(tbl_tab)).fetchall())
    # [(1, 'Alicia'), (2, 'Brandon'), (3, 'Candace')]

# retrieve subset of rows into a DataFrame
name_list = ["Alicia", "Brandon"]
my_select = sa.select(tbl_tab).where(tbl_tab.c.name.in_(name_list))
df = pd.read_sql_query(my_select, engine)
print(df)
"""
   id     name
0   1   Alicia
1   2  Brandon
"""
  • Related