i've got a python 3.8 script that uses the libraries pyobdc and concurrent.futures to query on a DB.
NB i want to clerify that i know i can have more than one connection and i know for sure that it's happening. what i cannot understand is where my bottleneck is
the structure is: I have one script (call it queries) that is imported in the second script (call it main) as a module. queries is a collection of functions, all with the same structure:
def getquery1():
connection = pyobdbc.connect('connection string')
query = 'SELECT * FROM table'
with connection:
result_dataset = pd.read_sql(con=connection,sql=query)
connection.close()
return result_dataset
in main I import that file as a module. then i recall all the data as:
import queries
from concurrent.futures import ThreadPoolExecutor
import pyodbc
executor = ThreadPoolExecutor()
futureQuery1 = executor.submit(queries.getquery1)
futureQuery2 = executor.submit(queries.getquery2)
futureQuery3 = executor.submit(queries.getquery3)
result1 = futureQuery1.result()
result2 = futureQuery2.result()
result3 = futureQuery3.result()
the infrastructure is: i have this script that runs on a linux machine and connects to a microsoft SQL. the driver used is FreeTDS on SQL i got this parameters:
print(connection.getinfo(pyodbc.SQL_MAX_DRIVER_CONNECTIONS)) --> 0 (unlimited connections)
print(connection.getinfo(pyodbc.SQL_MAX_CONCURRENT_ACTIVITIES)) --> 1
print(connection.getinfo(pyodbc.SQL_MAX_ASYNC_CONCURRENT_STATEMENTS)) --> 1
the thing is:
if query1 and query 2 are on the same connection, those two queries are executed at the same time? or query2 must wait the end of query1?
and if query3 connect to another database can that query be executed at the same time of the first two?
where i can find those parameters?
i'm usually in the case where the main script runs at the same time from two different machines that connects to the same docker
USEFULL LINKS:
concurrent.futures https://docs.python.org/3/library/concurrent.futures.html
sql get info: https://docs.microsoft.com/en-us/sql/odbc/reference/syntax/sqlgetinfo-function?view=sql-server-ver15
CodePudding user response:
There is a protocol to allow a single connection to have multiple concurrent queries running at the same time. It's called Multiple Active Result Sets, and I have no idea if FreeTDS uses it, but the Microsoft ODBC Driver can.
However the execution of mutliple queries on a single connection is interleaved not concurrent. If you want two queries to actually execute at the same time, use two connections.