I am querying an Oracle Database via a python script.
The python script will:
- Select one large list of 'ITEM_ID's. The list will be dynamic. Sometimes it could contain a few ITEM_ID's, other times it may contain thousands.
- Pass the list into the WHERE clause in an SQL stored as a string within the python script. So something like:
'''
sql =
'''
select
item_id,
from
item_table
where
item_id in {0}
'''.format(mylist)
Issue: In Oracle when executing the above where clause, the the list can't contain more than a 1000 items. This is a known limitation. So, one could write a SQL like so:
select
item_id,
from
item_table
where
(
item_id in (list or 1000 items or less)
or
item_id in (list or 1000 items or less)
or
item_id in (list or 1000 items or less)
or
...
)
So, in order to overcome the limitation and re-write the SQL string, I can 'chunk' down the original list into individual lists of 1000 items or less, so that I can execute a query. In python I 'chunk' the list down like this:
if len(mylists) > 1000:
chunks = [mylist[i:i 1000] for i in xrange(0, len(mylist), 1000)]
Now I can iterate over the 'chunks' and get individual lists of a 1000 ITEM_ID's or less:
for i in chunks:
print tuple(i) # need to cast each individual list to tuple so the brackets are rounded which is how Oracle accepts lists
Ok, so I have an approach to process the list into 1000 items or less, but now I need to construct the 'WHERE' clause string so that it is dynamic. Given that the SQL will look like this:
select
item_id,
from
item_table
where
(
item_id in (first chunked lsit)
or
item_id in (second chunked list)
or
...
)
I was thinking about doing something like:
sql =
'''
select
item_id,
from
item_table
where
(
{0}
)
'''.format(mywhereclause)
which means I need to take the chunks of lists I created above, concatenate the 'IN' operator and the 'OR' operator and make sure I do it in a dynamic way so that the amount of 'IN LIST or IN LIST or IN LIST...' grows or shrinks based on how many chunked lists occur.
I feel like this isn't overly difficult, however I have been trying numerous methods and I am running out of steam. I am hoping I have clearly explained what I am looking to do. But to summarize, I just need the WHERE CLAUSE within the string representation of the SQL to dynamically accept lists of a 1000 items. Any suggestions are appreciated!
CodePudding user response:
you might need to concatenate the chunks together to form all "or" conditions
all_conditions = list()
for chunk in chunks:
condition = "item_id in {0}".format(chunk)
all_conditions.append(condition)
all_conditions_str = " or ".join(all_conditions)
sql = '''
select
item_id,
from
item_table
where
{0}
'''.format(all_conditions_str)
CodePudding user response:
The first thing to do is read the manual: Binding Multiple Values to a SQL WHERE IN Clause.
For very large lists, that (almost certainly) vary each time you execute this statement, you might want to use an object as shown in the doc.
Here is a similar example:
with connection.cursor() as cursor:
try:
sql = """select last_name from employees where employee_id in (select * from table(:1))"""
type_obj = connection.gettype("SYS.ODCINUMBERLIST") # use SYS.ODCIVARCHAR2LIST, SYS.ODCIDATELIST for varchar2 & dates
obj = type_obj.newobject()
obj.extend([100,101,103]) # these are the IN list values.
res = cursor.execute(sql, [obj]).fetchall()
print(res)
For smaller lists, then try one of the more performant other methods. Your code could implement several solutions, and choose the most appropriate one depending on the size of the IN list.