Suppose an in-clause can accept a varying number of bind parameters. In this case, databases can have a hard time caching the query. Basically, each time a different number of bind parameters is passed, the query needs to be hard parsed. Enter "parameter padding". Parameter padding will take an in-clause and increase the number of binds to the closest 2^x number of binds.
Examples:
select count(*) from user where id in (1, 2, 3)
becomesselect count(*) from user where id in (1, 2, 3, 3)
select count(*) from user where id in (1, 2, 3, 4)
remainsselect count(*) from user where id in (1, 2, 3, 4)
And now these two queries can share the same cached plan.
Question: Is there a logical reason for 2^x binds? Why not 3^x or 5^x so that even fewer hard parses are required? This would be especially helpful in queries that contain multiple in-clauses with varying binds.
The specific database in question is Oracle 12c. Using stats for a query that has an in-clause of in (1, 2, 3, 3)
shows that the duplicate values do not appear in the execution plan. Furthermore, using stats for a query requiring 30 binds runs just as efficiently when using an in-clause of the exact 30 values needed OR using an in-clause with 100 values where the last value appears 70 more times.
CodePudding user response:
There's a tradeoff:
If you choose, say, 5^x, then for 7 parameters your IN-list will have 25 members, instead of just 8. The query will then take longer to run - the fact that the tail values are all equal won't help.
Note that your example of an explain plan for the IN-list of (1,2,3,3) is irrelevant. That has hard-coded values, not bind variables. The relevant example is (:bind1, :bind2, :bind3, :bind4); when the query is parsed, the optimizer can't assume that :bind3 will always equal :bind4 (for the obvious reason that that's not even true in general).
2^x is usually a good tradeoff between "how many hard parses to allow" and "how fast the queries will be". Otherwise you could just use a single query, with 1000 parameters (the max allowed) - why even have more than ONE such query?