Home > front end >  Oracle In-Clause Parameter Padding
Oracle In-Clause Parameter Padding

Time:04-12

Suppose an in-clause can accept a varying number of bind parameters. In this case, databases can have a hard time caching the query. Basically, each time a different number of bind parameters is passed, the query needs to be hard parsed. Enter "parameter padding". Parameter padding will take an in-clause and increase the number of binds to the closest 2^x number of binds.

Examples:

  • select count(*) from user where id in (1, 2, 3) becomes select count(*) from user where id in (1, 2, 3, 3)
  • select count(*) from user where id in (1, 2, 3, 4) remains select count(*) from user where id in (1, 2, 3, 4)

And now these two queries can share the same cached plan.

Question: Is there a logical reason for 2^x binds? Why not 3^x or 5^x so that even fewer hard parses are required? This would be especially helpful in queries that contain multiple in-clauses with varying binds.

The specific database in question is Oracle 12c. Using stats for a query that has an in-clause of in (1, 2, 3, 3) shows that the duplicate values do not appear in the execution plan. Furthermore, using stats for a query requiring 30 binds runs just as efficiently when using an in-clause of the exact 30 values needed OR using an in-clause with 100 values where the last value appears 70 more times.

CodePudding user response:

There's a tradeoff:

If you choose, say, 5^x, then for 7 parameters your IN-list will have 25 members, instead of just 8. The query will then take longer to run - the fact that the tail values are all equal won't help.

Note that your example of an explain plan for the IN-list of (1,2,3,3) is irrelevant. That has hard-coded values, not bind variables. The relevant example is (:bind1, :bind2, :bind3, :bind4); when the query is parsed, the optimizer can't assume that :bind3 will always equal :bind4 (for the obvious reason that that's not even true in general).

2^x is usually a good tradeoff between "how many hard parses to allow" and "how fast the queries will be". Otherwise you could just use a single query, with 1000 parameters (the max allowed) - why even have more than ONE such query?

  • Related