What is the reasoning for why table names cannot be substituted in PDO prepared statements?

This works as expected:

$st = $db->prepare('SELECT COUNT(1) WHERE EXISTS (SELECT * FROM ' . $table_name . ')');
$st->execute(array());

This throws a syntax error:

$st = $db->prepare('SELECT COUNT(1) WHERE EXISTS (SELECT * FROM :tblname )');
$st->execute(array('tblname' => $table_name));

As per the comments, "data can only be bound when it can be represented as a numeric or a quoted string literal. A table name can neither be a numeric or string literal."

What is the reasoning behind this?

CodePudding user response:

In SQL, table, column, and schema names cannot be provided as parameters to prepared statements. It's not a PDO thing, it's a language restriction of SQL itself.

Many RDBMSs let you prepare a statement once and then execute it multiple times with different data. This is efficient because the RDBMS software works out the query plan (the operations it must do to satisfy the query) just once, and then reuses it with different data. If table or column names were variables in prepared statements, the RDBMS software could not do that, because the plan depends on which tables and columns are involved. PDO supports multiple RDBMSs, so it inherits this restriction.

Prepared statements are also a security feature. When you provide parameters and use the statement, the data types of parameters get checked carefully. That would be much harder if table or column names could be parameters.

And, it's the way SQL was defined a generation ago. Data in RDBMSs lasts for decades.

CodePudding user response:

Parameters are not just a string substitution. That's actually what makes them good at preventing SQL injection vulnerabilities.

(Well, some clients emulate query parameters using string substitution. PDO has this behavior if you enable the PDO::ATTR_EMULATE_PREPARES attribute. But I'll talk about true parameters.)

The reason that parameters are not string substitution is that they are combined with the SQL query after it has been parsed by the database server. The parsing step validates that your SQL syntax is correct, and that the tables and columns you have named in the query in fact exist.

The only things that can be omitted from the parsing step are constant values in expressions. The parser sees one of the parameter placeholders and knows that the token acts as if you had used a single scalar constant value. Therefore it can proceed with the syntax validation.

The following example is not valid SQL syntax because you can't SELECT from a constant string:

SELECT COUNT(1) WHERE EXISTS (SELECT * FROM 'mytable' )

But that's how the syntax will be interpreted if the table name is passed as a parameter. The SQL engine doesn't know the exact value of the parameter, but it knows it should be treated as a string value.
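To see the rejection happen, here is a minimal sketch. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database with a hypothetical table called mytable; other drivers will word the error differently, and drivers that emulate prepares may only fail at execute().

```php
<?php
// Assumption: pdo_sqlite is installed; "mytable" is a made-up table name.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE mytable (id INTEGER PRIMARY KEY)');

$rejected = false;
try {
    // The placeholder sits where the grammar requires an identifier,
    // so the statement cannot be parsed.
    $st = $db->prepare('SELECT COUNT(1) WHERE EXISTS (SELECT * FROM :tblname)');
    $st->execute(['tblname' => 'mytable']);
} catch (PDOException $e) {
    $rejected = true;
    echo 'Rejected: ', $e->getMessage(), PHP_EOL;
}
```

The try block covers both prepare() and execute() because the point at which the error surfaces depends on the driver.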

This way, the query can be parsed once (during the prepare() call), and then once parsed, you can execute the prepared query multiple times, plugging in different values for each execution (the execute() call). Those parameter values cannot change the logic of the query, only the constant values.
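The parse-once, execute-many pattern can be sketched like this. It is only an illustration, assuming pdo_sqlite and a hypothetical users table; the placeholder stands in for a value, which is the one place a parameter is allowed.

```php
<?php
// Assumption: pdo_sqlite is installed; the "users" table and its rows are made up.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec("INSERT INTO users (name) VALUES ('alice'), ('bob'), ('alice')");

// Parsed once: the table and column names are fixed at this point.
$st = $db->prepare('SELECT COUNT(*) FROM users WHERE name = :name');

// Executed several times: only the constant value changes.
$counts = [];
foreach (['alice', 'bob', 'carol'] as $name) {
    $st->execute(['name' => $name]);
    $counts[$name] = (int) $st->fetchColumn();
    $st->closeCursor();
}
print_r($counts);
```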

And that's the benefit for protecting against SQL injection. SQL injection occurs when dynamic content is combined with a query before it is parsed, so the content can change the syntax and therefore the logic of the query, making it do something you didn't intend. But if parameters are combined after parsing, they can't affect the logic.
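A small sketch of the difference, again assuming pdo_sqlite and a hypothetical users table. The same hostile input is first concatenated into the SQL before parsing, then passed as a bound parameter.

```php
<?php
// Assumption: pdo_sqlite is installed; the "users" table and its rows are made up.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

$input = "nobody' OR '1'='1";

// Concatenated before parsing: the input becomes part of the syntax,
// so the WHERE clause turns into name = 'nobody' OR '1'='1'.
$unsafe = (int) $db->query(
    "SELECT COUNT(*) FROM users WHERE name = '" . $input . "'"
)->fetchColumn();

// Bound after parsing: the input is compared as one string value.
$st = $db->prepare('SELECT COUNT(*) FROM users WHERE name = :name');
$st->execute(['name' => $input]);
$safe = (int) $st->fetchColumn();

echo "concatenated: $unsafe, bound: $safe", PHP_EOL;
```

The concatenated query matches every row, while the bound version matches none, since no user is literally named nobody' OR '1'='1.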