Why do GCC and Clang produce different output with this conforming C code:
int (puts) (); int (main) (main, puts) int main;
char *puts[(&puts) (&main["\0April 1"])]; <%%>
Neither compiler produces any warning or error, even with -Wall -std=c18 -pedantic, but the program produces no output when built with GCC and prints the current date when built with Clang.
CodePudding user response:
Why do GCC and Clang produce different output with this conforming C code:
int (puts) (); int (main) (main, puts) int main; char *puts[(&puts) (&main["\0April 1"])]; <%%>
In the first place, it is conforming code, though it does make use of a variable-length array, which is an optional language feature in C11 and C17. Some of the obfuscations are:
- use of the obscure digraphs <% and %>, which mean the same thing as { and }, respectively
- parenthesizing the function identifiers in function declarations
- a forward declaration of function puts that is not a prototype
- a K&R-style definition of function main
  - with a VLA parameter
    - whose dimension expression contains a function call
    - and a reference to another parameter
- use of unconventional identifiers for the parameters to function main()
- use of the identifiers puts and main for parameters (objects) as well as for the functions of the same names
- use of the identifier main for something more than the program's entry-point function
- inversion of the conventional order of the operands of the indexing operator ([])
  - plus, indexing a string literal
- calling a function via an explicit function pointer constant expression
- a string literal with an explicit null character within
- unconventional placement (and omission) of line breaks
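A few of the tricks listed above can be checked in isolation. The following is a minimal sketch, not part of the original program; the function names are hypothetical and chosen only for illustration.

```c
#include <string.h>

/* Digraphs: <% and %> are alternative spellings of { and }. */
int index_is_commutative(int i)
<%
    const char *s = "\0April 1";
    /* a[b] is defined as *(a + b), so i[s] and s[i] name the same element. */
    return &i[s] == &s[i];
%>

/* The character selected by indexing the literal "backwards",
   in the style of main["\0April 1"] in the obfuscated code. */
char char_at(int i)
{
    return i["\0April 1"];
}

/* Calling through an explicit pointer to a function:
   (&f)(x) behaves the same as f(x). */
size_t length_via_pointer(const char *s)
{
    return (&strlen)(s);
}
```

For example, char_at(1) yields 'A', because index 0 of the literal holds the explicit null character.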
A less obfuscated equivalent would be
int puts();
int main(
int argc,
char *argv[ puts("\0April 1" + argc) ]
) {
}
But the central question about the difference in behavior between the version compiled with GCC and the one built with Clang comes down to whether the expression for the size of the VLA function parameter is evaluated at runtime.
The language spec says that when a function parameter is declared with array type, its type is "adjusted" to the corresponding pointer type. That applies equally to complete, incomplete, and variable-length array types, but the spec does not explicitly say that the expression(s) for the dimension(s) are not evaluated. It does specify that expressions go unevaluated in certain other cases, and it even makes an exception to such a rule in the case of sizeof expressions involving VLAs, so the omission in this case could be interpreted as meaningful.
That makes a difference only for parameters of VLA type, because only for those can evaluation of the dimension expression(s) produce side effects on the machine state, including, but not limited to, observable program behavior.
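Whether a particular compiler evaluates such a dimension expression can be probed with a visible side effect. This is a minimal sketch under stated assumptions, not the original program: it uses a prototype-style definition rather than a K&R-style one, and the names size_evaluations, counted, takes_vla_param and probe_size_evaluations are hypothetical. The count it reports may differ between compilers and versions, so no expected value is claimed here.

```c
/* Number of times the dimension expression below has been evaluated. */
static int size_evaluations = 0;

/* Returns n unchanged, recording that it was called. */
static int counted(int n)
{
    size_evaluations++;
    return n;
}

/* v is declared with a variable-length array type, which is adjusted
   to char **. Its dimension expression calls counted(), so evaluating
   it has an observable side effect. */
static void takes_vla_param(int n, char *v[counted(n)])
{
    (void)v;
}

/* Calls the function once with a positive size and reports how many
   times the dimension expression ran during that call. */
int probe_size_evaluations(void)
{
    char *dummy[1] = { 0 };
    size_evaluations = 0;
    takes_vla_param(1, dummy);
    return size_evaluations;
}
```

Building the same file with each compiler and comparing the returned count shows which of the two behaviors that compiler implements for this form of definition.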
GCC does not evaluate the VLA parameter's size expression at runtime, and I am inclined to take this as conforming to the intent of the standard. As a result, the GCC-compiled program does nothing but exit with status 0.
Clang does evaluate the VLA parameter's size expression at runtime. Although I disfavor this interpretation of the spec, I cannot rule it out. When Clang evaluates the size expression, it uses the passed value of the first parameter. When the program is run without arguments, the first parameter has value 1, with the result that the standard library's puts function is called with a pointer to the 'A' in "\0April 1".
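The pointer that reaches puts can be reproduced without any VLA. A minimal sketch with a hypothetical helper name; argc stands in for the parameter that the obfuscated code calls main.

```c
/* &argc["\0April 1"] is the same address as "\0April 1" + argc.
   Meaningful only for 0 <= argc <= 8, the indices inside the 9-byte
   literal (leading null, "April 1", terminating null). */
const char *date_text(int argc)
{
    return &argc["\0April 1"];
}
```

With no command-line arguments argc is 1, so puts(date_text(1)) prints April 1, whereas date_text(0) points at the explicit null character and is an empty string.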
CodePudding user response:
int (puts) ();
int (main) (main, puts)
int main;
char *puts[(&puts) (&main["\0April 1"])];
{
}
Somebody's got a compiler bug; I'm just not sure who anymore. I don't understand why any compiler would emit code to evaluate the size expression of a VLA that is declared as a parameter.
The Clang output is rather bizarre. For it to work, Clang would have had to find main in the function's scope but puts in the global scope, despite having already encountered the declaration for puts. Normally, you can access a variable in its own declaration.
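One possible reading of the scope rules would account for that lookup: an identifier's scope begins only once its declarator is complete, so a name used inside the array brackets of its own declarator still refers to the outer declaration, while an initializer comes after the declarator and sees the new name. A minimal sketch with hypothetical names, using an enumeration constant rather than a function so that no VLA is involved:

```c
#include <stddef.h>

enum { size = 3 };  /* file-scope identifier named size */

/* Inside the brackets the inner `size` is not yet in scope, so the
   dimension is the enumeration constant 3. */
size_t inner_array_length(void)
{
    int size[size];
    return sizeof size / sizeof size[0];
}

/* In an initializer the declarator is already complete, so `self`
   refers to the object being declared. */
size_t self_reference(void)
{
    size_t self = sizeof self;
    return self;
}
```

Here inner_array_length() returns 3, which mirrors how the puts inside the parameter's brackets could still name the file-scope function.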
If somebody did this in production code, my answer would instead be: "Stop using K&R function definitions."