Home > other >  gcc-11 incorrect may be used uninitialized that seems hard to avoid
gcc-11 incorrect may be used uninitialized that seems hard to avoid

Time:11-02

I found a particular usage pattern that seems completely ok and no compiler has ever complained on before. Now it raises a warning with gcc-11: A close to minimal example is below. Save as t.c and compile using gcc-11 -O2 -Wall -c t.c.

#include <stdlib.h>
#include <string.h>

extern void f(const char *s);

void
t(int len, char *d)
{ char tmp[500];
  char *s, *o;
  int i;

  if ( len <= sizeof(tmp) )
    s = tmp;
  else if ( !(s=malloc(len)) )
    return;

  for(o=s,i=0; i < len; i  )
    *o   = d[i] 1;

  f(s);
//  i = strlen(s);
  if ( s != tmp )
    free(s);
}

Compiling results in:

gcc-11 -O2 -Wall -c t.c
t.c: In function ‘t’:
t.c:20:3: warning: ‘s’ may be used uninitialized [-Wmaybe-uninitialized]
   20 |   f(s);
      |   ^~~~
t.c:4:13: note: by argument 1 of type ‘const char *’ to ‘f’ declared here
    4 | extern void f(const char *s);
      |             ^
t.c:20:3: warning: ‘tmp’ may be used uninitialized [-Wmaybe-uninitialized]
   20 |   f(s);
      |   ^~~~
t.c:4:13: note: by argument 1 of type ‘const char *’ to ‘f’ declared here
    4 | extern void f(const char *s);
      |             ^
t.c:8:8: note: ‘tmp’ declared here
    8 | { char tmp[500];
      |        ^~~

Now there are some observations

  • calling strlen(s) instead of f(s) does not result in a warning. Note that both accept const char* (checked in /usr/include/string.h)
  • If I remove the const in the declaration of f(), the problem vanishes too.
  • Calling using f((const char*)s) doesn't help either.
  • Initializing s as in char *s = NULL does not help either (and I do not want that anyway as it hides correct warnings about not initializing a variable).

I have learned that claiming something is a GCC bug is typically rightfully proven to be wrong. So I first check here what I'm missing.

edit As I cannot put code in the comment, here is code that disproves part of the claim. This compiles fine:

extern void f(const char *__s);

void
t(int len, char *d)
{ char tmp[500];
  char *s=tmp, *o;
  int i;

  for(o=s,i=0; i < len; i  )
    *o   = d[i] 1;

  f(tmp);
}

CodePudding user response:

The compiler is correct, although the English phrasing is imperfect.

Assuming len is always positive, a fix is to insert if (len <= 0) __builtin_unreachable(); in the function. This tells the compiler that len is always positive, which means some data must be written to the memory s points to before f is called.

When the compiler says “‘s’ may be used uninitialized,” it does not mean that the value of s may be used but rather that what it points to may be used, and that pointed-to-memory is not initialized. Note that s is passed to a function that takes a const char *s, suggesting that function will not modify the data s points to and therefore expects it to contain data already. The C standard does not strictly require this; as long as the pointed-to-memory was not defined with const, f is allowed to reconvert the pointer to its original char * and modify the data there, but the implication of the parameter declaration is that it will not.

We can confirm this by changing the body of the function to:

char tmp[500];
f(tmp);

Then the compiler complains “warning: 'tmp' may be used uninitialized.” It is patently obvious that tmp, as passed to the function, is not uninitialized; it will be the address of the array. So the compiler must be warning that it is the contents of the array that may be used uninitialized.

Note that, while the loop starting with for(o=s,i=0; i < len; i ) superficially appears to initialize the data that s points to, it does not if len is zero. And since s is passed to f without len, f has no way of knowing nothing is in s (except by some side channel such as using external objects). So presumably f reads at least some data in s in every call.

Here is a smaller example:

#include <stdlib.h>

extern void f(const char *s);

void t(int len)
{
    char *s = malloc(len);
    f(s);
    free(s);
}

Presumably, len is always positive. To tell GCC this, insert this line in the function:

if (len <= 0) __builtin_unreachable();

That results in no new code generation, but the warning vanishes. (Actually, the generated code gets smaller, partly because the compiler can enter the for loop without first testing i < len.)

CodePudding user response:

It was explained in the gcc-11 release notes

https://www.gnu.org/software/gcc/gcc-11/changes.html

maybe-uninitialized is now default included

https://gcc.gnu.org/onlinedocs/gcc-11.1.0/gcc/Warning-Options.html#index-Wmaybe-uninitialized

compiler emits a warning if it cannot prove the uninitialized paths are not executed at run time.

Even if it is clear to the author, the onus is on the compiler to prove data is initialized, otherwise it sets a warning.

-Wmaybe-uninitialized is triggered with -Wall

The real question becomes- why not just initialize the data?

This is example code but it looks like a vulnerability waiting for an exploit given that tmp can be any arbitrary non-null memory, presumably len is there because there's no guarantee of a null terminator (or embedded nulls may be present), but len isn't passed through to f.


#include <stdlib.h>
#include <string.h>

extern void f(const char *s);

void
t(int len, char *d)
{ char tmp[500] = {0};  //ensure nulls
  char *s = tmp;
  char *o = s;
  int i;

  if ( len <= sizeof(tmp) )
    s = tmp;
  else if ( !(s=malloc(len)) )
    return;
  memset(s len, 0, 1); // ensure a trailing null
  for(i=0; i < len-1; i  ) // leave that null there at least
    *o   = d[i] 1;

  f(s);
//  i = strlen(s);
  if ( s != tmp )
    free(s);
}

gcc compiles it with no warnings

gcc -O2 -Wall -c t.c
  • Related