I recently found a piece of code online which looked a little like this:
#include <stdio.h>
#include <string.h>

int main() {
    float m[10];
    memset(m, 0, 20);
}
I also saw a variant of the call like this, which I believe to be correct:
memset(m, 0, sizeof m);
When trying to print out all the values of the first example using this snippet:
for (int i = 0; i < 20; i++) {
    printf("%f, \n", m[i]);
}
It produces an output like this:
0.000000,
0.000000,
0.000000,
0.000000,
0.000000,
0.000000,
0.000000,
0.000000,
0.000000,
0.000000,
-0.000000,
-4587372491414098149376.000000,
-0.000013,
0.000000,
0.000000,
0.000000,
0.000000,
0.000000,
0.000000,
0.000000
The values change on recompilation.
Now I have a few questions:
- Why can memset write to a float-array more than what was allocated and why can't you do that with a char-array?
- Why is it so inconsistent?
- Why does changing the second value of memset to 1, for example, not change the output?
CodePudding user response:
Why can memset write to a float-array more than what was allocated and why can't you do that with a char-array?
The call memset(m, 0, 20);, as the question originally showed, does not write more than was allocated. float is commonly four bytes in C implementations, so float m[10]; allocates 40 bytes, and memset(m, 0, 20); writes only 20 of them.
In the new code, memset(m, 0, sizeof m); writes exactly as many bytes as m contains, no fewer and no more.
If memset were asked to write more, nothing would stop the attempt: C implementations generally do not perform safety checks on such operations, and the C standard does not require them to.
Why is it so inconsistent?
There is nothing inconsistent. memset wrote zeros to the first 20 bytes of m (the first five elements when float is four bytes), and all-zero bytes are the encoding for floating-point zero in the format commonly used for float (IEEE-754 binary32, also called “single precision”).
The bytes after that were not written, so printing them uses uninitialized data. The C standard says the values of uninitialized objects are not determined. A common result is the program uses whatever happened to be in the memory already. That may be zeros, or it may be something else.
However, with the loop for (int i = 0; i < 20; i++), you go beyond the 10 elements that are in m. Then the behavior of accessing m[i] is not defined by the C standard. As above, a common result is the program accesses the calculated memory and uses whatever happens to be there. However, a variety of other behaviors are also possible, including crashing due to an attempt to access unmapped memory or the compiler replacing the undefined code with alternate code during optimization.
Why does changing the second value of memset not change the output?
It will, depending on what you change it to. Some values for the byte may result in float values that are so small they are still printed as “0.000000”. For example, if bytes are set to 1, making the 32 bits 0x01010101 in each float, they represent a float value of 2.36942782761723955384693006253917004604239556833255136345174597127722672385008451101384707726538181×10^−38.
If you use 64 for the second argument to memset, the bits will be set to 0x40404040, which encodes the value 3.0039215087890625, so “3.003922” will be printed.