memset of allocated memory after std::vector::reserve

Time:12-04

There is a closely related question about this topic already here, but the question was highly contested and the related discussion was a bit confusing to me. So is the following thinking correct?

My situation is the following: I have a data structure that uses chunks to store data. I want to preallocate a large number of chunks using something like std::vector<ChunkT> myChunks; myChunks.reserve(1000000); and then fetch a new chunk without allocation whenever needed using ChunkT* newChunk = &myChunks.emplace_back();. I want each new chunk to be zero-initialized, but I would rather do this with a single memset directly after reserving the memory than initialize one chunk at a time as I fetch it. Provided that ChunkT is POD-like, e.g. struct {size_t keys[512]; size_t values[512];};, I was not sure about the following:

  1. is it safe to 0-initialize the memory using memset after reserve?
  2. is it guaranteed that I still have 0-initialized memory in the example of ChunkT being struct {size_t keys[512]; size_t values[512];}; after fetching my chunk with ChunkT* newChunk = &myChunks.emplace_back()?

Regarding 1.), a user in the linked question argued that it would be unsafe because the standard does not guarantee what the std::vector implementation might do with the reserved memory (e.g. use it for internal bookkeeping). wpzdm argued that nothing surprising could be going on with the reserved memory. Having read the related discussion, I now think that accessing the objects in memory that is only reserved is safe, since their lifetime has already started (because they are POD and allocated by the vector's allocator), so they are perfectly valid objects. However, their content is not guaranteed at any point until the memory becomes part of the "valid" range, e.g. through emplace_back, because the standard does not say that the vector implementation must not modify the reserved range (so the answer to 2.) is no?). But the vector implementation also cannot rely on the content of those reserved objects, since we are allowed to access and change them as we see fit. So neither "internal bookkeeping" nor setting debug flags to detect out-of-bounds accesses outside the "valid" but inside the reserved range, nor anything similar, would be strictly standard-conforming, because it could cause disallowed side effects. So only a malicious or non-conforming compiler would modify the reserved range?

If I change ChunkT to struct {size_t keys[512]={0}; size_t values[512]={0};}; then the content of the object after emplace_back is guaranteed, but this time because initialization takes place through construction. Also, it would now be undefined behaviour to access memory that is only reserved, because the lifetimes of the objects have not yet begun.
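To make the difference between the two ChunkT variants concrete, here is a minimal sketch. The names PlainChunk and InitChunk are hypothetical labels for the two struct definitions from the question. It only shows that adding default member initializers makes the default constructor non-trivial and that emplace_back then yields zeros through construction; it does not settle the lifetime question raised above.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical names for the two variants discussed in the question.
struct PlainChunk {
    std::size_t keys[512];
    std::size_t values[512];
};

struct InitChunk {
    std::size_t keys[512] = {0};
    std::size_t values[512] = {0};
};

// Without member initializers the default constructor is trivial;
// with them it is not, so construction has to run code.
static_assert(std::is_trivially_default_constructible<PlainChunk>::value,
              "PlainChunk has a trivial default constructor");
static_assert(!std::is_trivially_default_constructible<InitChunk>::value,
              "InitChunk has a non-trivial default constructor");

// Returns true if a freshly emplaced InitChunk holds only zeros.
bool init_chunk_is_zero_after_emplace() {
    std::vector<InitChunk> chunks;
    chunks.reserve(4);
    chunks.emplace_back();            // since C++17 this also returns the reference
    const InitChunk& c = chunks.back();
    for (std::size_t i = 0; i < 512; ++i) {
        if (c.keys[i] != 0 || c.values[i] != 0) return false;
    }
    return true;
}
```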

CodePudding user response:

  2. is it guaranteed that I still have 0-initialized memory in the example of ChunkT being struct {size_t keys[512]; size_t values[512];}; after fetching my chunk with ChunkT* newChunk = &myChunks.emplace_back()?

emplace_back() called with no arguments value-initialises the object, which for this ChunkT means zero-initialisation, so the zeros are guaranteed regardless of what the memory contained before the object was created.
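A small sketch of this claim, assuming a hypothetical DirtyAllocator that deliberately fills every fresh allocation with 0xFF bytes. It does not define its own construct, so std::allocator_traits falls back to the default value-initialising construction; the emplaced chunk should therefore read as all zeros even though the reserved memory did not.

```cpp
#include <cstddef>
#include <cstring>
#include <new>
#include <vector>

struct ChunkT {
    std::size_t keys[512];
    std::size_t values[512];
};

// Hypothetical allocator for illustration: hands out memory filled with 0xFF
// so that any zeros seen later must come from object construction.
template <class T>
struct DirtyAllocator {
    using value_type = T;

    DirtyAllocator() = default;
    template <class U>
    DirtyAllocator(const DirtyAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        void* p = ::operator new(n * sizeof(T));
        std::memset(p, 0xFF, n * sizeof(T));  // raw bytes, no objects touched
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const DirtyAllocator<T>&, const DirtyAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const DirtyAllocator<T>&, const DirtyAllocator<U>&) { return false; }

// Returns true if a chunk created by emplace_back() contains only zeros.
bool emplaced_chunk_is_zero() {
    std::vector<ChunkT, DirtyAllocator<ChunkT>> chunks;
    chunks.reserve(4);                // memory is now 0xFF-filled
    chunks.emplace_back();            // value-initialises the new element
    const ChunkT& c = chunks.back();
    for (std::size_t i = 0; i < 512; ++i) {
        if (c.keys[i] != 0 || c.values[i] != 0) return false;
    }
    return true;
}
```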

CodePudding user response:

  1. is it safe to 0-initialize the memory using memset after reserve?

It may appear to work, but you had better not do it. Elements beyond size() do not exist yet, and accessing a nonexistent element through [] is UB.
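If the goal is to zero everything once up front, one alternative that stays inside the vector's valid range is to construct the vector with a size, which value-initialises every element, and then hand out elements by index. The ChunkPool class below is a hypothetical sketch of that idea, not something from the question. It trades emplace_back for a manual counter.

```cpp
#include <cstddef>
#include <vector>

struct ChunkT {
    std::size_t keys[512];
    std::size_t values[512];
};

// Hypothetical pool: all chunks exist (and are zeroed) from the start,
// so no element outside [0, size()) is ever touched.
class ChunkPool {
public:
    explicit ChunkPool(std::size_t capacity)
        : storage_(capacity),  // value-initialises 'capacity' chunks to zero
          next_(0) {}

    // Returns the next unused chunk, or nullptr when the pool is exhausted.
    ChunkT* fetch() {
        return next_ < storage_.size() ? &storage_[next_++] : nullptr;
    }

    std::size_t used() const { return next_; }

private:
    std::vector<ChunkT> storage_;
    std::size_t next_;
};
```

Because the storage never grows, pointers returned by fetch() stay valid for the lifetime of the pool.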

  2. is it guaranteed that I still have 0-initialized memory in the example of ChunkT being struct {size_t keys[512]; size_t values[512];}; after fetching my chunk with ChunkT* newChunk = &myChunks.emplace_back()?

Yes. In your situation, what emplace_back() does is construct a ChunkT via placement new with an empty argument list, which value-initializes it, and a POD class is zero-initialized in that case. ref: POD class initialized with placement new default initialized?

So you don't have to worry about using memset to zero the allocated memory. Please correct me if I am wrong.
