I am using std::aligned_alloc() in one of my projects to allocate aligned memory for optimized PCIe read/write.
When I read about aligned_alloc from here, it says:
Defined in header <stdlib.h>
void *aligned_alloc( size_t alignment, size_t size );
Passing a size which is not an integral multiple of alignment or an alignment which is not valid or not supported by the implementation causes the function to fail and return a null pointer (C11, as published, specified undefined behaviour in this case, this was corrected by DR 460). Removal of size restrictions to make it possible to allocate small objects at restrictive alignment boundaries (similar to alignas) has been proposed by n2072.
From what I understood, the only remaining restriction is that the alignment parameter must be a valid alignment value (and a power of two). Fine. To get a valid alignment value, we can use alignof(max_align_t).
[My system: 128 GB RAM, 2 CPUs (AMD EPYC 7313 16-Core Processor). It is a server machine running the latest CentOS 7.]
I now have a couple of doubts here:
In my system, for almost every combination of 'alignment value' and 'size', aligned_alloc() returns success. (Unless the alignment is some huge value). How is this possible? Is it implementation specific?
My code snippet:
```
#include <stdlib.h>
#include <iostream>
int main() {
    void* a = aligned_alloc(64, 524288000);
    if (a == nullptr)
        std::cout << "Failed" << std::endl;
    else
        std::cout << "Success" << std::endl;
    free(a);  // free(nullptr) is a no-op
}
```
Here is what values I tried for aligned_alloc() and their results:
aligned_alloc(64, 524288000) - Success
aligned_alloc(4096, 524288000) - Success
aligned_alloc(64, 331) - Success
aligned_alloc(21312323, 889998) - Success
aligned_alloc(1, 331) - Success
aligned_alloc(0, 21) - Success
aligned_alloc(21312314341, 331) - Success
aligned_alloc(21312312243413, 331) - Failed
Please do comment if any more info is needed to clear the question. Thanks
CodePudding user response:
glibc has this piece of code: https://github.com/lattera/glibc/blob/master/malloc/malloc.c#L3278
```
/* Make sure alignment is power of 2.  */
if (!powerof2 (alignment))
  {
    size_t a = MALLOC_ALIGNMENT * 2;
    while (a < alignment)
      a <<= 1;
    alignment = a;
  }
```
How is this possible?
(Well, the fact that something is in a specification doesn't restrict reality.) There is simply code that makes it possible. If you want to know exactly what happens, inspect the source code: glibc is open source.
CentOS 7 "latest" is quite old: I see glibc 2.17, which is from 2012 ( https://centos.pkgs.org/7/centos-x86_64/glibc-2.17-317.el7.x86_64.rpm.html and https://sourceware.org/glibc/wiki/Glibc%20Timeline ). DR 460 is from 2014. That DR did not exist yet when that glibc was released, so we can say that glibc followed the C11 standard as published, where the behavior is undefined.
Is it implementation specific?
"Implementation specific" is a... specific term used by standards to specify behavior (the C standard's wording is "implementation-defined"). In C11 as published the behavior is undefined. In C17 the behavior is that aligned_alloc should fail with an invalid alignment. In real life, everything is implementation specific, as glibc comes with its own implementation of aligned_alloc.
If you are wondering not about alignment but about why you can request a size greater than your available RAM, then welcome to virtual memory. See "Malloc allocates memory more than RAM".
CodePudding user response:
Looks like you found a bug. The libc doesn't fail as specified by the standard but just gives you memory instead. Personally I don't see anything wrong with 331 bytes aligned to a 64-byte boundary. It's just not something C/C++ ever has, because a struct with 64-byte alignment always has padding at the end to a multiple of 64.
None of your allocations use a lot of RAM, half a gig at most. So you are not running out of memory.
As for why insanely huge alignment works?
If the code isn't stupid it will use mmap() with a fixed address to allocate memory to the closest page. So no matter the alignment you should never have more than 2 * 4095 bytes wasted (assuming 4k pages, could be 16k or 64k too).
And as KamilCuk pointed out: https://github.com/lattera/glibc/blob/master/malloc/malloc.c#L3278
```
/* Make sure alignment is power of 2.  */
if (!powerof2 (alignment))
  {
    size_t a = MALLOC_ALIGNMENT * 2;
    while (a < alignment)
      a <<= 1;
    alignment = a;
  }
```
Seems like glibc rounds the alignment up to the next power of 2. So all your huge odd numbers become multiples of the page size and waste even less. How that fulfills the standard, I don't know.
As for your last case: the address space of the architecture is only so big. You can see that in /proc/cpuinfo under Linux:
address sizes : 43 bits physical, 48 bits virtual
Relevant here is the 48 bits virtual. That spans -128TB to 128TB, or 0 to 128TB plus (16EB - 128TB) to 16EB, depending on how you view the address space (signed or unsigned addresses). Either way user space has a maximum of 128TB to work with. Your last alignment is ~19TB, or 32TB after rounding up. That's plenty small enough to work with, but it looks like glibc isn't smart enough to mmap that properly.