My background knowledge:
To my understanding, for memory to be allocated and used properly it must be contiguous in the virtual address space, but it doesn't have to be actually contiguous in physical memory or the physical address space.
This would kind of suggest that virtual-to-physical address translation works as a series of mappings, where any free memory blocks in physical memory get assigned to corresponding areas in the virtual address space.
Setup to the question:
This answer, in response to a question about freeing memory in C, refers to memory fragmentation: a scenario in which (in this specific case) repeatedly allocating and freeing memory can leave enough OS-allocated memory for future use by the process, but that memory can't be used because no sufficiently large contiguous block remains on the allocator's free list.
If we could keep plucking memory blocks out of the OS-allocated memory that are not in use even if they are dispersed (not contiguous), wouldn't that fix the problem of memory fragmentation? To me, that seems exactly the same way that the physical to virtual memory address translations work, where non-contiguous blocks are utilized as if they were contiguous.
So, to repeat my question, why does memory have to be contiguous?
CodePudding user response:
Two issues here:
1. It is necessary for each object to occupy a contiguous region in virtual memory, so that indexing and pointer arithmetic can be done efficiently. If you have an array `int arr[5000];`, then a statement like `arr[i] = 0;` boils down to simple arithmetic: the value of `i` is multiplied by 4 (or whatever `sizeof(int)` may be) and then added to the base address of `arr`. Addition is very fast for a CPU. If the elements of `arr` weren't located at consecutive virtual addresses, then `arr[i]` would require some more elaborate computation, and your program would be orders of magnitude slower. Likewise, with contiguous arrays, arithmetic on a pointer `ptr` really is just addition.

2. Virtual memory has granularity. Every mapping of a virtual to a physical address requires some metadata to be kept somewhere in memory (say 8 bytes per mapping), and when this mapping is used, it is cached by the CPU in a translation lookaside buffer (TLB), which requires some silicon on the chip. If every byte of memory could be mapped independently, your mapping tables would require 8 times more memory than the program itself, and you'd need an immense number of TLB entries to keep them cached.
So virtual memory is instead done in units of pages, typically 4KB or 16KB or so. A page is a contiguous 4K region of virtual memory that is mapped to a contiguous 4K region of physical memory. Thus you only need one entry in your mapping tables (page tables) and TLB for the entire page, and addresses within a page are mapped one-to-one (the low bits of the virtual address are used directly as the low bits of the physical address).
But this means that fragmentation by sub-page amounts can't be fixed with virtual memory. As in Steve Summit's example, suppose you allocated 1000 objects of 1KB each, which were stored consecutively in virtual memory. Now you free all the odd-numbered objects. Nominally there is now 500 KB of memory available. But if you now want to allocate a new object of size 2KB, none of that 500 KB is usable, since there is no contiguous block of size 2KB in your original 1000 KB region. The available 1KB blocks can't be remapped to coalesce them into a larger block, because they can't be separated from the even-numbered objects with which they share pages. And the even-numbered objects can't be moved around in virtual memory, because there may be pointers to those objects elsewhere in the program, and there is no good way to update them all. (Implementations that do garbage collection might be able to do so, but most C/C++ implementations do not, because that comes with substantial costs of its own.)
CodePudding user response:
So, to repeat my question, why does memory have to be contiguous?
It doesn't have to be contiguous.
To avoid fragmentation within a block of OS-allocated memory (a page), you need to ensure that each allocation from the heap (e.g. using `malloc()`) is at least as large as a block of OS-allocated memory. This gives 2 possible options:
a) change the hardware (and OS/software) so that a block of OS-allocated memory is much smaller (e.g. maybe the same size as a cache line, or maybe 64 bytes instead of 4 KiB). This would significantly increase the overhead of managing virtual memory.
b) change the minimum allocation size of the heap so that it's much larger. Typically (for modern systems), if you `malloc(1);` it rounds the size up to 8 or 16 bytes for alignment and calls the extra bytes "padding". In the same way, it could round the size up to the size of a block of OS-allocated memory instead and call that "padding" (e.g. `malloc(1);` might have 4095 bytes of padding and cost 4 KiB of memory). This is worse than fragmentation, because padding can't be allocated (e.g. if you did `malloc(1); malloc(1);` those allocations couldn't use different parts of the same block of OS-allocated memory).
However, this only really applies to small allocations. If you use `malloc()` to allocate a large amount of memory (e.g. maybe 1234 KiB for an array), most modern memory managers will just use whole blocks of OS-allocated memory, and won't have a reason to care about fragmentation for those large blocks.
In other words: for smaller allocations you could solve fragmentation in the way you've suggested, but it would be worse than allowing some fragmentation; and for larger allocations you can solve fragmentation in the way you've suggested, and most modern memory managers already do exactly that.