The Heap
This "heap" does not refer to the heap ADT (a priority queue). Our heap tracks its free blocks with a linked list.
A heap is a data structure that manages memory so that it can be allocated and deallocated at any time.
Unlike the stack, which stores static memory (whose size we can determine at compile time) and is much faster, the heap is for dynamic memory (whose size we can only determine at runtime). Because of this, a block of heap memory may be anywhere in the allocated heap area. Hence, while allocating new blocks on the heap is easy, the real challenge lies in re-using deallocated memory.
For CS241E, we give each block an 8-byte header: the first word holds the block's size, and the second holds a pointer to the next free block.
Book: https://en.wikipedia.org/wiki/The_C_Programming_Language
We use the "next" pointer to create a linked list of free blocks.
We will use these functions to manipulate our heap:
def size(block: Int) = deref(block)                            // first word: block size in bytes
def next(block: Int) = deref(block + 4)                        // second word: address of next free block
def setSize(block: Int, size: Int) = assignToAddr(block, size)
def setNext(block: Int, next: Int) = assignToAddr(block + 4, next)
Initialization
def init(heapStart: Int, heapSize: Int) = {
  val block = heapStart + 8
  setSize(heapStart, 8)                  // dummy block at the start of the heap
  setNext(heapStart, block)              // links to the first real free block
  setSize(block, heapSize - 8)           // the rest of the heap is one big free block
  setNext(block, heapStart + heapSize)   // sentinel: the address just past the heap
}
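The helpers and init above can be sketched as runnable Python, simulating memory as a list of 4-byte words. The 64-byte heap size, heapStart = 0, and the snake_case names are assumptions for illustration; deref/assign stand in for real loads and stores.

```python
# Simulated word-addressed memory (hypothetical 64-byte heap at address 0)
HEAP_START, HEAP_SIZE = 0, 64
mem = [0] * (HEAP_SIZE // 4)

def deref(addr): return mem[addr // 4]
def assign(addr, val): mem[addr // 4] = val

def size(block): return deref(block)        # first word: block size in bytes
def next_(block): return deref(block + 4)   # second word: next free block
def set_size(block, s): assign(block, s)
def set_next(block, n): assign(block + 4, n)

def init():
    block = HEAP_START + 8
    set_size(HEAP_START, 8)                  # dummy block of size 8
    set_next(HEAP_START, block)              # dummy links to first free block
    set_size(block, HEAP_SIZE - 8)           # rest of heap is one free block
    set_next(block, HEAP_START + HEAP_SIZE)  # sentinel: one past heap end

init()
print(size(8), next_(8))  # -> 56 64: one 56-byte free block, next = sentinel
```

After init, the free list is: dummy at 0 -> free block at 8 -> sentinel at 64.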
Heap with Explicit Deallocation
Allocation
We want to allocate a block of wanted bytes of usable space.
To allocate:
- Keep following linked list until we find a large enough block
- If the block is too big, just split it into 2
- Unlink block from linked list
- Return its address
def malloc(wanted: Int): Int = {
  def find(previous: Int): Int = {
    val current = next(previous)
    // If the current free block is too small, try the next one
    if (size(current) < wanted + 8) {
      find(current)
    } else {
      // If the block is at least 8 bytes larger than we need, split it in two
      if (size(current) >= wanted + 16) {
        val newBlock = current + wanted + 8
        setSize(newBlock, size(current) - (wanted + 8))
        setNext(newBlock, next(current))
        setSize(current, wanted + 8)
        setNext(previous, newBlock)
      } else {
        // Otherwise, remove current from the free list entirely
        setNext(previous, next(current))
      }
      current
    }
  }
  find(heapStart)
}
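The first-fit search with splitting can be exercised on the simulated heap. This is a self-contained Python sketch under the same assumptions as before (hypothetical 64-byte heap at address 0; snake_case stand-ins for the Scala names):

```python
HEAP_START, HEAP_SIZE = 0, 64
mem = [0] * (HEAP_SIZE // 4)
def deref(a): return mem[a // 4]
def assign(a, v): mem[a // 4] = v
def size(b): return deref(b)
def next_(b): return deref(b + 4)
def set_size(b, s): assign(b, s)
def set_next(b, n): assign(b + 4, n)

def init():
    block = HEAP_START + 8
    set_size(HEAP_START, 8); set_next(HEAP_START, block)
    set_size(block, HEAP_SIZE - 8); set_next(block, HEAP_START + HEAP_SIZE)

def malloc(wanted):
    def find(previous):
        current = next_(previous)
        if size(current) < wanted + 8:      # too small: try the next free block
            return find(current)
        if size(current) >= wanted + 16:    # big enough to split off a remainder
            new_block = current + wanted + 8
            set_size(new_block, size(current) - (wanted + 8))
            set_next(new_block, next_(current))
            set_size(current, wanted + 8)
            set_next(previous, new_block)
        else:                               # close fit: unlink the whole block
            set_next(previous, next_(current))
        return current
    return find(HEAP_START)

init()
a = malloc(8)   # splits the 56-byte free block; a's block is 8 header + 8 data
b = malloc(8)
print(a, b)     # -> 8 24
```

Each call carves 16 bytes (8 of header, 8 usable) off the front of the free block, so successive allocations land at 8 and 24.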
After malloc(8) with heapStart = 100: (diagram omitted)
Deallocation
To deallocate the block at address toFree:
- Traverse the linked list of free blocks to find the last free block before toFree
- Coalesce (merge) the block with the preceding and/or following blocks if they are free
- Insert the block into the free list
def free(toFree: Int) = {
  def find(previous: Int): Unit = {
    val current = next(previous)
    if (current < toFree) {
      find(current)
    } else {
      val shouldCoalescePrev =
        previous + size(previous) == toFree && previous > heapStart
      val shouldCoalesceCur =
        toFree + size(toFree) == current && current < heapStart + heapSize
      if (!shouldCoalescePrev && !shouldCoalesceCur) { // Just insert into linked list
        setNext(toFree, current)
        setNext(previous, toFree)
      } else if (shouldCoalescePrev && !shouldCoalesceCur) { // Merge with previous
        setSize(previous, size(previous) + size(toFree))
      } else if (!shouldCoalescePrev && shouldCoalesceCur) { // Merge with next
        setSize(toFree, size(toFree) + size(current))
        setNext(toFree, next(current))
        setNext(previous, toFree)
      } else { // Merge with previous and next
        setSize(previous, size(previous) + size(toFree) + size(current))
        setNext(previous, next(current))
      }
    }
  }
  find(heapStart)
}
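The coalescing cases can be checked end to end on the same simulated heap (self-contained Python sketch, same hypothetical 64-byte heap and snake_case names as before): freeing both allocated blocks should merge everything back into one big free block.

```python
HEAP_START, HEAP_SIZE = 0, 64
mem = [0] * (HEAP_SIZE // 4)
def deref(a): return mem[a // 4]
def assign(a, v): mem[a // 4] = v
def size(b): return deref(b)
def next_(b): return deref(b + 4)
def set_size(b, s): assign(b, s)
def set_next(b, n): assign(b + 4, n)

def init():
    block = HEAP_START + 8
    set_size(HEAP_START, 8); set_next(HEAP_START, block)
    set_size(block, HEAP_SIZE - 8); set_next(block, HEAP_START + HEAP_SIZE)

def malloc(wanted):
    def find(previous):
        current = next_(previous)
        if size(current) < wanted + 8:
            return find(current)
        if size(current) >= wanted + 16:
            new_block = current + wanted + 8
            set_size(new_block, size(current) - (wanted + 8))
            set_next(new_block, next_(current))
            set_size(current, wanted + 8)
            set_next(previous, new_block)
        else:
            set_next(previous, next_(current))
        return current
    return find(HEAP_START)

def free(to_free):
    def find(previous):
        current = next_(previous)
        if current < to_free:
            return find(current)
        coalesce_prev = (previous + size(previous) == to_free
                         and previous > HEAP_START)      # previous is adjacent, not dummy
        coalesce_cur = (to_free + size(to_free) == current
                        and current < HEAP_START + HEAP_SIZE)  # current adjacent, not sentinel
        if not coalesce_prev and not coalesce_cur:       # just insert into the list
            set_next(to_free, current); set_next(previous, to_free)
        elif coalesce_prev and not coalesce_cur:         # absorb into previous
            set_size(previous, size(previous) + size(to_free))
        elif not coalesce_prev and coalesce_cur:         # absorb current into to_free
            set_size(to_free, size(to_free) + size(current))
            set_next(to_free, next_(current))
            set_next(previous, to_free)
        else:                                            # merge all three
            set_size(previous, size(previous) + size(to_free) + size(current))
            set_next(previous, next_(current))
    find(HEAP_START)

init()
a, b = malloc(8), malloc(8)
free(a); free(b)           # coalescing restores one big free block
print(size(8), next_(8))   # -> 56 64: same as right after init
```

Freeing a exercises the insert-only case; freeing b merges with both neighbours, so the heap ends up exactly as it started.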
Note: current "overshoots" toFree by one free block.
Fragmentation and Compaction
Fragmented Heap
A heap is fragmented when free space is split into many small, non-adjacent blocks.
Problem: the heap could be mostly free, yet no single free block may be large enough to satisfy even a modest allocation.
Compaction
Compaction de-fragments the heap by "pushing" all used blocks as far "up" as possible
- Copy blocks in use to beginning of heap
- Update pointers to new addresses
- To update pointers we must be able to identify them, which requires sound types.
In C and C++, we can't move blocks to de-fragment the heap, because the types are not sound (pointer arithmetic is allowed, and pointers can be stored as integers).
In LACS or Go, a variable is a pointer if and only if it holds the address of a chunk.
For our chunks, we store the chunk size, the number of pointers, the pointers themselves, and then the non-pointer words.
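The chunk layout just described can be illustrated with a tiny sketch. Word offsets and the helper name chunk_pointers are hypothetical; the point is that the header makes every pointer word identifiable.

```python
# Chunk layout (in words): [size, #pointers, pointers..., non-pointer words...]
def chunk_pointers(mem, chunk):
    n_ptrs = mem[chunk + 1]                  # second word: number of pointers
    return mem[chunk + 2 : chunk + 2 + n_ptrs]

# Chunk at word 0: 5 words total, 2 pointers (to the chunks at words 5 and 9),
# then one non-pointer word of data (42).
mem = [5, 2, 5, 9, 42,
       3, 0, 7,          # chunk at word 5: no pointers, data 7
       3, 0, 99]         # chunk at word 9: no pointers, data 99
print(chunk_pointers(mem, 0))  # -> [5, 9]
```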
With compacting, we don't have to worry about trying to find new space. We can just keep adding blocks to the next free area until we reach the end (like with the stack, using a heap pointer), then compact the heap. Additionally, we don't need a linked list of free blocks. Instead, we can just mark free blocks as "free".
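The bump-pointer scheme described above can be sketched as follows. The heap bounds are hypothetical, and raising MemoryError stands in for "time to compact"; a real allocator would compact and retry rather than fail.

```python
# Bump-pointer allocation: a heap pointer just advances through free space.
HEAP_START, HEAP_SIZE = 0, 64
heap_ptr = HEAP_START

def bump_alloc(nbytes):
    global heap_ptr
    if heap_ptr + nbytes > HEAP_START + HEAP_SIZE:
        raise MemoryError("heap full: time to compact")  # stand-in for compaction
    addr = heap_ptr
    heap_ptr += nbytes     # allocation is just advancing the pointer
    return addr

print(bump_alloc(16), bump_alloc(16))  # -> 0 16
```

There is no search and no free list: each allocation is a bounds check plus an addition, which is why the average cost can beat an explicit free-list allocator.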
Compaction takes a long time, so in a real-time system, for example, it can cause unexpected latency spikes. For most systems, however, this is acceptable, and the average cost per allocation/deallocation is usually lower with compaction.
Automatic Garbage Collection
A garbage collector takes compaction a step further: instead of requiring the program to explicitly deallocate blocks, it automatically determines which blocks are still in use, reclaims the rest, and compacts.
How do we tell whether a block is still needed?
A block is live if it is reachable from the program's variables by following pointers; unreachable blocks are garbage and can be reclaimed.
Algorithm: Cheney's Copying Garbage Collector.
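Cheney's algorithm can be sketched on the chunk layout above (size in words, number of pointers, pointer words, then data). This is a simplified model, assuming word-indexed semispaces held in Python lists and a dict for forwarding addresses, not the course's actual implementation.

```python
# Cheney's copying collector: copy the root-reachable chunks into to-space,
# then scan the copied chunks breadth-first, copying and fixing up pointers.
def cheney(from_space, roots):
    to_space = [0] * len(from_space)
    free = 0                      # next free word in to-space
    forwarded = {}                # from-space addr -> to-space addr

    def copy(addr):
        nonlocal free
        if addr in forwarded:     # already moved: reuse the forwarding address
            return forwarded[addr]
        sz = from_space[addr]
        to_space[free:free + sz] = from_space[addr:addr + sz]
        forwarded[addr] = free
        free += sz
        return forwarded[addr]

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < free:            # breadth-first scan of copied chunks
        sz, n_ptrs = to_space[scan], to_space[scan + 1]
        for i in range(scan + 2, scan + 2 + n_ptrs):
            to_space[i] = copy(to_space[i])   # copy referent, fix the pointer
        scan += sz
    return to_space[:free], new_roots

# Chunk 0 points to chunk 6; the chunk at word 3 is unreachable garbage.
from_space = [3, 1, 6,   3, 0, 0,   3, 0, 99]
heap, roots = cheney(from_space, [0])
print(heap, roots)  # -> [3, 1, 3, 3, 0, 99] [0]
```

The live chunks end up packed at the start of to-space, so collection and compaction happen in one pass, and the garbage chunk is never touched at all.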