Programming from the beginning Part IV: Memory

Part III

Don’t “Forget” About Memory

All memory storage requires management. That management takes the form of allocation and deallocation of memory, tracking and optimising those allocations.

Memory fragmentation poses a real problem that requires addressing in all languages. The closer to the hardware (lower the level of the language) the more it becomes the programmer’s problem to resolve.

Why memory becomes fragmented; because there are at least two factors at play the scope and the size; i.e. how long the allocation is going to stand and how big the allocation is.

If we think about strings, remember are immutable, so any action creates a brand new string. Each new string requiring an additional memory allocation?

When we deallocate memory it leaves a hole. If we have a mix of long scope and short scope allocations memory can end up looking like swiss cheese. This eventually results in a situation where the memory required by an allocation request is larger than the memory available in one continuous block. This is how out of memory errors occur even when memory isn’t full.

In the above block, the largest contiguous allocation we could manage would be 3 blocks. Despite there being 50 blocks free.

Pointers, Structures and Objects

Pointers help us solve some of these problems, especially if we can repoint them. This allows us to manage the fragmentation of free memory by moving the allocations and updating pointers. Doing this to the above example would give us the below.

They also help with building more complex collections of data, including lists and queues. (described below)

Represented as a memory address holding a specific type of data; i.e. they “point” to another location where the real data resides. Pointers exist because it is more efficient to pass a reference than copying lots of values around.

Structures represent a collection of one or more fixed-length data types stored in a contiguous memory block. By defining the content of the structure, in turn, fixes the start and end of each of its elements. However, the creates a requirement for a single contiguous block of memory, in order to store the structure.

Objects are collections of pointers to data or executable code. Each element in the object is a pointer to a method (executable code) or a specific data type. Which in turn could be another object or a structure (as above). To store an object we only need enough contiguous memory to store the pointers for the high-level elements. This means the elements of that object may be “littered” across several blocks of memory.

Stack and Heap

Another way to mitigate short vs long scope allocations causing problems is to separate the scopes and respective memory locations. Putting short-lived data in one place and longer-lived data in another. This reduces the swiss cheese effect described above. The stack and the heap achieve this separation, to some degree, placing longer-term allocations on the heap leaving the stack for short-term allocations including the executing code. Destruction of allocations in this short-lived space occurs once the code using them completes its execution. As a loose “rule of thumb”, structures reside on the stack. Whereas “objects” i.e., things that use more complex associations between elements such as pointers reside on the heap. However, some people give too much weight to this premise, though true to a point, it’s really not as clear cut as it seems.

In some high-level languages, that have an abstraction, such as C# and Java which have the CLR and JVM respectively. The management of low-level memory and allocations, managed away by the abstraction, rarely crosses the programmer’s mind. Yes, you can still blow the stack or run out of memory. However, you have to be quite sloppy to do it. The garbage collection in these languages takes care of most of the heavy lifting without you even pausing for contemplation.

Queues and Stacks

Queues and stacks represent ordered collections of objects or structures, and they both use pointers in a similar way.

A queue (Last In Last Out “LILO“) has a head and a tail, and each element has two pointers one to the current item and one to the next item. When using a queue you would read (DEQUEUE) from the head and add (ENQUEUE) to the tail. When an item has dequeued the pointer for the head moves to the next item. Adding an item points the current tail to the new item (as the next item) and the tail moves along to the new item.

A stack (First In First Out “FIFO“) works in much the same way except items are added (PUSHED) and removed (POPED) from the head.