The Stack
Download review sheet for Stack [PDF] (403KB)
Stack (memory)
Memory space in RAM provided by the OS to store data for functions. The Stack contains one or more Stack Frames.
The program itself is NOT stored in the Stack, but rather in low memory just above the OS. The Program Counter (PC) points to the top of the program, and each instruction of your program is located at an offset from PC. When you use display/i $pc, you are telling GDB to do x/i everytime GDB stops, starting from the address pointed to by PC.
Registers are also NOT stored in the Stack, as registers are located directly in the processor itself. In comparison, the Stack is stored in RAM.
Stack Frame
Each Stack Frame corresponds to one function call. A Stack Frame contains a function's parameters, local variables, and return values. When a function is called, a Stack Frame is pushed onto the stack. When a function returns, its Stack Frame is popped from the Stack. The Stack is named as such because it is a LIFO (last-in-first-out) stack. Note that even if you call the same function twice, these are two separate function calls, resulting in two completely separate Stack Frames.
Say you called functionA(), which called functionB(), which called functionC(). We have to wait until functionC() returns to functionB(), before returning to functionA(), which returns control to the OS. The Link Register (x30) is the special register which holds the address to return to, when a function returns. Every function call has an address to return to upon completion. Without a return address, it's like having someone ask you to do something, then forgetting who asked you once you completed it. Since we only have one set of registers, we also only have one Link Register. To save multiple return addresses, we must store them in memory (the Stack) with stp x29, x30, [sp, -16]!. If a new function is called and the Link Register is overwritten, we can still use ldr x29, x30, [sp], 16 to restore the correct return address, before calling ret to return.
Example 1: Let's say we write a simple program, which doesn't have any function calls or Stack Variables. Even if we don't store and load the x29 and x30 (known collectively as a Frame Record), the program will still run fine. This is because the return address stored in x30 is never altered.
Example 2: Now let's say we write a Hello World program. Main() has a return address stored in x30. If we don't store it first, before calling printf(), the return address in x30 will be overwritten with that of the printf() function. When printf() returns, there is no way to determine the original return address for Main(). Now if we call ret, the program will crash. This is also the reason we store the frame record at the beginning of the function, before any nested functions can be called.
Stack Trace and Stack Overflow (extra to the course)
This is just to help you connect the dots. Often times when your program crashes, you see what's called a Stack Trace. The Stack Trace shows you all the function calls that were made to reach the line of code causing the error. It's like a trail that leads you to the exact location of the error, showing you how your program reached that line. When you manually trace your code, you are doing something similar.
As for Stack Overflow, many of you have been using the website Stack Overflow without knowing what it means. Since a Stack is essentially memory space, there is also a limit to it. When you try to use more space than what's available (ie. accessing address space outside the limits), this is called a Stack Overflow.
Frame Record (x29 and x30, known as Frame Pointer and Link Register respectively)
stp x29, x30, [sp, -16]!
mov x29, sp
The Frame Record consists of x29 and x30, and is what you have been storing at the beginning of all your programs so far. While the Link Register was explained above, you'll notice we are also storing the Frame Pointer (x29). To understand the FP (Frame Pointer), we first look at SP (Stack Pointer).
Stack Pointer: The Stack Pointer (SP) always points to the top of the Stack. This is useful because free space is located above the Stack. If we want to call a function, we need to push a new Stack Frame to the Stack, meaning we need to allocate more memory space. Similarly, when we want to store variables to RAM (rather than to registers), we need to push them onto the Stack, thus again requiring more memory space. To allocate more space, we simply DECREMENT the SP. Remember, Stack Memory grows "backwards", or towards 0. Each x register is 64-bits, and 1 byte = 8 bits, hence each x register is 8 bytes. To store x29 and x30, we need 16 bytes. With the STP instruction, we PRE-INCREMENT the SP, then store the registers. Remember, the SP points to the top of our program's memory. If we simply pushed the registers, we would overwrite part of our program. Pre-incrementing the SP leaves us with 16 bytes of free memory between the old SP and the new SP. We can then store x29 and x30 safely.
Frame Pointer: The Frame Pointer (x29) is needed as an anchor, because SP can be moved many times when saving variables. Addresses are calculated using a base address + offset. We don't want the base address moving all the time, or we will never find our variables or the Frame Record. This is why we copy SP into FP (mov x29, sp) right after we allocate memory in the STP instruction. FP then stays the same for the duration of the function.
Link Register: The Link Register (x30) contains the return address for a function call, as mentioned above.
Allocating stack memory for local variables
Every time we declare a variable, we need more memory to store it in. In C, local variables are local to the block of code they are declared in. These blocks are specified by parenthesis {}. In assembly, we implement local variables as Stack Variables, meaning we save these variables to the Stack.
int main() { int a = 5, b = 7; <-- local to main() if (a < b) { int c; <-- local to if-construct c = 10; ... } ... } |
a_s = 16 b_s = 20 c_s = -4 ... main: stp x29, x30, [sp, -(16+8) & -16]! mov w19, 5 // init a to 5 |
Sample code taken from Prof. Manzara's slides. |
Variables local to the function: In the above code, notice how we only allocate memory for a and b when doing STP. These variables are available for the entire duration of a function, so the memory needs to be allocated at the beginning of the function. Since we allocate the memory BEFORE we move SP into FP (mov x29, sp), the offsets for a and b are positive. We often do this at the same time we store the Frame Record, using the STP instruction. To calculate how much memory we need, we use this formula:
alloc = -(16 + [memory needed for local/stack variables]) & -16
alloc is negative, because we need to decrement the SP (Stack uses high memory, and grows towards lower memory addresses).
We always require 16 bytes for the Frame Record, as explained above.
We then need to calculate the space for any local/stack variables.
Finally, we perform a bitwise AND with -16. The address stored in SP MUST be divisible by 16. Doing & -16 clears the last 4 bits of alloc, ensuring it is divisible by 16.
Example 1: We need 160 bytes for variables, -(16 + 160) & -16 = -176, so we allocate 176 bytes.
Example 2: We need 170 bytes for variables, -(16 + 170) & -16 = -192, so we allocate 192 bytes. The last unused 6 bytes are simply padded.
Variables local to blocks of code: These variables are allocated whenever they are needed. In the above code, c is only used inside the if-construct. We still allocate memory by decrementing the SP, but we don't use STP anymore. Instead we just add or sub the SP directly.
Because these variables are allocated AFTER we have moved SP into FP, the offsets from FP are negative.