EC

  • Edwin Chan
  • CV
  • CPSC 233
  • CPSC 331
  • CPSC 355
  • CPSC 581
  • Origami
  • Random

Tutorial 7 (Nov 14): External variables

[external1.s][1085B] Example 1: Declaring and accessing global variables (example provided by Prof. Manzara)
[external2.s][2163B] Example 2: Declaring and accessing global arrays, and using .bss

External Variables are used in assembly to implement C language global and static local variables. Previously, we used registers or local variables to store data. Let's compare all of them to understand them better.

Registers

Registers are located directly on the CPU, so they can be accessed with very little latency. However, in general, the closer memory is to the CPU, the more expensive it is, so there are only a few registers. As such, we tend to reserve registers for heavily used data/variables.

Local variables

Since registers are so limited, we cannot store large arrays or structs in them. Instead, we allocate stack memory (RAM) for local variables. The stack uses high memory, and each closed subroutine call has its own stack frame. The "local" part represents the scope, as each local variable is only available to the block of code it is allocated in. In A3, we allocated i, j, and array V[] using STP, which meant the variables were allocated at the start of the subroutine. That memory gets deallocated at the end of the subroutine with LDP, meaning those variables are available throughout the subroutine. In A3, we also allocated temp in the middle of the subroutine. Until we allocated temp, it was not available to the subroutine. It was also unavailable to the code after we deallocated temp, so temp was local to only part of the code inside the subroutine.

External variables

Sometimes we need to share information between function calls (static/class variables) or between multiple files (global variables). The common link between these is the persistence of data outside of individual function calls. Variables/data declared in the .data section are available to the entire file, and resemble static global variables. By using the .global directive, we can make these variables global variables, available to other compilation units such as other files or C code.
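For comparison, here is a minimal C sketch of the three kinds of persistent variables described above. The names (total_calls, file_calls, next_id, get_file_calls) are made up for illustration and are not taken from the tutorial's example files.

```c
int total_calls = 0;        /* global: other files can refer to it (label with .global)   */
static int file_calls = 0;  /* visible in this file only (label in .data without .global) */

/* Returns 1, 2, 3, ... on successive calls. */
int next_id(void)
{
    static int id = 0;      /* static local: keeps its value between calls */
    id++;
    total_calls++;
    file_calls++;
    return id;
}

/* Other files cannot name file_calls directly, so expose it through a function. */
int get_file_calls(void)
{
    return file_calls;
}
```

All three variables live for the whole run of the program; only their scope differs.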

                   Local Variables                                 External Variables
Memory allocation  decrement SP             STP                    .data/.bss          .data/.bss
                   (middle of subroutine)   (start of subroutine)  (without .global)   (with .global)
Scope              code block               subroutine             file                program
Lifetime           code block               subroutine             program             program

.text, .data, .bss sections

In A1-A4, we didn't specify the sections for our code. The default section is the .text section, which holds our program instructions as well as read-only data such as constants and string literals. The .data section contains read/write data, and can be programmer-initialized with pseudo-ops such as .word. It can also hold data with no programmer-supplied initial value, by using the .skip pseudo-op. Finally, the .bss section contains memory that is not initialized by the programmer, and generally only uses the .skip pseudo-op. All memory allocated in .bss is zeroed before program execution.
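As a rough C-level illustration of the same split, the sketch below declares one variable of each kind. The section names in the comments describe what gcc typically does on Linux and are an assumption, not something the C language guarantees. All names ending in _c are invented for this sketch.

```c
/* Sketch only: the section names in the comments are what gcc typically
   chooses on Linux; the C standard does not mandate them. */

const char fmt_c[] = "value = %d\n"; /* read-only data (constants, string literals) */
int initialized_c = 30;              /* read/write with an initial value: .data     */
int zeroed_c[5];                     /* no initializer: .bss, zeroed before main()  */

/* Adds up zeroed_c; the result is 0 as long as nothing has written to the array. */
int sum_zeroed_c(void)
{
    int sum = 0;
    for (int i = 0; i < 5; i++) {
        sum += zeroed_c[i];
    }
    return sum;
}
```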

Pseudo-ops

This isn't anything new, since you have already been declaring string literals for printf(), using fmt: .string "myString Format".

         .data
a_m:     .byte     10                                    //  1 byte  = 8 bits = number of bits to encode a single character (ASCII)
b_m:     .hword    20                                    //  2 bytes = 16 bits
c_m:     .word     30                                    //  4 bytes = 32 bits = int
d_m:     .dword    40                                    //  8 bytes = 64 bits
arraya_m:.skip     5*4                                   // 20 bytes (5 * 4) of uninitialized memory
arrayb_m:.word     10, 20, 30, 40, 50                    // 20 bytes (array of 5 words/ints * 4 bytes each)
arrayc_m:.dword    10, 20, 30, 40, 50                    // 40 bytes (array of 5 dwords     * 8 bytes each)
sa_m:    .string   "this string is null-terminated"      // .string automatically adds a 0 byte (.byte 0) to terminate the string
sb_m:    .asciz    "this string is null terminated too"  // .asciz is the same as .string
sc_m:    .ascii    "this string is not null-terminated"  // .ascii does not add a 0 byte to the end, and is not null-terminated
char_m:  .byte     'a'                                   // ASCII characters are encoded using 7 bits, and can be stored in a byte (8 bits)
chars_m: .byte     'h', 'e', 'l', 'l', 'o'               // a string is usually comprised of a char[] array
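To check the byte counts in the comments above, the following C sketch declares rough equivalents using the fixed-width types from <stdint.h>. The _c names are hypothetical, and the mapping from each pseudo-op to a C type is an illustration rather than something the assembler enforces.

```c
#include <stdint.h>

int8_t  a_c = 10;                            /* .byte   1 byte  */
int16_t b_c = 20;                            /* .hword  2 bytes */
int32_t c_c = 30;                            /* .word   4 bytes */
int64_t d_c = 40;                            /* .dword  8 bytes */
int32_t arrayb_c[5] = {10, 20, 30, 40, 50};  /* 5 * 4 = 20 bytes */
int64_t arrayc_c[5] = {10, 20, 30, 40, 50};  /* 5 * 8 = 40 bytes */
char sa_c[] = "hello";                       /* like .string: 5 chars + 0 byte = 6 bytes */
char chars_c[5] = {'h', 'e', 'l', 'l', 'o'}; /* like .ascii: 5 bytes, no 0 byte at the end */
```

Applying sizeof to each variable gives the byte count from its comment. Note that chars_c is not a valid C string, because nothing marks where it ends.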

.global directive

To make a label (and the code or data it names) visible to other files or compilation units, we need to use the .global directive. This is also something you already know how to do. The format is usually:

.global <label_name> // the symbol <label_name> becomes visible to the linker, so other files can refer to it

Remember this?

      .global main
main: stp x29, x30, [sp, -16]!
      mov x29, sp
      ...

We used .global to make the main() subroutine global, so that the startup code linked in by gcc (which lives outside our file) can call it. If you forget to include .global main, you get this error from the linker when building with gcc:

undefined reference to `main'

Since we label all our global variables (using the labels as names), we can do the same thing to make them global:

         .data
         .global a_m                  // global int = 10
a_m:     .word   10
         .global array_m              // global int[5] = [10, 20, 30, 40, 50]
array_m: .word   10, 20, 30, 40, 50
         .global empty_m              // global int[5], not initialized
empty_m: .skip   5*4
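A C file that wants to use these globals would declare them with extern, as in the sketch below. So that the sketch compiles on its own, it also includes stand-in definitions. In the tutorial's setup those definitions would come from the assembly file instead, and the helper sum_array_m is made up for illustration.

```c
/* Declarations: what a C file using the assembly globals would contain. */
extern int a_m;
extern int array_m[5];
extern int empty_m[5];

/* Stand-in definitions so this sketch links by itself. When linking against
   the assembly file above, delete these three lines. */
int a_m = 10;
int array_m[5] = {10, 20, 30, 40, 50};
int empty_m[5];   /* no explicit initializer */

/* Hypothetical helper: adds up the five elements of array_m. */
int sum_array_m(void)
{
    int sum = 0;
    for (int i = 0; i < 5; i++) {
        sum += array_m[i];
    }
    return sum;
}
```

Without the .global lines in the assembly file, the linker would report an undefined reference for each of these names, just as it does for main.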

Accessing (store/load) external variables

Believe it or not, you also know how to do this already. A label names the address where its data begins. Given an address, you can load a number of bytes from it. Accessing external variables combines two things you know:

  1. Getting the address from a label.
  2. Loading from an address, using LDR and offsets.

          .data
index_m:  .word  0
array_m:  .word  10, 20, 30, 40, 50

          .text
          .global main
main:     ...
          adrp x28, index_m             // adrp gets the address of the 4 KB page containing index_m
          add x28, x28, :lo12:index_m   // add the lower 12 bits of index_m's address to get the full address
          ldr w19, [x28]                // using x28 as a pointer, load the value of index
          add w19, w19, 1               // modify index
          str w19, [x28]                // store index back to its address

          adrp x27, array_m             // adrp gets the address of the 4 KB page containing array_m
          add x27, x27, :lo12:array_m   // add the lower 12 bits of array_m's address to get the full address
          ldr w20, [x27, w19, SXTW 2]   // x27 as base address, index*4 as offset
          add w20, w20, 1               // modify value at array[index]
          str w20, [x27, w19, SXTW 2]   // store value back to its position in the array
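For reference, the assembly above does roughly what the following C sketch does. The function name bump is hypothetical, since the assembly runs inline in main. Like the assembly, the sketch does no bounds check on the index.

```c
int index_m = 0;
int array_m[5] = {10, 20, 30, 40, 50};

/* Mirrors the two load/modify/store sequences in the assembly. */
void bump(void)
{
    index_m = index_m + 1;                    /* ldr w19 / add / str w19 */
    array_m[index_m] = array_m[index_m] + 1;  /* ldr w20 / add / str w20 */
}
```

After one call, index_m is 1 and array_m[1] has gone from 20 to 21.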
