Assembly

Machine Language

Definition

Machine language is the set of processor instructions encoded in binary.

Assembly Language

Info

Assembly language is a human-readable form of machine language that uses mnemonics (e.g., ADD, LDR) instead of binary codes.

Each processor family (CPU architecture) has its own assembly language, for example ARM V7 Assembly and MIPS Assembly.
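
To make the link between a mnemonic and its binary code concrete, here is a minimal Python sketch that encodes the MIPS instruction `add $d, $s, $t` as a 32-bit word. The function name encode_add is hypothetical; the field layout (opcode 0, function code 0x20) is the standard MIPS R-type format.

```python
def encode_add(d: int, s: int, t: int) -> int:
    """Encode the MIPS R-type instruction `add $d, $s, $t` as a 32-bit word."""
    for reg in (d, s, t):
        if not 0 <= reg < 32:
            raise ValueError("register numbers must be in 0..31")
    opcode = 0b000000  # all R-type instructions share opcode 0
    shamt = 0          # shift amount, unused by add
    funct = 0b100000   # function code that selects add
    return (opcode << 26) | (s << 21) | (t << 16) | (d << 11) | (shamt << 6) | funct


word = encode_add(3, 1, 2)   # add $3, $1, $2
print(f"{word:#010x}")       # prints 0x00221820
print(f"{word:032b}")        # the same word, bit by bit
```

The mnemonic ADD and the three register names carry exactly the information packed into these bit fields, which is why each architecture needs its own assembly language.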

Assembler

Definition

An assembler translates assembly instructions into machine instructions and stores them in a relocatable object file (*.o).

Note

Assembler directives are not translated to machine instructions. Examples from ARM V7 Assembly:

  • ALIGN starts the following instruction/data at a word-aligned address
  • AREA creates a named memory region
  • DCB, DCW, DCD declare byte/halfword/word data within the code area
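
As a rough illustration of what the data directives do to the output image, the following Python sketch mimics DCB, DCW, DCD and ALIGN. The helper names are hypothetical and only mirror the directive names. Little-endian byte order and zero padding are assumptions of this sketch, not something the notes above specify.

```python
import struct


def dcb(*values: int) -> bytes:
    """Declare bytes (1 byte each)."""
    return b"".join(struct.pack("<B", v) for v in values)


def dcw(*values: int) -> bytes:
    """Declare halfwords (2 bytes each, little-endian assumed)."""
    return b"".join(struct.pack("<H", v) for v in values)


def dcd(*values: int) -> bytes:
    """Declare words (4 bytes each, little-endian assumed)."""
    return b"".join(struct.pack("<I", v) for v in values)


def align(image: bytearray, boundary: int = 4) -> None:
    """Pad with zero bytes until the next address is a multiple of boundary."""
    while len(image) % boundary != 0:
        image.append(0)


image = bytearray()
image += dcb(0x41)     # 1 byte at offset 0
align(image)           # pads offsets 1..3 so the next item starts at offset 4
image += dcd(5)        # 4 bytes at offset 4
image += dcw(0x1234)   # 2 bytes at offset 8
print(image.hex(" "))  # prints 41 00 00 00 05 00 00 00 34 12
```

None of these calls produces an instruction; they only place data or padding, which is the sense in which directives are "not translated to machine instructions".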

Labels

Info

To jump to a different line of the program, we would have to manually set the program counter to point to that line. But if we add or remove lines, we would have to update every such value. Instead, we can mark locations in the program with labels and jump to those labels by name.

Example

For MIPS Assembly in Scala for CS241E:

val label = new Label("label")
Seq(
    SLT(Reg(2), Reg(1), Reg(0)),
    beq(Reg(2), Reg(0), label),
    SUB(Reg(1), Reg(0), Reg(1)),
    Define(label),
    JR(Reg(31))
)

Example

In CS241E MIPS Assembly in Scala:

A program that finds the absolute value:

SLT(Reg(2), Reg(1), Reg(0)) // Reg(0) is always the constant 0
BEQ(Reg(2), Reg(0), 1)
SUB(Reg(1), Reg(0), Reg(1))
JR(Reg(31))                 // stop the program

But if we add a line after the branch, we need to update the number parameter. We can use labels instead:

SLT(Reg(2), Reg(1), Reg(0)) // Reg(0) is always the constant 0
BEQ(Reg(2), Reg(0), label)
SUB(Reg(1), Reg(0), Reg(1))
Define(label)
JR(Reg(31))                 // stop the program

Implementation

The assembler makes two passes to eliminate labels from the machine code. Two passes are necessary because a label may be used before it is defined (a forward reference), so its address is not yet known when the assembler first reaches that use.

  • Pass 1:
    • Determines address of each instruction
    • Creates a symbol table
      • A map from each label name to its address, computed using an instruction location counter (ILC)
  • Pass 2:
    • Encode assembly instructions to machine code
    • Replace labels with addresses/offsets based on symbol table
    • Emit a relocatable object file (*.o)
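
The two passes above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the CS241E or ARM assembler: every instruction is assumed to be 4 bytes, instructions are plain tuples, a string operand is treated as a label, labels are resolved to MIPS-style branch offsets (in words, relative to the already-incremented PC), and the binary encoding and object-file steps are left out. The names pass1, pass2 and the "define" marker are hypothetical.

```python
WORD = 4  # assumed fixed instruction size in bytes


def pass1(program):
    """Pass 1: walk the program with an ILC and build the symbol table."""
    symbols = {}
    ilc = 0
    for item in program:
        if item[0] == "define":
            symbols[item[1]] = ilc   # a label names the next instruction's address
        else:
            ilc += WORD              # only real instructions occupy space
    return symbols


def pass2(program, symbols):
    """Pass 2: drop label definitions and replace label uses with offsets."""
    resolved = []
    ilc = 0
    for item in program:
        if item[0] == "define":
            continue
        mnemonic, *operands = item
        fixed = []
        for operand in operands:
            if isinstance(operand, str):
                # Offset in words from the already-incremented PC (ilc + 4).
                operand = (symbols[operand] - (ilc + WORD)) // WORD
            fixed.append(operand)
        resolved.append((mnemonic, *fixed))
        ilc += WORD
    return resolved


# The absolute-value program from the Labels section, with a forward reference.
program = [
    ("slt", 2, 1, 0),
    ("beq", 2, 0, "end"),   # "end" is used before it is defined
    ("sub", 1, 0, 1),
    ("define", "end"),
    ("jr", 31),
]

symbols = pass1(program)              # {'end': 12}
resolved = pass2(program, symbols)
print(resolved[1])                    # prints ('beq', 2, 0, 1)
```

The resolved branch carries the offset 1, the same number that had to be written by hand in the label-free version of the program.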
Example

ARM V7 Assembly example:

;   Code              ; Size | ILC (instruction location counter)
    AREA MyCode, CODE
__MAIN
    MOV R0, #0        ; 4    | 0
    ADR R1, ARRAY     ; 2    | 4
    LDR R2, N         ; 2    | 6
LOOP LDR R3, [R1], #4 ; 4    | 8
    ADD R0, R3        ; 2    | 12
    SUBS R2, #1       ; 2    | 14
    BGT LOOP          ; 2    | 16
    LDR R1, =SUM      ; 2    | 18
    STR R0, [R1]      ; 2    | 20
    B .               ; 2    | 22
    ALIGN
N
    DCD 5             ; 4    | 24
ARRAY
    DCD 1, 2, 3, 4, 5 ; 20   | 28
    END

Symbol table for MyCode:

Label   Address
__MAIN  0000 0000
LOOP    0000 0008
N       0000 0018
ARRAY   0000 001C
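
The symbol table can be reproduced by running an ILC over the listing. In this Python sketch the instruction and data sizes are copied from the Size column of the listing rather than derived from the encodings, each label is attached to the item that follows it, and the function name build_symbol_table is hypothetical.

```python
def build_symbol_table(lines):
    """Pass 1 over (label, size) pairs; a size of "ALIGN" means word-align the ILC."""
    table = {}
    ilc = 0
    for label, size in lines:
        if size == "ALIGN":
            ilc = (ilc + 3) & ~3      # round up to the next multiple of 4
            continue
        if label is not None:
            table[label] = ilc        # record the address before advancing
        ilc += size
    return table


lines = [
    ("__MAIN", 4),    # MOV R0, #0
    (None, 2),        # ADR R1, ARRAY
    (None, 2),        # LDR R2, N
    ("LOOP", 4),      # LDR R3, [R1], #4
    (None, 2),        # ADD R0, R3
    (None, 2),        # SUBS R2, #1
    (None, 2),        # BGT LOOP
    (None, 2),        # LDR R1, =SUM
    (None, 2),        # STR R0, [R1]
    (None, 2),        # B .
    (None, "ALIGN"),  # ILC is already 24, so no padding is needed
    ("N", 4),         # DCD 5
    ("ARRAY", 20),    # DCD 1, 2, 3, 4, 5
]

for label, address in build_symbol_table(lines).items():
    print(f"{label:<7} {address:08X}")
```

The addresses come out as 0, 8, 0x18 and 0x1C, matching the ILC column of the listing.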

Pseudo-Instructions

Definition

A pseudo-instruction is not a real machine instruction; the assembler translates it into one or more real instructions at assembly time.

Examples

ARM V7 Assembly examples:

  • LDR =
  • MOV32
  • CPY
  • NEG
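
As one illustration, MOV32 loads a 32-bit constant and is expanded by the assembler into a MOVW (low 16 bits) followed by a MOVT (high 16 bits). The Python sketch below performs that expansion on text; the function name expand_mov32 and the textual output format are assumptions made for illustration.

```python
from typing import List


def expand_mov32(rd: str, value: int) -> List[str]:
    """Expand the pseudo-instruction MOV32 rd, #value into MOVW + MOVT."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value must fit in 32 bits")
    low = value & 0xFFFF            # MOVW writes the low halfword and zeroes the high one
    high = (value >> 16) & 0xFFFF   # MOVT then writes the high halfword
    return [f"MOVW {rd}, #0x{low:04X}", f"MOVT {rd}, #0x{high:04X}"]


for line in expand_mov32("R0", 0x12345678):
    print(line)
# prints:
# MOVW R0, #0x5678
# MOVT R0, #0x1234
```

The expansion happens while assembling, so the object file contains only the two real instructions.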

Branch Instructions

Info

Branch instructions (jump in MIPS Assembly, branch in ARM V7 Assembly) add an offset to the program counter (which is first incremented by 4).

Uses: Conditional Structures and Subroutines
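
The offset arithmetic can be sketched in Python. This assumes MIPS-style branches, where the offset counts 4-byte words and is added to the already-incremented program counter; the function names are hypothetical.

```python
def branch_target(pc: int, offset_words: int) -> int:
    """Address reached by a branch at address pc with the given word offset."""
    return (pc + 4) + 4 * offset_words


def branch_offset(pc: int, target: int) -> int:
    """Word offset an assembler would store for a branch at pc to reach target."""
    delta = target - (pc + 4)
    if delta % 4 != 0:
        raise ValueError("target is not word-aligned relative to the branch")
    return delta // 4


# The BEQ in the absolute-value program sits at address 4 and skips one instruction.
print(branch_target(4, 1))    # prints 12, the address of JR
print(branch_offset(4, 12))   # prints 1
print(branch_offset(8, 0))    # prints -3, a backward branch as used in loops
```

Because the offset is relative, the same encoded branch works wherever the code is placed in memory.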

Loaders

Definition

A loader has the following tasks:

  • Copy machine instructions into the text segment of memory
  • Initialize data in the data segment of memory
  • Initialize the heap and stack
  • Start the program
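
The loader's tasks can be modelled with a toy Python sketch. The memory layout here is an assumption made for illustration: text at address 0, data at the next word boundary, the heap starting after the data, and the stack growing down from the top of memory. The names Process and load are hypothetical, and "starting the program" is reduced to setting the program counter to the entry point.

```python
from dataclasses import dataclass


@dataclass
class Process:
    memory: bytearray
    pc: int          # address of the first instruction to execute
    sp: int          # initial stack pointer
    heap_start: int  # first address available to the heap


def word_align(address: int) -> int:
    """Round an address up to the next multiple of 4."""
    return (address + 3) & ~3


def load(text: bytes, data: bytes, memory_size: int = 1024) -> Process:
    text_base = 0
    data_base = word_align(text_base + len(text))
    heap_start = word_align(data_base + len(data))
    if heap_start > memory_size:
        raise MemoryError("program does not fit in memory")

    memory = bytearray(memory_size)
    memory[text_base:text_base + len(text)] = text   # copy instructions (text segment)
    memory[data_base:data_base + len(data)] = data   # initialize the data segment
    sp = memory_size                                 # stack starts at the top, grows down
    return Process(memory=memory, pc=text_base, sp=sp, heap_start=heap_start)


process = load(text=b"\x01\x02\x03\x04\x05\x06", data=b"\xAA\xBB")
print(process.pc, process.heap_start, process.sp)   # prints 0 12 1024
```

A real loader also applies relocations from the object file and hands control to the program; both are outside this sketch.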