Assembly
Machine Language
Machine language is the set of binary-encoded instructions that a processor executes directly.
Assembly Language
Assembly language is a human-readable form of machine language that uses mnemonics (e.g., ADD, LDR) instead of binary codes.
Each CPU architecture has its own assembly language, for example ARM V7 Assembly and MIPS Assembly.
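To make the link between a mnemonic and its binary code concrete, here is a minimal Python sketch that packs a MIPS R-type instruction into a 32-bit word. The field layout and funct codes are the standard MIPS ones; the function name `encode_r_type` and the `FUNCT` table are made up for this illustration and are not part of any course library.

```python
# MIPS R-type layout (32 bits):
# opcode(6)=0 | rs(5) | rt(5) | rd(5) | shamt(5)=0 | funct(6)
FUNCT = {"ADD": 0x20, "SUB": 0x22, "SLT": 0x2A}  # illustrative subset


def encode_r_type(mnemonic, rd, rs, rt):
    """Pack `mnemonic rd, rs, rt` into a 32-bit machine word."""
    for reg in (rd, rs, rt):
        if not 0 <= reg < 32:
            raise ValueError(f"register out of range: {reg}")
    return (rs << 21) | (rt << 16) | (rd << 11) | FUNCT[mnemonic]


word = encode_r_type("ADD", rd=3, rs=1, rt=2)  # add $3, $1, $2
print(f"0x{word:08X}")  # prints 0x00221820
print(f"{word:032b}")   # the same word as 32 binary digits
```

The assembly line `add $3, $1, $2` and the word `0x00221820` carry the same information; the assembler only changes the representation.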
Assembler
An assembler translates assembly instructions into machine instructions and writes them to a relocatable object file (*.o).
Labels
Without labels, jumping to a different line of the program means hard-coding the address or offset that changes the program counter. If we add or remove lines, we have to update these values by hand. Instead, we can mark locations in the program with labels and jump to those labels by name.
- Defining a label associates the name of the label with a memory address
- Using a label inserts the memory address corresponding to the label in the machine language
For MIPS Assembly in Scala for CS241E:
val label = new Label("label")
Seq(
  SLT(Reg(2), Reg(1), Reg(0)),
  beq(Reg(2), Reg(0), label),
  SUB(Reg(1), Reg(0), Reg(1)),
  Define(label),
  JR(Reg(31))
)
In CS241E MIPS Assembly in Scala:
A program that replaces the value in Reg(1) with its absolute value:
SLT(Reg(2), Reg(1), Reg(0)) // Reg(0) is always the constant 0
BEQ(Reg(2), Reg(0), 1)
SUB(Reg(1), Reg(0), Reg(1))
JR(Reg(31)) // stop the program
But if we add a line between the branch and its target, we need to update the hard-coded offset (the 1 in BEQ). We can use labels instead:
SLT(Reg(2), Reg(1), Reg(0)) // Reg(0) is always the constant 0
BEQ(Reg(2), Reg(0), label)
SUB(Reg(1), Reg(0), Reg(1))
Define(label)
JR(Reg(31)) // stop the program
The assembler makes two passes to eliminate labels from the machine code. Two passes are necessary because a label may be used before it is defined (a forward reference), so its address is not yet known when the use is first encountered.
- Pass 1:
  - Determines the address of each instruction
  - Creates a symbol table
    - A map from label names -> addresses, built using an instruction location counter (ILC)
- Pass 2:
  - Encodes assembly instructions to machine code
  - Replaces labels with addresses/offsets based on the symbol table
  - Emits a relocatable object file (*.o)
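The two passes can be sketched in Python on the absolute-value program. This is a toy model under stated assumptions: every instruction is one 4-byte word, instructions are plain tuples rather than encoded binary, label uses are strings, and a label is resolved to a word offset relative to the already-incremented program counter. The names `pass1`, `pass2` and `assemble` are hypothetical and do not come from the CS241E library.

```python
WORD = 4  # assumption: every instruction occupies one 4-byte word

# ("define", name) marks a label; every other tuple is an instruction.
program = [
    ("SLT", 2, 1, 0),
    ("BEQ", 2, 0, "label"),  # forward reference: used before it is defined
    ("SUB", 1, 0, 1),
    ("define", "label"),
    ("JR", 31),
]


def pass1(program):
    """Walk the program with an ILC and build the symbol table."""
    symbols, ilc = {}, 0
    for item in program:
        if item[0] == "define":
            if item[1] in symbols:
                raise ValueError(f"duplicate label: {item[1]}")
            symbols[item[1]] = ilc  # the label names the next instruction
        else:
            ilc += WORD
    return symbols


def pass2(program, symbols):
    """Drop label definitions and replace label uses with word offsets."""
    machine, ilc = [], 0
    for item in program:
        if item[0] == "define":
            continue
        ilc += WORD  # the PC already points past this instruction
        operands = []
        for op in item[1:]:
            if isinstance(op, str):  # a label use
                if op not in symbols:
                    raise KeyError(f"undefined label: {op}")
                op = (symbols[op] - ilc) // WORD
            operands.append(op)
        machine.append((item[0], *operands))
    return machine


def assemble(program):
    return pass2(program, pass1(program))


print(pass1(program))     # {'label': 12}
print(assemble(program))  # the BEQ operand "label" becomes the offset 1
```

The resolved offset is 1, the same number that was hard-coded in the label-free version of the program, but now it is recomputed automatically whenever lines are added or removed.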
ARM V7 Assembly example:
                                ; Code Size | ILC (instruction location counter)
        AREA MyCode, CODE
__MAIN
        MOV  R0, #0             ; 4  | 0
        ADR  R1, ARRAY          ; 2  | 4
        LDR  R2, N              ; 2  | 6
LOOP    LDR  R3, [R1], #4       ; 4  | 8
        ADD  R0, R3             ; 2  | 12
        SUBS R2, #1             ; 2  | 14
        BGT  LOOP               ; 2  | 16
        LDR  R1, =SUM           ; 2  | 18
        STR  R0, [R1]           ; 2  | 20
        B    .                  ; 2  | 22
        ALIGN
N
        DCD  5                  ; 4  | 24
ARRAY
        DCD  1, 2, 3, 4, 5      ; 20 | 28
        END
Symbol table for MyCode:
Label | Address |
---|---|
__MAIN | 0000 0000 |
LOOP | 0000 0008 |
N | 0000 0018 |
ARRAY | 0000 001C |
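The addresses in the symbol table follow mechanically from the Code Size column of the listing. The Python sketch below recomputes them; the sizes are copied from the listing, while the tuple layout and the function name `build_symbol_table` are assumptions made for this illustration. ALIGN is modeled as padding the ILC up to the next multiple of 4.

```python
# (label or None, source text, size in bytes); size None marks ALIGN.
listing = [
    ("__MAIN", "MOV  R0, #0",        4),
    (None,     "ADR  R1, ARRAY",     2),
    (None,     "LDR  R2, N",         2),
    ("LOOP",   "LDR  R3, [R1], #4",  4),
    (None,     "ADD  R0, R3",        2),
    (None,     "SUBS R2, #1",        2),
    (None,     "BGT  LOOP",          2),
    (None,     "LDR  R1, =SUM",      2),
    (None,     "STR  R0, [R1]",      2),
    (None,     "B    .",             2),
    (None,     "ALIGN",              None),
    ("N",      "DCD  5",             4),
    ("ARRAY",  "DCD  1, 2, 3, 4, 5", 20),
]


def build_symbol_table(listing, alignment=4):
    """Pass 1: advance the ILC by each item's size and record label addresses."""
    symbols, ilc = {}, 0
    for label, _text, size in listing:
        if size is None:  # ALIGN: round the ILC up to the alignment
            ilc = -(-ilc // alignment) * alignment
            continue
        if label is not None:
            symbols[label] = ilc
        ilc += size
    return symbols


for name, address in build_symbol_table(listing).items():
    print(f"{name:<7} {address:08X}")
```

In this listing the ILC is already 24 when ALIGN is reached, so no padding bytes are inserted and N lands at hexadecimal 18.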
Pseudo-Instructions
A pseudo-instruction is not a real machine instruction; the assembler translates it into one or more real instructions at assembly time (not at run time)
ARM V7 Assembly examples:
LDR =
MOV32
CPY
NEG
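As a rough illustration of what "translated at assembly time" means, the Python sketch below rewrites three of these pseudo-instructions into the real instructions that the armasm documentation describes for them (a MOVW/MOVT pair for MOV32, MOV for CPY, RSBS with #0 for NEG). The function `expand` and its string-based interface are hypothetical. LDR = is left out because its expansion depends on the constant: it becomes either a single move or a load from a literal pool.

```python
def expand(mnemonic, *ops):
    """Return the real instruction(s) that replace a pseudo-instruction."""
    if mnemonic == "MOV32":  # MOV32 Rd, #imm32
        rd = ops[0]
        imm = int(ops[1].lstrip("#"), 0)
        low, high = imm & 0xFFFF, (imm >> 16) & 0xFFFF
        return [f"MOVW {rd}, #0x{low:04X}",    # write the low 16 bits
                f"MOVT {rd}, #0x{high:04X}"]   # write the high 16 bits
    if mnemonic == "CPY":    # CPY Rd, Rm
        rd, rm = ops
        return [f"MOV {rd}, {rm}"]
    if mnemonic == "NEG":    # NEG Rd, Rm
        rd, rm = ops
        return [f"RSBS {rd}, {rm}, #0"]        # Rd = 0 - Rm
    return [f"{mnemonic} {', '.join(ops)}"]    # real instruction: unchanged


for line in expand("MOV32", "R0", "#0x12345678"):
    print(line)
```

One pseudo-instruction can occupy more than one instruction slot, which is why the assembler must expand it before addresses are assigned in pass 1.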
Branch Instructions
Branch instructions (jump in MIPS Assembly, branch in ARM V7 Assembly) add an offset to the program counter, which has already been incremented by 4 by the time the offset is applied
Uses: Conditional Structures and Subroutines
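The arithmetic can be written out as a small Python sketch. It assumes the MIPS-style convention used in the absolute-value program: the offset is counted in 4-byte words and is relative to the address of the instruction after the branch. The helper names are made up for this example, and other architectures scale the offset differently.

```python
WORD = 4  # bytes per instruction (MIPS-style assumption)


def branch_target(pc, offset):
    """Address executed next when the branch located at `pc` is taken."""
    return (pc + WORD) + offset * WORD


def branch_offset(pc, target):
    """Inverse of branch_target: the offset an assembler would store."""
    delta = target - (pc + WORD)
    if delta % WORD != 0:
        raise ValueError("target is not word-aligned relative to the branch")
    return delta // WORD


# The BEQ of the absolute-value program sits at address 4 and has offset 1,
# so a taken branch skips the SUB at address 8 and lands on the JR at 12.
print(branch_target(4, 1))    # 12
print(branch_offset(4, 12))   # 1
```

An offset of 0 therefore falls through to the next instruction, and an offset of -1 branches back to the branch instruction itself.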
Loaders
A loader has the following tasks:
- Copy machine instructions into the text segment of memory
- Initialize data in the data segment of memory
- Initialize the heap and stack
- Start the program
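The four tasks can be mimicked with a toy Python model. Everything here is an assumption chosen for illustration: memory is a flat bytearray, the text segment starts at address 0, the data segment follows it directly, the heap starts right after the data, and the stack pointer starts at the top of memory. A real loader also deals with object file formats and relocation, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class Process:
    memory: bytearray
    pc: int          # address of the first instruction to execute
    sp: int          # stack pointer; the stack grows down from here
    heap_start: int  # first free address after the data segment


def load(text, data, mem_size=1024):
    """Toy loader: lay out text and data, then set up heap, stack and PC."""
    if len(text) + len(data) > mem_size:
        raise MemoryError("program does not fit in memory")
    memory = bytearray(mem_size)
    memory[0:len(text)] = text                          # 1. copy instructions (text segment)
    data_start = len(text)
    memory[data_start:data_start + len(data)] = data    # 2. initialize the data segment
    heap_start = data_start + len(data)                 # 3. heap after data, stack at the top
    return Process(memory, pc=0, sp=mem_size, heap_start=heap_start)  # 4. ready to start at pc


process = load(text=bytes([0x00, 0x22, 0x18, 0x20]), data=bytes([5, 0, 0, 0]), mem_size=64)
print(process.pc, process.heap_start, process.sp)  # 0 8 64
```

"Starting the program" is represented only by setting `pc` to the first instruction; actually executing it would require a simulator.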