Expressions
Example: a * b + c * d
We want to change this into machine language.
Diagram
First, we parse it into a tree
How should we evaluate this?
Techniques for Evaluating Expressions
As a convention, we will store the computed value in register 3, with the Scala alias Reg.result
.
Technique #1: Using a Stack
See The Stack
Implementation
- Evaluate subexpression e1
- Push result to stack
- Evaluate subexpression e2
- Pop top of stack into a scratch register
Result register has value of e1, scratch register has value of e2 - Apply operator to result and scratch registers, store result in result register
Pseudocode:
/**
* Generate assembly language code to evaluate a binary
* expression "e1 operator e2", where e1 and e2 are
* subexpressions and operator is some arithmetic operator,
* and place the resulting value in Reg.result.
*/
def generateCode(e1, operator, e2): Code = block(
generateCode(e1.e1),
push(Reg(3)),
generateCode(e2.e1),
pop(Reg(4)),
Reg(3) := Reg(4) operator Reg(3)
)
Execution trace pseudocode for example:
# generateCode (a * b + c * d)
# generateCode(a * b)
read a into $3
push $3
read b into $3
pop $4
$3 = $4 * $3
push $3
# generateCode(c * d)
read c $3
push $3
read d $3
pop $4
$3 = $4 * $3
pop $4
$3 = $3 + $4
Properties
- Upsides:
- Can generate correct code for expressions of arbitrary depth
- Downsides:
- Inefficient, many pushes and pops are slow
- Hard to optimize for both compiler and human
Technique #2: Using Variables (Virtual Registers)
See Variables
We can evaluate the expression as follows:
Implementation
- Evaluate subexpression e1, store result in a variable
- Evaluate subexpression e2, store result in a variable
- Apply operator to the two variables, store result in a variable
Pseudocode:
/**
* Generate assembly language code to evaluate a binary expression
* e1 operator e2, where e1 and e2 are subexpressions and operator
* is some arithmetic operator, and place the resulting value
* in a variable .
*/
def generateCode(e1, operator, e2): (Code, Variable) = {
(c1, v1) = generateCode(e1)
(c2, v2) = generateCode(e2)
v3 = new Variable()
code = block(c1, c2, v3 = v1 operator v2)
(code , v3)
}
Execution trace pseudocode for example:
# generateCode(a * b + c * d)
# generateCode(a * b)
v1 = a * b
# generateCode(c * d)
v2 = c * d
v3 = v1 + v2
Properties
- Upsides:
- Generates code that's easier to optimize
- Downsides:
- Machine language arithmetic operates on registers, not variables, so we need multiple instructions to express expressions such as
a * b
- We need a way to map variables to registers rather than stack locations, and to map a large number of variables to a limited number of registers (which we could alleviate with Register Allocation)
- Machine language arithmetic operates on registers, not variables, so we need multiple instructions to express expressions such as
Technique #3: Variables for Temporary Values, Arithmetic Operations on Registers
Implementation
- Evaluate subexpression 1
- Write result to a variable,
t1
- Evaluate subexpression 2
- Read variable
t1
into scratch register
Result register has value of e2, scratch register has value of e1 - Apply operator to result and scratch registers, store result in result register
Pseudocode:
/**
* Generate assembly language code to evaluate a binary
* expression "e1 operator e2", where e1 and e2 are
* subexpressions and operator is some arithmetic operator,
* and place the resulting value in Reg.result.
*/
def generateCode (e1, operator, e2): Code = {
t1 = new Variable()
block(
generateCode(e1),
write(t1, Reg(3)),
generateCode(e2),
read(Reg(4), t1),
Reg(3) := Reg(4) op Reg(3)
)
}
Execution trace pseudocode for example:
# generateCode(a * b + c * d)
# generateCode(a * b)
read a into $3
write $3 into var t2
read b into $3
read var t2 into $4
$3 = $3 * $4
write $3 into var t1
# generateCode(c * d)
read c into $3
write $3 into var t3
read d into $3
read var t3 into $4
$3 = $3 * $4
read var t1 into $4
$3 = $3 + $4
- Note that this is almost just as inefficient as technique 1. It is easier to understand and optimize further, however
- Compared to technique 2, the code does not need a register allocator
A real compiler would use technique 2 with a register allocator. However, for CS241E, we want to avoid that, so we will use this technique.