Pipelining Hazards

Definition

Hazards are conditions in a pipeline that will lead to incorrect results if not mitigated.

Mitigation strategies: add hardware, stall the pipeline or fix the software (e.g. add no-ops)

Hazard types: data, control, structural

Stalling

Definition

Stalling the pipeline means holding an instruction in its current stage so that it does not progress.

Steps

To stall an instruction in ID:

  • Disable writes to the PC and IF/ID pipeline register
  • Set control signals in ID/EX to 0 (inserts a no-op into the pipeline)
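
The two steps above can be sketched as a small software model. This is only an illustration: the dictionary layout and the names `pc_write`, `if_id_write`, `id_ex_ctrl`, and `stall_id` are assumptions made for this sketch, not signals from a specific design.

```python
def stall_id(state):
    """Return a copy of the pipeline state with the instruction in ID stalled for one cycle."""
    new_state = dict(state)
    # Step 1: freeze the PC and the IF/ID register so the same
    # instructions are fetched and decoded again next cycle.
    new_state["pc_write"] = False
    new_state["if_id_write"] = False
    # Step 2: zero every control signal going into ID/EX, which
    # sends a no-op (bubble) down the rest of the pipeline.
    new_state["id_ex_ctrl"] = {name: 0 for name in state["id_ex_ctrl"]}
    return new_state

# Hypothetical state before the stall (signal names are illustrative).
state = {
    "pc_write": True,
    "if_id_write": True,
    "id_ex_ctrl": {"RegWrite": 1, "MemRead": 0, "MemWrite": 0, "ALUSrc": 1},
}
stalled = stall_id(state)
```

Once the hazard clears, the write enables are set back to true and the held instruction continues from ID.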

Data Hazards

Example

ADD  X2, X0, X1
ADDI X3, X2, #1

Hazard: the first instruction writes X2 and the second instruction reads it, but the ADDI reads the register file in ID before the ADD has written the new value back in WB.

Solution: without forwarding from the pipeline registers, we need a 2-cycle stall.

This assumes forwarding through the register file (i.e. it writes in the first half of the clock cycle and reads in the second half).

Observation: the result of X0 + X1 is present in the pipeline after clock cycle 3. This matters because we can use it earlier instead of stalling 2 cycles (see #Data Forwarding).
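
The 2-cycle figure can be checked with a little cycle arithmetic. The sketch below assumes a five-stage pipeline (IF, ID, EX, ME, WB), a producer fetched in cycle 1, and the register-file forwarding described above, so the consumer may read in ID during the same cycle the producer is in WB. The function name and the `distance` parameter are made up for this example.

```python
STAGES = ["IF", "ID", "EX", "ME", "WB"]

def stalls_without_forwarding(distance):
    """Stall cycles for a consumer that is `distance` instructions after its producer."""
    # Producer is fetched in cycle 1, so it reaches WB in cycle 5.
    producer_wb = 1 + STAGES.index("WB")
    # With no stalls, the consumer is fetched in cycle 1 + distance
    # and decoded (register read) one cycle later.
    consumer_id = 1 + distance + STAGES.index("ID")
    # The register read must not happen before the producer's WB cycle.
    return max(0, producer_wb - consumer_id)

for distance in (1, 2, 3):
    print(distance, stalls_without_forwarding(distance))
```

A distance of 1 corresponds to the ADD/ADDI pair above.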

Data Forwarding

Definition

Data forwarding allows us to access data as soon as it's ready, so we don't have to wait for writeback to complete.

Example

ADD  X2, X0, X1
ADDI X3, X2, #1
ADD  X4, X5, X2

Now instruction 3 also uses X2. Should we stall the pipeline again? That would defeat the point of pipelining; we might as well use a single-cycle processor. Instead, we can forward the data.

Info

The forwarding unit controls the ALU input (A and B) mux selects. The logic for A is as follows:

// If the instruction now in ME writes a register that the instruction now in EX reads as Rn
if (EX/ME.RegWrite && EX/ME.Rd == ID/EX.Rn) {
    ForwardA = 2 // Forward from ME
} else if (ME/WB.RegWrite && ME/WB.Rd == ID/EX.Rn) {
    ForwardA = 1 // Forward from WB
} else {
    ForwardA = 0 // No forwarding
}
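
As a runnable sketch of the same priority logic, the function below computes both mux selects. The select for B is assumed to mirror A with `Rm` in place of `Rn`; the dictionary fields and function names are illustrative. Like the logic above, this sketch does not special-case the zero register.

```python
def forward_select(ex_me, me_wb, src_reg):
    """Mux select for one ALU input: 2 = from ME, 1 = from WB, 0 = register file value."""
    # The newer result (in ME) takes priority over the older one (in WB).
    if ex_me["RegWrite"] and ex_me["Rd"] == src_reg:
        return 2
    if me_wb["RegWrite"] and me_wb["Rd"] == src_reg:
        return 1
    return 0

def forwarding_unit(ex_me, me_wb, id_ex):
    """Return (ForwardA, ForwardB) for the instruction currently in EX."""
    forward_a = forward_select(ex_me, me_wb, id_ex["Rn"])
    forward_b = forward_select(ex_me, me_wb, id_ex["Rm"])  # assumed analogous to A
    return forward_a, forward_b

# ADD X4, X5, X2 is in EX; ADDI X3, X2, #1 is in ME; ADD X2, X0, X1 is in WB.
selects = forwarding_unit(
    ex_me={"RegWrite": 1, "Rd": 3},
    me_wb={"RegWrite": 1, "Rd": 2},
    id_ex={"Rn": 5, "Rm": 2},
)
```

Here only input B needs forwarding, and it comes from WB because the ADD that produced X2 is two instructions ahead.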

Load Hazards

Definition

A load hazard is a special case of a data hazard: the value is produced in ME instead of EX. The following dependent instruction must stall for 1 cycle (even with forwarding).

Example

MOV X1, #3
LDR X2, [X0, #8]
ADD X3, X1, X2

We need to stall instruction 3 (and consequently any subsequent instructions). After 1 cycle, the load hazard is gone and the stall signal is de-asserted.

Alternatively, instead of stalling, a compiler can schedule (re-order) instructions to avoid a load hazard:

```armasm
LDR X2, [X0, #8]
MOV X1, #3
ADD X3, X1, X2
```

Info

Logic to detect load hazards while in ID:

if (ID/EX.MemRead && (ID/EX.Rd == IF/ID.Rn || ID/EX.Rd == IF/ID.Rm)) {
    stall = 1
} else {
    stall = 0
}

In other words, if the instruction in EX is an LDR (only LDR asserts the MemRead control signal) and its destination register equals any of the source registers of the instruction in ID, then stall.
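
The detection rule can be tried on the example from this section. This is a minimal sketch: the dictionaries stand in for the ID/EX and IF/ID pipeline register fields, and the function name is an assumption.

```python
def load_hazard_stall(id_ex, if_id):
    """Return 1 if the instruction in ID must stall behind a load in EX, else 0."""
    is_load_in_ex = bool(id_ex["MemRead"])
    uses_loaded_reg = id_ex["Rd"] in (if_id["Rn"], if_id["Rm"])
    return 1 if is_load_in_ex and uses_loaded_reg else 0

# Original order: LDR X2, [X0, #8] is in EX while ADD X3, X1, X2 is in ID.
ldr_in_ex = {"MemRead": 1, "Rd": 2}
add_in_id = {"Rn": 1, "Rm": 2}
stall_original = load_hazard_stall(ldr_in_ex, add_in_id)

# Reordered: MOV X1, #3 now sits between them, so MOV is in EX when ADD is in ID.
mov_in_ex = {"MemRead": 0, "Rd": 1}
stall_reordered = load_hazard_stall(mov_in_ex, add_in_id)
```

In the reordered case the ADD still depends on both earlier results, but ordinary forwarding covers them and no stall is raised.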

Control Hazards

Definition

Control hazards are caused by branch instructions. They are an issue because the instructions that follow a branch are already lined up in the pipeline by the time the branch outcome and target are known.

Diagram

Info

The branch penalty can be reduced to 1 cycle by:

  1. Moving the branch adder from EX to ID and connecting its output to the PC mux
  2. Adding forwarding paths from ME and WB to ID, which conditional branches need for checking the branch condition

Diagram

Conditional Branching

Info

If a conditional branch depends on a preceding:

  1. ALU instruction, it must stall 1 cycle
  2. LDR instruction, it must stall 2 cycles

Example

Case 1: conditional branch dependent on preceding ADD

ADD X1, X2, X3
CBZ X1, label
ORR X4, X5, X6

Note: if the CBZ is taken, the ORR will be squashed
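
The stall counts for conditional branches (1 cycle after an ALU instruction, 2 after an LDR) follow from when the needed value first becomes forwardable to ID. The sketch below assumes the branch is resolved in ID with forwarding paths from ME and WB, and counts cycles from the producer's IF as cycle 1. The table and function names are made up for illustration.

```python
# First cycle in which a branch sitting in ID can see the producer's result:
# an ALU result is forwardable from ME (cycle 4), a loaded value from WB (cycle 5).
FIRST_USABLE_CYCLE = {"ALU": 4, "LDR": 5}

def branch_stalls(producer, distance=1):
    """Stall cycles for a conditional branch `distance` instructions after `producer`."""
    # With no stalls, the branch is in ID one cycle after its own fetch.
    natural_id_cycle = 2 + distance
    return max(0, FIRST_USABLE_CYCLE[producer] - natural_id_cycle)

# Case 1 above: CBZ X1 immediately after ADD X1, X2, X3.
case1 = branch_stalls("ALU")
```

Placing one independent instruction between the producer and the branch reduces each count by one in this model.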

Impact on CPI (cycles per instruction)

Calculating the branch penalty:
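
A common first-order model, given here as a hedged sketch, adds the average number of branch stall cycles per instruction to the base CPI: effective CPI = base CPI + (fraction of instructions that are branches) x (fraction of those that pay the penalty) x (penalty in cycles). The function name `effective_cpi` and the numbers in the usage lines are illustrative assumptions, not measurements.

```python
def effective_cpi(base_cpi, branch_fraction, penalized_fraction, penalty_cycles):
    """Average CPI once branch stall cycles are included."""
    stall_cycles_per_instr = branch_fraction * penalized_fraction * penalty_cycles
    return base_cpi + stall_cycles_per_instr

# Hypothetical workload: 20% branches, half of them pay a 1-cycle penalty.
cpi = effective_cpi(base_cpi=1.0, branch_fraction=0.2,
                    penalized_fraction=0.5, penalty_cycles=1)
print(f"effective CPI = {cpi:.2f}")
```

The same function shows why shrinking the penalty or the fraction of penalized branches both help: each term scales the added stall cycles linearly.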

Conditional Branch Prediction

Steps

  1. Predict the branch decision (taken or not taken) and the branch target address
  2. Speculatively execute the predicted path
  3. Execute the branch to check the prediction and recover if wrong (squash the mispredicted instructions and start fetching correct instructions)

Static prediction (without runtime history):

Dynamic Prediction (with runtime history):

if (instr.address == PC && predictor == T) {
    PC <- targetAddress
} else {
    PC <- PC + 4
}
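
A small model of this lookup is sketched below. The notes do not say how the predictor state is updated, so this example assumes a 2-bit saturating counter per branch, which is one common choice. The class, its method names, and the initial counter value are assumptions for illustration.

```python
class BranchPredictor:
    """Table of known branches: address -> [target address, 2-bit counter]."""

    def __init__(self):
        self.table = {}

    def next_pc(self, pc):
        """Pick the next fetch address, mirroring the if/else above."""
        entry = self.table.get(pc)
        if entry is not None and entry[1] >= 2:  # entry found and predictor == T
            return entry[0]
        return pc + 4

    def update(self, pc, taken, target):
        """Record the real outcome once the branch has executed."""
        # New entries start at 1 (weakly not taken); this is an assumption.
        entry = self.table.setdefault(pc, [target, 1])
        entry[0] = target
        if taken:
            entry[1] = min(3, entry[1] + 1)
        else:
            entry[1] = max(0, entry[1] - 1)

predictor = BranchPredictor()
first_guess = predictor.next_pc(0x100)   # unknown branch: fall through
predictor.update(0x100, taken=True, target=0x80)
second_guess = predictor.next_pc(0x100)  # counter is now in a "taken" state
```

With a 2-bit counter, a single surprising outcome in a long run of taken branches does not immediately flip the prediction.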

Structural Hazards

Solution: add more hardware!