Pipelining
Example
Analogy: doing laundry.
There are 4 steps to doing 1 load of laundry:
- Wash
- Dry
- Fold
- Put away
How should we do 3 loads of laundry?
Since we don't need the washing machine for any step after step 1, we can start the next load in it as soon as the first load moves to the dryer. The same principle applies to the other stages.
Observations:
- Time (latency) for one load remains the same
- Overlapping increases throughput (loads, or instructions, completed per unit time) but does not shorten the time for any individual one
- If you can keep the pipeline full, one load finishes every cycle
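The observations above can be checked with a small sketch. It assumes every step takes one equal-length time slot and that each machine handles one load at a time; the function names `sequential_time` and `pipelined_time` are illustrative and not part of the notes.

```python
def sequential_time(loads: int, steps: int = 4) -> int:
    """Time slots needed when each load finishes every step before the next load starts."""
    return loads * steps


def pipelined_time(loads: int, steps: int = 4) -> int:
    """Time slots needed when a new load starts as soon as the first step is free."""
    if loads == 0:
        return 0
    # The first load needs `steps` slots; every later load finishes one slot after the previous one.
    return steps + (loads - 1)


if __name__ == "__main__":
    for n in (1, 3, 100):
        print(f"{n} loads: sequential={sequential_time(n)}, pipelined={pipelined_time(n)}")
```

Under these assumptions, one load takes 4 slots either way (latency is unchanged), while 3 loads take 12 slots sequentially and 6 slots when overlapped.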
Non-pipelined (single-cycle) processor
Diagram
$$t_{\text{cycle}} = 0.8\ \mathrm{ns}, \quad \mathrm{CPI} = 1$$
Pipelined processor
Diagram
When the pipeline is full: $$t_{\text{cycle}} = 0.2\ \mathrm{ns}, \quad \mathrm{CPI} = 1$$
Remark
Filling Time
Equation
where
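A common way to write the pipelined execution time is sketched here. It assumes $k$ equal-length stages and no stalls; the symbol $k$ is notation chosen for this sketch, and $IC$ is the instruction count.

$$T_{\text{pipelined}} = (k - 1 + IC) \cdot t_{\text{cycle}}$$

The first $k - 1$ cycles are the filling time. After that, one instruction completes per cycle, so for large $IC$ the time per instruction approaches $t_{\text{cycle}}$.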
Speedup
Speedup (IC = 3): using the diagram, we can calculate
If only given numbers:
Because the pipeline was filling in cycles 1-4, we don't exactly have
Speedup if
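The speedup calculation can be sketched numerically using the two cycle times given above. The stage count is not stated in these notes, so `stages=5` is an assumption (it is consistent with a pipeline that fills during cycles 1-4). The function names are illustrative, and the model assumes no stalls.

```python
T_SINGLE_NS = 0.8  # single-cycle clock period, from the notes
T_PIPE_NS = 0.2    # pipelined clock period, from the notes


def single_cycle_time(ic: int) -> float:
    """Execution time in ns on the single-cycle processor (CPI = 1)."""
    return ic * T_SINGLE_NS


def pipelined_time(ic: int, stages: int = 5) -> float:
    """Execution time in ns on the pipelined processor.

    Assumes (stages - 1) fill cycles, then one instruction completes per cycle.
    The default of 5 stages is an assumption, not a value given in the notes.
    """
    return (stages - 1 + ic) * T_PIPE_NS


def speedup(ic: int, stages: int = 5) -> float:
    """Ratio of single-cycle time to pipelined time for `ic` instructions."""
    return single_cycle_time(ic) / pipelined_time(ic, stages)


if __name__ == "__main__":
    for ic in (3, 100, 1_000_000):
        print(f"IC={ic}: speedup={speedup(ic):.3f}")
```

Under these assumptions, the speedup for IC = 3 is 2.4 ns / 1.4 ns, about 1.71, because the fill cycles are a large share of the total. As IC grows, the fill cost is amortized and the speedup approaches the ratio of the cycle times, 0.8 / 0.2 = 4.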