A technique to improve CPU performance. The execution of multiple instructions is overlapped, instead of processing each instruction sequentially from start to finish.
The CPU divides instruction execution into several stages. Each stage is handled by a different part of the processor, allowing multiple instructions to be in different stages of execution at the same time.
Increases instruction throughput. Slightly increases the execution time of each instruction due to increased overhead (caused by pipeline register delays, clock skew). Introduces challenges (hazards) which must be managed to maintain correct program execution.
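The throughput gain can be illustrated with a small cycle count. This is a minimal sketch under simplifying assumptions that are not stated in the notes above: every stage takes exactly one clock cycle, there are no hazards, and the function names are made up for illustration.

```python
def sequential_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction finishes before the next one starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when a new instruction enters the pipeline every cycle.

    The first instruction needs num_stages cycles to pass through the pipeline.
    Every later instruction completes one cycle after its predecessor.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1)


# Hypothetical workload: 100 instructions on a 5-stage pipeline.
print(sequential_cycles(100, 5))  # 500
print(pipelined_cycles(100, 5))   # 104
```

Note that a single instruction still needs all 5 cycles to complete. Pipelining improves how many instructions finish per unit time, not how quickly any one instruction finishes.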
Terminology
Clock skew
The difference in arrival time of the clock signal at different parts of the circuit.
Caused by unequal wire lengths, capacitance, and propagation delays. Can cause timing errors in the pipeline, such as a stage latching its input too early or too late.
Pipe
The sequence of stages through which instructions pass during execution in a pipelined processor.
Instructions move from one stage to the next in a streamlined fashion. The pipe enables multiple instructions to be processed simultaneously, with each instruction at a different stage, thereby increasing overall throughput and efficiency.
Pipe stage
Also known as a pipe segment. A specific part of the pipeline that performs a particular function.
Depth of pipeline
The number of stages in the pipeline.
Balanced pipeline
A pipeline setup where all stages have the same duration.
Throughput
Average number of instructions coming out of the pipe per unit time.
Processor cycle
Time required to move an instruction one step down the pipeline. Determined by the slowest stage.
The clock cycle cannot be shorter than the sum of clock skew and latch overhead.
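The relationship between stage delays, overhead, and cycle time can be sketched as follows. The model used here, where the cycle time is the slowest stage delay plus latch overhead plus clock skew, is a common simplification. The function name and all delay values are hypothetical, and times are given in whole picoseconds to keep the arithmetic exact.

```python
def cycle_time_ps(stage_delays_ps: list[int], latch_overhead_ps: int, clock_skew_ps: int) -> int:
    """Estimate the processor cycle time in picoseconds.

    Every stage must fit in one cycle, so the slowest stage sets the base
    duration. Latch overhead and clock skew are added on top (assumed model).
    """
    return max(stage_delays_ps) + latch_overhead_ps + clock_skew_ps


# Hypothetical, unbalanced 5-stage pipeline.
stages = [200, 250, 300, 250, 200]
print(cycle_time_ps(stages, latch_overhead_ps=20, clock_skew_ps=10))  # 330
```

Even if every stage were made arbitrarily fast, the cycle time in this model could not drop below the 30 ps of overhead.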
Time per instruction
Denoted by TPI. For an ideal, balanced pipeline:
$$
TPI_{\text{pipelined}} = \frac{TPI_{\text{unpipelined}}}{n}
$$
Here $n$ is the number of pipeline stages.
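As a rough numeric illustration, the sketch below assumes the usual ideal relation, in which the pipelined time per instruction equals the unpipelined time per instruction divided by the number of stages. It ignores overhead and hazards, and the function name and figures are hypothetical.

```python
def ideal_pipelined_tpi(unpipelined_tpi_ps: float, num_stages: int) -> float:
    """Ideal time per instruction for a balanced pipeline with no overhead."""
    if num_stages < 1:
        raise ValueError("a pipeline needs at least one stage")
    return unpipelined_tpi_ps / num_stages


# Hypothetical: an instruction takes 1000 ps unpipelined, split over 5 stages.
print(ideal_pipelined_tpi(1000, 5))  # 200.0
```

In practice the achieved value is higher than this ideal, because pipeline register delays and clock skew add to every cycle.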
Speedup
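For a balanced pipeline:
$$
\text{Speedup} = \frac{TPI_{\text{unpipelined}}}{TPI_{\text{pipelined}}} = n
$$
where $n$ is the number of pipeline stages.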
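The sketch below computes speedup as the ratio of unpipelined to pipelined time per instruction, for a balanced pipeline. The per-cycle overhead term is an added assumption to show why real speedup falls short of the stage count. All names and numbers are hypothetical.

```python
def speedup(stage_delay_ps: int, num_stages: int, overhead_ps: int = 0) -> float:
    """Speedup of a balanced pipeline over an unpipelined design.

    Unpipelined time per instruction: all stages back to back.
    Pipelined time per instruction: one stage delay plus per-cycle overhead.
    """
    unpipelined_tpi = stage_delay_ps * num_stages
    pipelined_tpi = stage_delay_ps + overhead_ps
    return unpipelined_tpi / pipelined_tpi


print(speedup(200, 5))                  # 5.0, equal to the number of stages
print(speedup(200, 5, overhead_ps=50))  # 4.0, overhead reduces the gain
```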
Pipeline register
A register placed between two stages of a CPU pipeline.
Pipeline Stall
A delay introduced when an instruction cannot proceed because of a hazard. One stalled instruction causes all instructions after it to stall. Instructions issued earlier than the stalled one must continue, so that the stall can clear. No new instructions are fetched during the stall. Causes performance degradation.
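The knock-on effect of a stall can be sketched with a simple timing model. It assumes an in-order pipeline with one cycle per stage and one instruction issued per cycle. The function name and the stall counts are hypothetical.

```python
def completion_cycles(stalls_per_instruction: list[int], num_stages: int) -> list[int]:
    """Return the cycle in which each instruction leaves the pipeline.

    stalls_per_instruction[i] is the number of stall cycles instruction i
    suffers. A stall delays that instruction and every instruction after it,
    but not the instructions issued before it.
    """
    finish = []
    accumulated_stalls = 0
    for index, stalls in enumerate(stalls_per_instruction):
        accumulated_stalls += stalls
        finish.append(num_stages + index + accumulated_stalls)
    return finish


# Four instructions on a 5-stage pipeline; the third one stalls for 2 cycles.
print(completion_cycles([0, 0, 0, 0], 5))  # [5, 6, 7, 8]
print(completion_cycles([0, 0, 2, 0], 5))  # [5, 6, 9, 10]
```

The first two instructions finish on time in both runs, while the stalled instruction and the one behind it both finish 2 cycles late.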
Branch Penalty
Number of clock cycles wasted due to a branch instruction.
A branch instruction can stall the pipeline, because the CPU has to either:
- Wait until the branch decision is resolved before fetching the next instruction.
- Predict the branch outcome and fetch the next instruction based on the prediction.
If the prediction is correct, the pipeline proceeds without any issues. If it is incorrect, the wrongly fetched instructions must be flushed and the pipeline has to restart from the correct instruction. This wastes several clock cycles.
In either case, the number of lost cycles is called the branch penalty.
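The two strategies can be compared with a simple count of lost cycles. This is a sketch under assumptions not taken from the notes: waiting costs the full penalty on every branch, a correct prediction costs nothing, a misprediction costs the same penalty, and the workload figures are invented for illustration.

```python
def lost_cycles_waiting(num_branches: int, penalty_cycles: int) -> int:
    """Cycles lost when the pipeline always waits for the branch to resolve."""
    return num_branches * penalty_cycles


def lost_cycles_predicting(num_mispredictions: int, penalty_cycles: int) -> int:
    """Cycles lost when only mispredicted branches pay the penalty."""
    return num_mispredictions * penalty_cycles


# Hypothetical workload: 200 branches, branch penalty of 3 cycles,
# and a predictor that gets 20 of the 200 branches wrong.
print(lost_cycles_waiting(200, 3))    # 600
print(lost_cycles_predicting(20, 3))  # 60
```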