Skip to content
Sahithyan's S3
1
Sahithyan's S3 — Computer Architecture

Pipelining

A technique to improve CPU performance. Execution of multiple instructions are overlapped the execution of multiple instructions, instead of processing each instruction sequentially from start to finish.

The CPU divides instruction execution into several stages. Each stage is handled by a different part of the processor, allowing multiple instructions to be in different stages of execution at the same time.

Increases instruction throughput. Slightly increases the execution time of each instruction due to increased overhead (caused by pipeline register delays, clock skew). Introduces challenges (hazards) which must be managed to maintain correct program execution.

The difference in arrival time on the clock signal at different parts of the circuit.

Caused by unequal wire lengths, capacitance, and propagation delays. Causes pipeline errors.

The sequence of stages through which instructions pass during execution in a pipelined processor.

Instructions move from one stage to the next in a streamlined fashion. The pipe enables multiple instructions to be processed simultaneously, with each instruction at a different stage, thereby increasing overall throughput and efficiency.

Aka. pipe segment. A specific part of the pipeline that performs a particular function.

The number of stages in the pipeline.

A pipeline setup where all stages have the same duration.

Average number of instructions coming out of the pipe per unit time.

Time required between moving an instruction one step down the pipeline. Depends on slowest stage.

Clock cycle cannot be smaller than the sum of clock skew and latch overhead.

Denoted by TPI.

TPIpipelined=TPIunpipelinedN\text{TPI}_{\text{pipelined}} = \frac{\text{TPI}_{\text{unpipelined}}}{\text{N}}

Here NN is the number of pipeline stages.

Speedup=TPIunpipelinedTPIpipelined\text{Speedup} = \frac{\text{TPI}_{\text{unpipelined}}}{\text{TPI}_{\text{pipelined}}}

For a balanced pipeline:

Speedup=Number of pipeline stages\text{Speedup} = \text{Number of pipeline stages}

A register placed between two stages of a CPU pipeline.

When an instruction need delaying during a hazard. One stalled instruction causes all instructions after it to stall. Instructions issued earlier than the stalled one must continue to clear the stall. No new instructions fetched during stall. Causes performance degradation.

Number of wasted clock cycles wasted due to a branch instruction.

When pipeline is stalled because of a branch instruction. The CPU has to either:

  • Wait until the branch decision is resolved before fetching the next instruction.
  • Predict the branch outcome and fetch the next instruction based on the prediction.

If the prediction is correct, the pipeline proceeds without any issues. If it’s incorrect, the incorrect instruction must be flushed. The pipeline has to start again with the correct instruction. This wastes several clock cycles.

In either cases, the number of lost cycles is called the branch penalty.