Sahithyan's S3 — Computer Architecture

Instruction Level Parallelism

The ability of a processor to execute multiple independent instructions simultaneously rather than strictly one after another; the degree to which the instructions of a program can be overlapped in execution. The goal is to minimize CPI (equivalently, to maximize instructions per cycle).
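As a rough illustration of why lowering CPI matters, here is a sketch using the classic performance equation; the instruction count, CPI values, and clock rate are made-up numbers for illustration only:

```python
def exec_time(instructions, cpi, clock_hz):
    """Classic performance equation: time = IC x CPI / f."""
    return instructions * cpi / clock_hz

# Hypothetical program of 1M instructions on a 1 GHz clock.
base = exec_time(1_000_000, 1.0, 1e9)  # scalar pipeline, CPI = 1.0
ilp  = exec_time(1_000_000, 0.5, 1e9)  # 2-wide issue exploiting ILP, CPI = 0.5
print(base / ilp)  # speedup from halving CPI -> 2.0
```

Halving CPI at the same clock rate halves execution time, which is exactly what overlapping independent instructions buys.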

ILP comes from pipelining and multiple execution units.

Two approaches to exploiting ILP:

  • Hardware-based dynamic approaches: used in server and desktop processors.
  • Compiler-based static approaches: common in scientific applications; less successful outside that domain.

ILP is limited by data, name and control dependences.

Data dependence (true dependence): occurs when instruction j uses a result produced by instruction i. Data dependences are transitive. Memory-based dependences are harder to detect than register-based ones.

Data dependences cause RAW (read-after-write) hazards, restrict instruction reordering, and limit the maximum achievable ILP.
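A minimal sketch of finding RAW dependences, assuming each instruction is reduced to a destination register and a list of source registers (this toy format is an assumption for illustration, not any real ISA):

```python
def raw_dependences(prog):
    """Return (producer, consumer) index pairs where a later
    instruction reads a register written by an earlier one."""
    deps = []
    for i, (dest_i, _) in enumerate(prog):
        for j in range(i + 1, len(prog)):
            dest_j, srcs_j = prog[j]
            if dest_i in srcs_j:
                deps.append((i, j))
            if dest_j == dest_i:
                break  # dest_i is overwritten; later reads see the new value
    return deps

prog = [
    ("r1", ["r2", "r3"]),   # i0: r1 = r2 + r3
    ("r4", ["r1"]),         # i1: r4 = f(r1)   RAW on r1 with i0
    ("r1", ["r5"]),         # i2: r1 = r5      kills i0's value
    ("r6", ["r1"]),         # i3: r6 = g(r1)   RAW on r1 with i2
]
print(raw_dependences(prog))  # [(0, 1), (2, 3)]
```

Any pair in this list cannot be reordered, which is the sense in which data dependences bound the available ILP.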

Name dependence: two instructions use the same register or memory name, but there is no actual data flow between them.

Types:

  • Anti-dependence (WAR): instruction j writes a name that instruction i reads.
  • Output dependence (WAW): both instructions write the same name.

The solution is register renaming.

Control dependence: an instruction’s execution depends on the outcome of a branch, so it cannot freely be moved across that branch.

Register renaming: replacing architecturally named registers with temporary (physical) ones to avoid name dependences (anti-dependence and output dependence). It removes WAR and WAW hazards.

Can be done dynamically or statically.

Use a physical register file:

  • Many more physical registers than architectural ones.
  • Map table updated on commit.
  • Old physical registers freed later.
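The steps above can be sketched as a toy rename stage; the physical-register names (`p0`, `p1`, …) and sizes are illustrative, not a real design:

```python
class Renamer:
    def __init__(self, arch_regs, num_phys):
        # Initial map: each architectural register gets its own physical one;
        # the rest of the (larger) physical file starts on the free list.
        self.map = {r: f"p{i}" for i, r in enumerate(arch_regs)}
        self.free = [f"p{i}" for i in range(len(arch_regs), num_phys)]

    def rename(self, dest, srcs):
        # Sources read the CURRENT mapping, preserving true RAW dependences.
        new_srcs = [self.map[s] for s in srcs]
        # The destination gets a FRESH physical register, so earlier readers
        # and writers of `dest` are untouched: WAR and WAW hazards vanish.
        # (A full design would remember the old mapping and free its
        # physical register at commit.)
        new_dest = self.free.pop(0)
        self.map[dest] = new_dest
        return new_dest, new_srcs

r = Renamer(["r1", "r2"], num_phys=6)
print(r.rename("r1", ["r2"]))  # r1 = f(r2) -> ('p2', ['p1'])
print(r.rename("r2", ["r1"]))  # r2 = g(r1) -> ('p3', ['p2'])  WAR on r2 gone
print(r.rename("r1", ["r2"]))  # r1 = h(r2) -> ('p4', ['p3'])  WAW on r1 gone
```

After renaming, the second and third writes target different physical registers, so only the true data flow (through `p2` and `p3`) constrains the schedule.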

Multiple-issue processors: a computer architecture designed to achieve a CPI of less than 1, i.e. more than one instruction completed per clock cycle.

VLIW (Very Long Instruction Word): the compiler schedules instructions statically. One long instruction encodes many operations, giving high throughput.

But:

  • Only useful if there is enough ILP in the code to fill the available slots.
  • Parallelism is difficult to find statically.
  • Code size grows.
  • No hazard detection hardware.
  • Poor binary compatibility.
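A toy sketch of the packing problem: a greedy scheduler fills fixed-width bundles with operations whose producers already sit in earlier bundles. The 3-slot width and the `(dest, sources)` instruction format are assumptions for illustration, and the code is assumed to be already renamed, so only true dependences matter:

```python
def pack_bundles(prog, width=3):
    """Greedily pack instructions (dest, [sources]) into bundles of
    up to `width` slots, respecting RAW dependences between bundles."""
    done = set()   # indices scheduled in earlier bundles
    bundles = []
    while len(done) < len(prog):
        # Ready = not yet scheduled, and every earlier writer of a
        # source register is already in a previous bundle.
        ready = [i for i, (_, srcs) in enumerate(prog)
                 if i not in done
                 and all(j in done for j in range(i) if prog[j][0] in srcs)]
        bundle = ready[:width]
        bundles.append(bundle)
        done.update(bundle)
    return bundles

prog = [
    ("r1", ["r9"]),        # i0
    ("r2", ["r9"]),        # i1: independent of i0
    ("r3", ["r1", "r2"]),  # i2: needs i0 and i1
    ("r4", ["r9"]),        # i3: independent
]
print(pack_bundles(prog, width=3))  # [[0, 1, 3], [2]]
```

Note the second bundle holds a single operation: when the code lacks independent work, the remaining slots go to waste, which is exactly the first drawback listed above.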

Superscalar: the hardware schedules instructions dynamically, with speculation.