Sahithyan's S3 — Computer Architecture

Introduction to Computer Architecture

Computer architecture is the design and organization of a computer system. It involves the conceptual structure and functional behavior of a computer system at a level that can be used to understand how the computer works.

CPU designs can be categorized into two types based on cycles per instruction (CPI): single-cycle and multi-cycle.

In a single-cycle CPU, each instruction executes completely in one clock cycle. The slowest instruction defines the clock cycle's duration, so simpler instructions waste time: they finish early but must wait out the full cycle. Not scalable, as introducing more complex instructions lengthens the clock cycle.

In a multi-cycle CPU, different instructions take different numbers of clock cycles to execute. Instructions are split into multiple stages, each stage taking one clock cycle.

The slowest stage across all instructions defines the clock cycle's duration. Less time is wasted waiting on simple instructions than in a single-cycle CPU.
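The timing trade-off can be made concrete with a small sketch. The stage latencies below are made-up numbers for illustration, not values from these notes:

```python
# Hypothetical stage latencies in nanoseconds (invented for illustration).
STAGES = {
    "add":  [2, 2, 2],        # fetch, decode, execute
    "load": [2, 2, 2, 2, 2],  # fetch, decode, execute, memory, write-back
}

# Single-cycle: the clock period must fit the slowest instruction end to end.
single_cycle_period = max(sum(lat) for lat in STAGES.values())  # 10 ns

# Multi-cycle: the clock period must only fit the slowest single stage.
multi_cycle_period = max(max(lat) for lat in STAGES.values())   # 2 ns

def exec_time_single(program):
    """Every instruction costs one full (long) clock cycle."""
    return len(program) * single_cycle_period

def exec_time_multi(program):
    """Each instruction costs one (short) cycle per stage."""
    return sum(len(STAGES[op]) * multi_cycle_period for op in program)

program = ["add", "add", "load", "add"]
print(exec_time_single(program))  # 40 ns: every add pays load's full latency
print(exec_time_multi(program))   # 28 ns: each add finishes in 3 short cycles
```

The adds no longer wait out the load's memory stage, which is the wasted-time argument above in numbers.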

Also allows more complex instructions to be added, and the stages of different instructions to execute in parallel (pipelining).

A microprocessor is a CPU on a single chip. Compared to a microcontroller, it has a larger instruction set and more registers.

A microcontroller is a small computer on a single chip. It includes a CPU, ROM, SRAM, and other peripherals. It is self-contained, consumes less power, and has a small amount of memory.

System architecture refers to the structural design of how a computer's major components (CPU(s), memory, I/O devices) are organized and interact.

A uniprocessor system has a single CPU; only one instruction stream runs at any instant. Concurrency is provided through multiprogramming and preemptive scheduling. The most common design.

Multiple physical CPUs sharing memory and I/O. All processors work under one OS image.

Increases throughput and reliability. One CPU can fail without stopping the system.

In symmetric multiprocessing, all processors have the same responsibilities and power, and share the same memory and I/O resources. Scalable. Common in modern OS design.

In asymmetric multiprocessing, different types of processors, each designed for specific tasks, are used. This improves performance but is less scalable.

Multiple cores inside a single chip. A core is an independent execution engine: it has its own ALU, pipelines, and registers, and can run one instruction stream at a time. Cores share some resources, such as caches and buses.

Gives parallelism without multiple physical chips. Faster. Less power consumption.

A multicore system where a core contains multiple hardware threads (aka logical processors). A hardware thread is a virtual execution slot inside a core: it shares the core's execution units but keeps a separate register set. The core switches to another hardware thread when a memory stall occurs. Not to be confused with OS-level threads.
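A toy model of that stall-hiding behaviour, with invented timings (memory operations stall a thread for 3 cycles, ALU operations take 1):

```python
def run_smt(threads, stall=3):
    """Simulate one core issuing from the first ready hardware thread.

    Each thread is a list of ops: "mem" stalls the thread for `stall`
    cycles, "alu" completes in one cycle. On a stall the core switches
    to another ready thread instead of sitting idle.
    """
    ready_at = [0] * len(threads)  # cycle at which each thread can issue again
    pcs = [0] * len(threads)       # next op index per thread
    cycle = 0
    while any(pc < len(t) for pc, t in zip(pcs, threads)):
        for i, t in enumerate(threads):
            if pcs[i] < len(t) and ready_at[i] <= cycle:
                op = t[pcs[i]]
                pcs[i] += 1
                ready_at[i] = cycle + (stall if op == "mem" else 1)
                break              # one issue slot per cycle
        cycle += 1
    return cycle

t0, t1 = ["alu", "mem", "alu"], ["alu", "alu", "alu"]
smt_cycles = run_smt([t0, t1])                 # threads share the core
serial_cycles = run_smt([t0]) + run_smt([t1])  # one thread at a time
print(smt_cycles, serial_cycles)  # 6 8: thread 1 fills thread 0's stall
```

The shared core finishes both streams in fewer cycles because the memory stall in one thread is filled with useful work from the other.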

NUMA stands for Non-Uniform Memory Access. A multiprocessor layout where memory is split into local regions per core. OS must be NUMA-aware to place threads and data close together.

Memory access cost depends on which CPU accesses it. Access to local memory is fast; access to remote memory is slower.
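The local-vs-remote cost difference can be sketched with a toy model; the latencies here are invented for illustration, not real hardware numbers:

```python
# Two NUMA nodes; accessing local memory is cheaper than remote.
LOCAL_NS, REMOTE_NS = 100, 300  # hypothetical latencies per access

def access_cost(cpu_node, page_node, accesses):
    """Total latency for `accesses` memory reads from cpu_node to page_node."""
    latency = LOCAL_NS if cpu_node == page_node else REMOTE_NS
    return accesses * latency

# A NUMA-aware OS places a thread's data on the thread's own node:
good = access_cost(cpu_node=0, page_node=0, accesses=1_000)  # 100_000 ns
bad  = access_cost(cpu_node=0, page_node=1, accesses=1_000)  # 300_000 ns
```

This is why placement matters: the same workload pays three times the latency in this model when its pages live on the remote node.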

Multiple separate computers connected by a high-speed network and managed to act as a single service. Used for high availability (failover) or high performance (parallel computation). Each node runs its own OS, but clustering software coordinates them.

Multiple systems working together for high availability or high performance.
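A minimal sketch of the failover idea, with made-up node names and a stubbed health check standing in for real clustering software:

```python
# Each node runs its own OS; the cluster layer only tracks node health
# and routes work to the first healthy node (failover on failure).
def pick_active(nodes, is_healthy):
    """Return the first healthy node, or None if the whole cluster is down."""
    for node in nodes:
        if is_healthy(node):
            return node
    return None

nodes = ["node-a", "node-b", "node-c"]
down = {"node-a"}  # simulate a failed primary
active = pick_active(nodes, lambda n: n not in down)
print(active)  # node-b: the next node takes over
```

Real cluster managers add heartbeats, quorum, and state handover, but the core routing decision reduces to this.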