Sahithyan's S3 — Computer Architecture

RISC-V Architecture

An open-standard ISA based on established RISC principles. Open source. Modular by design. The base ISA is denoted RV32I or RV64I, where the number represents the bit width of the registers and of addresses. Only the 32-bit version is discussed in this module.

  • Modularity
    RISC-V has a small base integer instruction set (RV32I/RV64I) with optional extensions.
  • Simplicity
    Clean design with fixed-length 32-bit instructions (in the base encoding).
  • Scalability
    Supports various implementation sizes from embedded microcontrollers to high-performance computing.

RISC-V uses a load-store architecture where only a small set of load and store instructions can access memory. All other operations work directly on the registers.

The base RISC-V architecture is designed to be minimal and easy to implement. As not all implementations require integer multiplication, division and other functionality, they are extracted into extensions.

| Extension | Description | Instruction count |
| --- | --- | --- |
| I | Base | 51 |
| M | Integer multiplication and division | 13 |
| A | Atomic operations | 22 |
| F | Single-precision floating-point | 30 |
| D | Double-precision floating-point | 32 |
| C | Compressed instructions (16-bit) | 36 |
| G | Same as MAFD | 97 |
| B | Bit manipulation | ? |
| N | User level interrupts | ? |

Proprietary extensions can also be developed.

Even though multiplication and division are simple from a mathematical point of view, they require more silicon area and therefore more power. That’s why they are split out into the M extension.

The misa register can be used to query which extensions are implemented in the processor.
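As a rough illustration of how such a query could be interpreted, the sketch below decodes a raw misa value. It assumes the standard layout, in which bits 0 to 25 stand for the letters A to Z and the top two bits (MXL) give the base width. The function name `decode_misa` and the example value are made up for this note; reading the actual CSR requires machine-mode code and is not shown.

```python
def decode_misa(misa: int, xlen: int = 32) -> tuple[int, str]:
    """Return (MXL field, extension letters) for a raw misa value."""
    # The top two bits hold MXL: 1 means a 32-bit base, 2 means 64-bit.
    mxl = (misa >> (xlen - 2)) & 0b11
    # Bit 0 is 'A', bit 1 is 'B', ... bit 25 is 'Z'.
    letters = "".join(
        chr(ord("A") + bit) for bit in range(26) if (misa >> bit) & 1
    )
    return mxl, letters


# Hypothetical value: MXL = 1 with the bits for A, C, I and M set.
example = (1 << 30) | (1 << 0) | (1 << 2) | (1 << 8) | (1 << 12)
print(decode_misa(example))
```

A core that reports the letters I and M, for instance, implements the base integer set plus multiplication and division.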

RV32I has 32 integer registers, numbered from x0 to x31. Each one also has an assigned ABI name.

| Register | ABI name | Description |
| --- | --- | --- |
| x0 | zero | Hardwired to 0 |
| x1 | ra | Return address |
| x2 | sp | Stack pointer |
| x3 | gp | Global pointer |
| x4 | tp | Thread pointer |
| x5-x7 | t0-t2 | Temporaries |
| x8 | s0/fp | Saved register/frame pointer |
| x9 | s1 | Saved register |
| x10-x11 | a0-a1 | Function arguments/return values |
| x12-x17 | a2-a7 | Function arguments |
| x18-x27 | s2-s11 | Saved registers |
| x28-x31 | t3-t6 | Temporaries |

All instructions are 32 bits wide. Each instruction is encoded in one of six formats: R, I, S, B, U or J.

Instruction types

R-type: register-register operations. Each instruction names 3 registers: two sources and one destination.

The instruction format:

| Section | Width (bits) | Bit range | Description |
| --- | --- | --- | --- |
| funct7 | 7 | [31:25] | Operation subtype (e.g., ADD vs SUB) |
| rs2 | 5 | [24:20] | Second source register |
| rs1 | 5 | [19:15] | First source register |
| funct3 | 3 | [14:12] | Instruction variant |
| rd | 5 | [11:7] | Destination register |
| opcode | 7 | [6:0] | Operation code |
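The R-type layout above can be sketched as plain shifts and masks. The helper names `encode_r` and `decode_r` are invented for this example; the field values used for ADD and SUB (opcode 0110011, funct7 0000000 and 0100000) are the standard RV32I encodings.

```python
def encode_r(funct7, rs2, rs1, funct3, rd, opcode):
    """Pack the six R-type fields into one 32-bit word."""
    return (
        (funct7 << 25) | (rs2 << 20) | (rs1 << 15)
        | (funct3 << 12) | (rd << 7) | opcode
    )


def decode_r(word):
    """Slice a 32-bit word back into its R-type fields."""
    return {
        "funct7": (word >> 25) & 0x7F,
        "rs2": (word >> 20) & 0x1F,
        "rs1": (word >> 15) & 0x1F,
        "funct3": (word >> 12) & 0x07,
        "rd": (word >> 7) & 0x1F,
        "opcode": word & 0x7F,
    }


OP = 0b0110011  # opcode shared by the register-register ALU instructions

add_word = encode_r(0b0000000, 2, 1, 0b000, 3, OP)  # add x3, x1, x2
sub_word = encode_r(0b0100000, 2, 1, 0b000, 3, OP)  # sub x3, x1, x2

print(f"{add_word:#010x} {sub_word:#010x}")
```

The two words differ only in funct7, which is how ADD and SUB share one opcode and one funct3 value.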

I-type: immediate operations. Each instruction names 2 registers (one source, one destination) and an immediate value. The immediate is 12 bits wide and sign-extended.

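| Section | Width (bits) | Bit range | Description |
| --- | --- | --- | --- |
| imm[11:0] | 12 | [31:20] | Immediate value (signed) |
| rs1 | 5 | [19:15] | Source register |
| funct3 | 3 | [14:12] | Instruction variant |
| rd | 5 | [11:7] | Destination register |
| opcode | 7 | [6:0] | Operation code |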
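A minimal sketch of the I-type layout and of 12-bit sign extension follows. The helper names are illustrative only; the example instruction `addi x1, x0, -1` uses the standard ADDI encoding (opcode 0010011, funct3 000).

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def encode_i(imm, rs1, funct3, rd, opcode):
    """Pack I-type fields; imm must fit in 12 signed bits."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 bits")
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def decode_i_imm(word):
    """Recover the signed immediate from bits [31:20]."""
    return sign_extend(word >> 20, 12)


word = encode_i(-1, 0, 0b000, 1, 0b0010011)  # addi x1, x0, -1
print(f"{word:#010x}", decode_i_imm(word))
```

The 12 bits give a range of -2048 to 2047, which is why larger constants need a second instruction.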

S-type: store operations, from register to memory.

The instruction format:

| Section | Width (bits) | Bit range | Description |
| --- | --- | --- | --- |
| imm[11:5] | 7 | [31:25] | Offset bits [11:5] |
| rs2 | 5 | [24:20] | Register containing the value to store |
| rs1 | 5 | [19:15] | Register containing the base address |
| funct3 | 3 | [14:12] | Instruction variant |
| imm[4:0] | 5 | [11:7] | Offset bits [4:0] |
| opcode | 7 | [6:0] | Operation code |

The immediate is sign-extended and used as the offset from the base address in rs1 to the target memory address. It is split into 2 parts so that rs1, rs2 and funct3 stay in the same bit positions as in the other formats.

As both the instruction and the address are 32 bits wide, a full target address cannot fit in 1 instruction. That is why the address is formed as base register plus offset. This also suits accesses to nearby data, such as local variables at small offsets from the stack pointer.
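The split offset can be illustrated with a short sketch. The helper names are hypothetical; the example `sw x2, 8(x1)` uses the standard store opcode 0100011 and funct3 010 for a word store.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def encode_s(offset, rs2, rs1, funct3, opcode=0b0100011):
    """Pack S-type fields, splitting the 12-bit offset into two pieces."""
    if not -2048 <= offset <= 2047:
        raise ValueError("offset does not fit in 12 bits")
    imm = offset & 0xFFF
    return (
        ((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15)
        | (funct3 << 12) | ((imm & 0x1F) << 7) | opcode
    )


def decode_s_offset(word):
    """Rejoin imm[11:5] and imm[4:0], then sign-extend."""
    imm = (((word >> 25) & 0x7F) << 5) | ((word >> 7) & 0x1F)
    return sign_extend(imm, 12)


word = encode_s(8, rs2=2, rs1=1, funct3=0b010)  # sw x2, 8(x1)
print(f"{word:#010x}", decode_s_offset(word))
```

Note that rs1 and rs2 sit at the same shifts (15 and 20) as in the R-type format; only the immediate is scattered.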

B-type: conditional branch operations. Similar to jumps, but taken only when a condition holds and limited to a short range.

| Section | Width (bits) | Bit range | Description |
| --- | --- | --- | --- |
| imm[12] | 1 | [31] | MSB of offset (sign bit) |
| imm[10:5] | 6 | [30:25] | Offset bits [10:5] |
| rs2 | 5 | [24:20] | Second source register |
| rs1 | 5 | [19:15] | First source register |
| funct3 | 3 | [14:12] | Instruction variant |
| imm[4:1] | 4 | [11:8] | Offset bits [4:1] |
| imm[11] | 1 | [7] | Offset bit [11] |
| opcode | 7 | [6:0] | Operation code |

Memory is byte-addressable. Instructions can only start at addresses that are multiples of 4 (anything else would make fetching more complex and slower). As branch targets must point to instructions, the target address must be a multiple of 4, which means its 2 LSBs are always 0. When compressed instructions are used, the target address only has to be a multiple of 2.

The exact branch range, as a byte offset, is:

$$\left[-2^{12},\,2^{12} - 2\right]$$

To extend the branch range, the LSB of the offset, which is always 0, is not stored in the instruction. When the instruction executes, the stored offset bits are shifted left by 1 bit to rebuild the byte offset.
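The scattered B-type offset and its range can be sketched as follows. The helper names are invented for illustration; the example `beq x1, x2, -4` uses the standard branch opcode 1100011 with funct3 000.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def encode_b(offset, rs2, rs1, funct3, opcode=0b1100011):
    """Pack B-type fields; bit 0 of the offset is implied and never stored."""
    if offset % 2 != 0:
        raise ValueError("branch offset must be even")
    if not -(2 ** 12) <= offset <= 2 ** 12 - 2:
        raise ValueError("branch offset out of range")
    imm = offset & 0x1FFF
    return (
        (((imm >> 12) & 0x1) << 31) | (((imm >> 5) & 0x3F) << 25)
        | (rs2 << 20) | (rs1 << 15) | (funct3 << 12)
        | (((imm >> 1) & 0xF) << 8) | (((imm >> 11) & 0x1) << 7) | opcode
    )


def decode_b_offset(word):
    """Reassemble the 13-bit byte offset (bit 0 is always 0)."""
    imm = (
        (((word >> 31) & 0x1) << 12) | (((word >> 7) & 0x1) << 11)
        | (((word >> 25) & 0x3F) << 5) | (((word >> 8) & 0xF) << 1)
    )
    return sign_extend(imm, 13)


word = encode_b(-4, rs2=2, rs1=1, funct3=0b000)  # beq x1, x2, -4
print(f"{word:#010x}", decode_b_offset(word))
```

Twelve stored bits plus the implied zero give a 13-bit signed offset, which is where the bounds of the range come from.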

U-type: upper immediate operations.

The instruction format:

| Section | Width (bits) | Bit range | Description |
| --- | --- | --- | --- |
| imm[31:12] | 20 | [31:12] | 20-bit immediate value |
| rd | 5 | [11:7] | Destination register |
| opcode | 7 | [6:0] | Operation code |

U-type instructions handle large (20-bit) immediate values. LUI (Load Upper Immediate) sets the upper 20 bits of a register and clears the lower 12. AUIPC (Add Upper Immediate to PC) adds the immediate, placed in the upper 20 bits, to the PC for PC-relative addressing. These instructions enable efficient address calculation and large constant loading.
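One common use of LUI is building a full 32-bit constant together with a following ADDI. Because ADDI sign-extends its 12-bit immediate, the upper part has to be adjusted whenever bit 11 of the constant is set. The sketch below shows that arithmetic; `split_constant` and `rebuild` are names made up for this note, not part of any toolchain.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def split_constant(value):
    """Split a 32-bit value into (upper20, lower12) for LUI followed by ADDI."""
    value &= 0xFFFFFFFF
    lower = sign_extend(value, 12)             # what ADDI will add (may be negative)
    upper = ((value - lower) >> 12) & 0xFFFFF  # what LUI must load to compensate
    return upper, lower


def rebuild(upper, lower):
    """Model LUI (upper << 12) followed by ADDI (add the signed lower part)."""
    return ((upper << 12) + lower) & 0xFFFFFFFF


upper, lower = split_constant(0xDEADBEEF)
print(f"lui {upper:#x}, addi {lower}", hex(rebuild(upper, lower)))
```

For 0xDEADBEEF the lower part is negative, so the upper part comes out one higher than the top 20 bits of the constant.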

J-type: used for unconditional jump operations. Similar to branch operations, but for long range. The offset bits are stored in an unintuitive order to keep the hardware simple.

The instruction format:

| Section | Width (bits) | Bit range | Description |
| --- | --- | --- | --- |
| imm[20] | 1 | [31] | MSB of offset (sign bit) |
| imm[10:1] | 10 | [30:21] | Offset bits [10:1] |
| imm[11] | 1 | [20] | Offset bit [11] |
| imm[19:12] | 8 | [19:12] | Offset bits [19:12] |
| rd | 5 | [11:7] | Destination register |
| opcode | 7 | [6:0] | Operation code |

The 20 stored bits encode a 21-bit signed offset whose LSB is always 0, allowing jumps to targets further away than B-type branches. The exact range, as a byte offset, is:

$$\left[-2^{20},\,2^{20} - 2\right]$$

The immediate bits are arranged in a non-sequential order to simplify hardware implementation. The most significant bit is placed at bit 31 for efficient sign extension, and the remaining bits are organized to maximize compatibility with other instruction formats.

Like B-type instructions, the least significant bit of the target address is omitted as it must be 0 (instructions are aligned on even byte boundaries when using compressed instructions, or 4-byte boundaries otherwise). The jump offset is sign-extended and added to the PC to form the jump target address.
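The J-type bit shuffle can be sketched in the same way as the other formats. The helper names are illustrative; the example `jal x1, 8` uses the standard JAL opcode 1101111.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def encode_j(offset, rd, opcode=0b1101111):
    """Pack J-type fields; bit 0 of the offset is implied and never stored."""
    if offset % 2 != 0:
        raise ValueError("jump offset must be even")
    if not -(2 ** 20) <= offset <= 2 ** 20 - 2:
        raise ValueError("jump offset out of range")
    imm = offset & 0x1FFFFF
    return (
        (((imm >> 20) & 0x1) << 31) | (((imm >> 1) & 0x3FF) << 21)
        | (((imm >> 11) & 0x1) << 20) | (((imm >> 12) & 0xFF) << 12)
        | (rd << 7) | opcode
    )


def decode_j_offset(word):
    """Reassemble the 21-bit byte offset and sign-extend it."""
    imm = (
        (((word >> 31) & 0x1) << 20) | (((word >> 12) & 0xFF) << 12)
        | (((word >> 20) & 0x1) << 11) | (((word >> 21) & 0x3FF) << 1)
    )
    return sign_extend(imm, 21)


word = encode_j(8, rd=1)  # jal x1, 8
print(f"{word:#010x}", decode_j_offset(word))
```

As with branches, the sign bit lands in bit 31 of the instruction, so the same sign-extension wiring serves every format.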

RISC-V instruction formats are designed to share as much structure as possible.

  • Fixed size instructions
    This keeps fetch and decode simple. Hardware can slice instructions in constant widths, improving pipeline timing. (Compressed 16-bit instructions exist, but the base ISA remains fixed-width.)
  • Opcode is always in the lowest 7 bits ([6:0]).
  • When required, rs1, rs2 and rd are always placed at the same positions.
  • If rd is not required, a subset of imm is stored there.
  • When imm is required, its MSB is stored in the instruction’s MSB.
    Reason: to improve performance in sign extension
  • No implicit condition codes
    Instructions don’t set hidden flags (like x86’s CF, ZF). All comparisons produce explicit results → simpler out-of-order execution.
  • No privileged instructions in the base ISA
    User-level ISA is fully separated from privileged instructions (CSR, traps, paging). A clean split makes OS design clearer.
  • 32 registers, all general-purpose
    No “special” registers except x0 = hardwired zero. Simplifies compiler register allocation and reduces instruction count.
  • Load–store architecture
    Only loads and stores access memory. All arithmetic uses registers → easier pipeline, fewer hazards.
  • Simple memory addressing modes
    Only base + offset. No complex scaling or indirection. Hardware addressing logic is tiny and fast.

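In the classic 5-stage pipeline implementation discussed here, a RISC-V instruction takes at most 5 clock cycles. The stages are instruction fetch (IF), instruction decode (ID), execute (EX), memory access (MEM) and write back (WB).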

| Instruction type | Required clock cycles | Stages used |
| --- | --- | --- |
| B | 3 | IF, ID, EX |
| S | 4 | IF, ID, EX, MEM |
| R, I, U, J | 5 | IF, ID, EX, MEM, WB |
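The table above can be turned into a small lookup to estimate cycle counts. The sketch assumes the instructions run one after another with no overlap, which is a simplification chosen for this example. The names `STAGES` and `total_cycles` are hypothetical.

```python
# Stages used by each instruction type, copied from the table above.
STAGES = {
    "B": ["IF", "ID", "EX"],
    "S": ["IF", "ID", "EX", "MEM"],
    "R": ["IF", "ID", "EX", "MEM", "WB"],
    "I": ["IF", "ID", "EX", "MEM", "WB"],
    "U": ["IF", "ID", "EX", "MEM", "WB"],
    "J": ["IF", "ID", "EX", "MEM", "WB"],
}


def total_cycles(instruction_types):
    """Sum the per-instruction cycle counts, assuming no overlap between instructions."""
    return sum(len(STAGES[t]) for t in instruction_types)


# Hypothetical sequence: one I, one R, one S and one B instruction.
program = ["I", "R", "S", "B"]
print(total_cycles(program))
```

With pipelining the stages of consecutive instructions overlap, so the real elapsed time for a sequence would be lower than this sum.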

Instruction fetch (IF): the instruction pointed to by the PC is fetched from memory, and the PC is updated to the next sequential instruction by adding 4 bytes.

Instruction decode (ID): the following operations are performed in parallel:

  • Decode the instruction
  • Read the specified registers
    The equality test on the registers is done as they are read. The registers might turn out to be unused, but reading them doesn’t hurt performance. Power is wasted though. Power-sensitive designs might avoid this.
  • Offset is sign-extended
    The sign bit of the immediate is always in the same place, so sign-extension is straightforward. The possible branch target is computed by adding the sign-extended offset to the address of the branch instruction.

Decoding is done in parallel with reading registers. This is possible because the register specifiers are at fixed locations in a RISC architecture. This technique is known as fixed-field decoding.
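Fixed-field decoding can be illustrated by slicing the register fields out of a word without looking at the opcode first. The function name `read_fields` is invented for this sketch; the three example words were assembled by hand from the R, S and B layouts using the standard RV32I opcodes.

```python
def read_fields(word):
    """Slice the register specifiers at their fixed positions, format unknown."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,   # holds immediate bits in S and B formats
        "rs1": (word >> 15) & 0x1F,
        "rs2": (word >> 20) & 0x1F,
    }


examples = {
    "add x3, x1, x2": 0x002081B3,  # R-type
    "sw x2, 8(x1)": 0x0020A423,    # S-type
    "beq x1, x2, -4": 0xFE208EE3,  # B-type
}

for text, word in examples.items():
    fields = read_fields(word)
    print(text, "-> rs1 =", fields["rs1"], "rs2 =", fields["rs2"])
```

All three words yield rs1 = 1 and rs2 = 2 from the same shifts, so the register file can be read while the opcode is still being decoded.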

Execute (EX): only one of the following tasks is performed:

  • Effective address is calculated (base register + offset) OR
  • Other ALU operation is performed based on the instruction type
    • For Register-Register ALU instruction
      The ALU performs the operation specified by the ALU opcode on the values read from the register file.
    • For Register-Immediate ALU instruction
      The ALU performs the operation specified by the ALU opcode on the first value read from the register file and the sign-extended immediate.
    • For Conditional branch instruction
      Determine whether the condition is true.

These tasks can share a single stage because RISC-V is a load-store architecture: no instruction needs to simultaneously calculate a data address and perform an operation on the data.
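The "one task per instruction" idea can be sketched as a single function that uses the same adder for either an address or an arithmetic result. The `kind` labels, the function names and the small set of ALU operations are assumptions made for this example, not an exhaustive model of RV32I.

```python
MASK32 = 0xFFFFFFFF


def alu(op, a, b):
    """A few ALU operations on 32-bit values (illustrative subset)."""
    results = {
        "add": a + b,
        "sub": a - b,
        "and": a & b,
        "or": a | b,
        "xor": a ^ b,
    }
    return results[op] & MASK32


def ex_stage(kind, rs1_val, rs2_val=0, imm=0, op="add"):
    """Perform exactly one EX task, chosen by the instruction kind."""
    if kind in ("load", "store"):
        return alu("add", rs1_val, imm)   # effective address = base + offset
    if kind == "alu_rr":
        return alu(op, rs1_val, rs2_val)  # register-register operation
    if kind == "alu_ri":
        return alu(op, rs1_val, imm)      # register-immediate operation
    if kind == "branch":
        equal = rs1_val == rs2_val        # only beq and bne are modelled here
        return equal if op == "beq" else not equal
    raise ValueError(f"unknown instruction kind: {kind}")


print(hex(ex_stage("load", 0x1000, imm=-4)))
print(hex(ex_stage("alu_rr", 5, 7, op="sub")))
print(ex_stage("branch", 3, 3, op="beq"))
```

Because a load never also needs an arithmetic result, one ALU pass per instruction is enough.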

Memory access (MEM): required only for instructions that access memory, that is, stores (S-type) and loads. Dummy stage for the remaining R, I, U and J instructions. Skipped for B instructions.

  • For a load: the memory does a read using the effective address computed in the previous cycle
  • For a store: the memory writes the data from the second register
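The two cases above can be modelled with a byte-addressable memory. This is a minimal sketch that handles only 32-bit words and assumes little-endian byte order; the `Memory` class and `mem_stage` function are hypothetical names.

```python
class Memory:
    """Byte-addressable memory holding 32-bit words in little-endian order."""

    def __init__(self, size):
        self.data = bytearray(size)

    def load_word(self, address):
        return int.from_bytes(self.data[address:address + 4], "little")

    def store_word(self, address, value):
        self.data[address:address + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")


def mem_stage(kind, memory, address, store_value=0):
    """Loads return the data read; stores write and return None."""
    if kind == "load":
        return memory.load_word(address)
    if kind == "store":
        memory.store_word(address, store_value)
        return None
    return None  # other instruction kinds pass through without touching memory


memory = Memory(64)
mem_stage("store", memory, 8, store_value=0xDEADBEEF)
print(hex(mem_stage("load", memory, 8)))
```

The address passed in is the effective address produced by the EX stage in the previous cycle.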

Skipped for B instructions because:

  • No memory access and no write back is required
  • PC updates cannot be delayed

J instructions also have to update the PC, but they must write the return address to the destination register (usually ra) in the WB stage. Because of that, they still go through the MEM stage.

Write back (WB): the result is written to the destination register. Skipped for S and B instructions.

  • For an ALU instruction
    The ALU’s output is written.
  • For a load instruction
    The data fetched from memory in the MEM stage is written.
  • For jump instruction
    Return address is stored in the ra register.
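The three write-back cases can be sketched as one selection step. The `kind` labels and the function name `write_back` are assumptions for this example. The return address is taken as PC + 4, the address of the instruction after the jump, and writes to x0 are discarded because it is hardwired to 0.

```python
def write_back(regs, kind, rd, alu_out=0, mem_data=0, pc=0):
    """Select the value for the instruction kind and write it to register rd."""
    if kind in ("store", "branch"):
        return  # S and B instructions skip this stage
    if kind == "alu":
        value = alu_out
    elif kind == "load":
        value = mem_data
    elif kind == "jump":
        value = (pc + 4) & 0xFFFFFFFF  # return address
    else:
        raise ValueError(f"unknown instruction kind: {kind}")
    if rd != 0:  # x0 stays 0 whatever is written to it
        regs[rd] = value


regs = [0] * 32
write_back(regs, "alu", rd=3, alu_out=42)
write_back(regs, "load", rd=5, mem_data=0xDEADBEEF)
write_back(regs, "jump", rd=1, pc=0x100)
write_back(regs, "alu", rd=0, alu_out=99)
print(regs[3], hex(regs[5]), hex(regs[1]), regs[0])
```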