Instructions
Table of contents
- Introduction
- Registers
- RISC-V Instruction format
- Computational instructions
- Control flow instructions
- Data transfer instructions
- Memory layout
- Procedures
- References
Introduction
Instructions are the primitive operations that a computer can perform.
An ISA (Instruction Set Architecture) is an abstract model of a computer that is used to create physical implementations (e.g. CPUs). x86 and ARMv8 are popular ISAs.
ISAs usually include arithmetic/logic, data transfer, and branching instructions. ISAs also define storage interfaces (i.e. registers and memory). Despite some differences in syntax and operations, ISAs tend to be similar overall.
CISC architectures attempt to minimize the number of instructions required for a program by offering a larger number of primitive instructions. x86 is an example of a CISC ISA.
RISC architectures reduce the number of instructions with the intention of improving the efficiency of the processor. ARMv8 and RISC-V are examples of RISC ISAs.
This section uses the RV64 variant of the RISC-V ISA.
An assembler converts a symbolic version of an instruction into the binary version, e.g. add A,B into the corresponding bit pattern. The symbolic language is known as assembly language. The binary version is known as machine language.
Registers
A register is a hardware component that can store a value.
A word is the name given to a unit of access in a computer (normally 32 bits) and a doubleword is a larger unit of access in a computer (normally 64 bits) [1, P. 173].
The size of registers depends on the ISA, but they usually store either a word or a doubleword.
In RV64 there are 32 64-bit registers. Registers are 0-indexed and are referred to as xN where N is the register number, e.g. x1 or x31.
RISC-V registers are assigned purposes by convention:
Name | Usage | Preserved on call |
---|---|---|
x0 | The constant value 0 | n/a |
x1 (ra) | The return address (link register) | yes |
x2 (sp) | Stack pointer | yes |
x3 (gp) | Global pointer | yes |
x4 (tp) | Thread pointer | yes |
x5-x7 | Temporaries | no |
x8-x9 | Saved | yes |
x10-x17 | Arguments/results | no |
x18-x27 | Saved | yes |
x28-x31 | Temporaries | no |
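As a rough illustration of the x0 convention in the table above, the following Python sketch models a 32-entry register file in which x0 always reads as 0 and writes to it are discarded. The class name RegisterFile and its methods are hypothetical and exist only for this example.

```python
MASK64 = (1 << 64) - 1  # RV64 registers hold 64-bit values


class RegisterFile:
    """Hypothetical model of the 32 RV64 integer registers."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, n):
        # x0 always reads as the constant 0
        return 0 if n == 0 else self._regs[n]

    def write(self, n, value):
        # Writes to x0 are dropped; other registers wrap to 64 bits
        if n != 0:
            self._regs[n] = value & MASK64


regs = RegisterFile()
regs.write(0, 123)      # ignored: x0 stays 0
regs.write(5, 1 << 64)  # does not fit in 64 bits, wraps to 0
regs.write(6, 42)
```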
There is an additional register which holds the address of the currently-executing instruction. This is called PC (Program Counter) [2, P. 9].
Registers have faster access times and higher throughput than memory, and so registers are preferred to memory when possible [1, P. 180].
Programs are normally register-constrained, i.e. programs often have more variables than there are registers. Spilling registers is the process of storing less frequently used variables in memory [1, P. 180].
A register-memory architecture allows operations to be performed on (or from) memory. x86 is a register-memory architecture.
A register-register architecture, also known as a load-store architecture, is an architecture where operations are divided into memory operations and operations that only occur between registers. RISC-V follows a load-store architecture—only load and store instructions access memory [1, Pp. 347-8].
RISC-V Instruction format
A RISC-V assembly instruction consists of an operation and a number of operands (the number of operands is instruction-specific):
operation operand_1, operand_2
Register operands hold a value that represents a register. They can either be source registers, where data is read from, or destination registers, where data is written to [1, P. 181]. For example:
add x1, x2, x3
Immediate operands (also known as constant operands) represent values that are encoded directly into an instruction [1, P. 181]. For example:
addi x1, x2, 0xff
An instruction format defines the layout of bits for a machine code instruction (which is usually generated from assembly code by an assembler).
In the base ISA, RISC-V instructions are all 32 bits long. The instructions are split into several formats which have different binary fields. All instructions have a 7-bit opcode field, which is used to determine the format of the instruction (and therefore how to interpret the rest of the instruction) [1, P. 198].
Note: not all ISAs use fixed-length instructions. x86, for example, has variable-length instructions.
An addressing mode defines how the machine language identifies the operands of each instruction. RISC-V has four addressing modes:
- Immediate addressing, where the operand is a constant within the instruction.
- Register addressing, where the operand represents a register.
- Base addressing, where the operand is at the memory location whose address is the sum of a register and a constant in the instruction (e.g. load and store).
- PC-relative addressing, where the branch address is the sum of the PC and a constant in the instruction.
There are 6 32-bit instruction formats in RISC-V. These notes cover 3 for demonstration purposes. You can see a full list on the RISC-V green card.
R-type instruction
R-type instructions are used for register-register operations.
The opcode field represents part of the instruction opcode.
rd is the destination register operand.
funct3 is an additional opcode field.
rs1 is the first source register operand.
rs2 is the second source register operand.
funct7 is another additional opcode field.
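To make the field layout concrete, here is a minimal Python sketch that packs the R-type fields into a 32-bit word. It assumes the standard base-ISA bit positions (opcode in bits 6-0, rd in 11-7, funct3 in 14-12, rs1 in 19-15, rs2 in 24-20, funct7 in 31-25). The function name encode_r_type is hypothetical.

```python
def encode_r_type(opcode, rd, funct3, rs1, rs2, funct7):
    """Pack R-type fields into a 32-bit instruction word."""
    return (
        (funct7 & 0x7F) << 25   # bits 31-25
        | (rs2 & 0x1F) << 20    # bits 24-20
        | (rs1 & 0x1F) << 15    # bits 19-15
        | (funct3 & 0x7) << 12  # bits 14-12
        | (rd & 0x1F) << 7      # bits 11-7
        | (opcode & 0x7F)       # bits 6-0
    )


OP = 0b0110011  # opcode shared by register-register arithmetic

# add x1, x2, x3
add_word = encode_r_type(OP, rd=1, funct3=0, rs1=2, rs2=3, funct7=0)
# sub x1, x2, x3 differs from add only in funct7
sub_word = encode_r_type(OP, rd=1, funct3=0, rs1=2, rs2=3, funct7=0b0100000)
```

Here add_word is 0x003100B3 and sub_word is 0x403100B3, which shows how funct7 distinguishes two operations that share an opcode and funct3.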
I-type instruction
I-type instructions are used for register-immediate operations.
The immediate field holds an immediate value that’s interpreted as either two’s complement or an unsigned integer depending on the opcode.
rs1 is a source register.
funct3 is an additional opcode field.
rd is a destination register.
opcode is the opcode.
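The sketch below packs the I-type fields in the same way, assuming the standard layout with the 12-bit immediate in bits 31-20, and shows how a decoder could recover a signed immediate. The helper names encode_i_type and decode_i_imm are hypothetical.

```python
def encode_i_type(opcode, rd, funct3, rs1, imm):
    """Pack I-type fields into a 32-bit instruction word."""
    if not -2048 <= imm <= 2047:
        raise ValueError("immediate does not fit in 12 signed bits")
    return (
        (imm & 0xFFF) << 20     # bits 31-20, two's complement
        | (rs1 & 0x1F) << 15    # bits 19-15
        | (funct3 & 0x7) << 12  # bits 14-12
        | (rd & 0x1F) << 7      # bits 11-7
        | (opcode & 0x7F)       # bits 6-0
    )


def decode_i_imm(word):
    """Extract the 12-bit immediate and sign-extend it."""
    imm = (word >> 20) & 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm


OP_IMM = 0b0010011  # opcode used by addi

addi_pos = encode_i_type(OP_IMM, rd=1, funct3=0, rs1=2, imm=0xFF)  # addi x1, x2, 0xff
addi_neg = encode_i_type(OP_IMM, rd=1, funct3=0, rs1=2, imm=-1)    # addi x1, x2, -1
```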
B-type instruction
B-type instructions are used for branch instructions [2, P. 17].
rs1 is the first source register operand.
rs2 is the second source register operand.
funct3 is an additional opcode field.
The immediate is then split between bits 31-25 and bits 11-7.
The B-type is used to encode branch offsets that are in multiples of 2 and so the 0 bit is implicit [2, P. 12].
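The following Python sketch shows one way to scatter a branch offset across a B-type word and gather it back. It assumes the standard base-ISA layout (imm[12] in bit 31, imm[10:5] in bits 30-25, imm[4:1] in bits 11-8, imm[11] in bit 7). The function names are hypothetical.

```python
def encode_b_type(opcode, funct3, rs1, rs2, offset):
    """Pack a B-type instruction; offset is a signed byte offset."""
    if offset % 2 != 0 or not -4096 <= offset <= 4094:
        raise ValueError("offset must be even and fit in 13 signed bits")
    imm = offset & 0x1FFF  # 13-bit two's complement, bit 0 is always 0
    return (
        ((imm >> 12) & 0x1) << 31   # imm[12]
        | ((imm >> 5) & 0x3F) << 25 # imm[10:5]
        | (rs2 & 0x1F) << 20
        | (rs1 & 0x1F) << 15
        | (funct3 & 0x7) << 12
        | ((imm >> 1) & 0xF) << 8   # imm[4:1]
        | ((imm >> 11) & 0x1) << 7  # imm[11]
        | (opcode & 0x7F)
    )


def decode_b_offset(word):
    """Reassemble the signed byte offset from a B-type word."""
    imm = (
        ((word >> 31) & 0x1) << 12
        | ((word >> 7) & 0x1) << 11
        | ((word >> 25) & 0x3F) << 5
        | ((word >> 8) & 0xF) << 1   # implicit bit 0 stays 0
    )
    return imm - 0x2000 if imm & 0x1000 else imm


BRANCH = 0b1100011  # opcode shared by conditional branches

# beq x1, x2, +16 bytes
beq_word = encode_b_type(BRANCH, funct3=0, rs1=1, rs2=2, offset=16)
```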
Computational instructions
Computational instructions use the ALU to perform computations. They include arithmetic operations, comparison operations, logical operations, and shift operations.
There are several RISC-V register-register instructions:
- add, sub
- slt (set if less than), sltu (set if less than unsigned)
- and, or, xor (bitwise logical operations)
- sll (shift left logical), srl (shift right logical), sra (shift right arithmetic)
As well as equivalent register-immediate instructions:
- addi
- slti, sltiu
- andi, ori, xori
- slli, srli, srai
Note: there is no subi since you can achieve the same effect by using addi with a negative constant.
Arithmetic and logical operators are self-explanatory.
Shift operations move bits in doublewords to the left and to the right. Logical shifts fill in the blanks with 0s, arithmetic right shifts fill in the blanks with the sign bit.
Shifting left by $i$ bits is equivalent to multiplying the value by $2^i$.
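The difference between the three shifts can be modelled in a few lines of Python. This is a sketch that treats registers as unsigned 64-bit integers and uses only the low 6 bits of the shift amount, as RV64 does. The function names mirror the instruction names but are otherwise just illustrative.

```python
MASK64 = (1 << 64) - 1


def sll(value, shamt):
    """Shift left logical: vacated low bits are filled with 0s."""
    return (value << (shamt & 0x3F)) & MASK64


def srl(value, shamt):
    """Shift right logical: vacated high bits are filled with 0s."""
    return (value & MASK64) >> (shamt & 0x3F)


def sra(value, shamt):
    """Shift right arithmetic: vacated high bits copy the sign bit."""
    value &= MASK64
    if value >> 63:        # sign bit set
        value -= 1 << 64   # reinterpret as a negative Python int
    return (value >> (shamt & 0x3F)) & MASK64


minus_16 = -16 & MASK64    # 64-bit two's complement encoding of -16
logical = srl(minus_16, 2)     # large positive number, sign is lost
arithmetic = sra(minus_16, 2)  # encoding of -4, sign is kept
```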
Control flow instructions
By default, instructions execute in sequence one after the other. Control flow instructions can alter the order of instruction execution.
Conditional branch instructions test two registers and branch (switch execution to a different instruction sequence) if the test passes, otherwise the processor will continue to execute the next instruction [1, P. 211]. Examples include:
- beq (==): branch if equal
- bne (!=): branch if not equal
- blt (<): branch if less than
- bge (>=): branch if greater than or equal to
- bltu (<): branch if less than unsigned
- bgeu (>=): branch if greater than or equal to unsigned
Conditional branches are all B-type instructions with the assembly format op rs1, rs2, l1. rs1 and rs2 are the registers whose values are compared. l1 is a label, which is a symbol for an address.
RISC-V conditional branch instructions use PC-relative addressing. The target address for an instruction is calculated by adding a branch offset (which is stored as an immediate in the instruction) to the PC ($\text{target} = \text{PC} + \text{offset}$) [1, P. 242].
For assembly instructions that branch to a label, the label address is first resolved during assembly. If the address is close enough to the current address, then an offset is packed into the machine code instruction as a signed immediate to be added to the PC. If the label is too far away, then the assembly instruction is expanded into two instructions: a conditional branch with the inverted condition that skips over an unconditional jump, and the unconditional jump to the label address (which has more reach) [1, P. 257].
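The "close enough" check can be sketched as follows. The limits assume the B-type format's 13-bit signed offset with an implicit zero low bit, which gives a reach of -4096 to +4094 bytes. The function names are hypothetical.

```python
B_MIN, B_MAX = -4096, 4094  # reach of a 13-bit signed, even offset


def branch_offset(pc, label_address):
    """Offset the assembler would store for a branch at pc."""
    return label_address - pc


def fits_in_branch(pc, label_address):
    """True if the label can be reached by a single conditional branch."""
    offset = branch_offset(pc, label_address)
    return offset % 2 == 0 and B_MIN <= offset <= B_MAX


near = fits_in_branch(0x1000, 0x1010)  # 16 bytes ahead
far = fits_in_branch(0x1000, 0x3000)   # 8192 bytes ahead, out of reach
```

When the check fails, the assembler falls back to the two-instruction expansion described above.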
Unconditional branch instructions are called jumps. In RISC-V the jal (jump-and-link) and jalr (jump-and-link register) instructions are used to perform unconditional jumps [1, P. 216].
jal rd, l1 jumps to a target specified by l1 (again this is converted to an offset from the instruction’s address by the assembler or linker). The address of the next instruction ($\text{PC} + 4$) is stored in register rd. jal uses the UJ-type format (where the J-immediate encodes a signed offset in multiples of 2 bytes) and has a ±1 MiB range [2, P. 15].
A plain unconditional jump is written jal x0, l1.
jalr rd, imm(rs) jumps to a target specified by the value of register rs + imm. Because rs holds a full 64-bit address, jalr has greater range than jal. Like jal, the address of the next instruction ($\text{PC} + 4$) is stored in register rd.
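A small Python model may help to see what the two jump instructions compute. It is a sketch, not a simulator: each function returns the new PC and the link value that would be written to rd (the address of the following instruction, PC + 4). For jalr the model also clears the lowest bit of the target, as the base ISA specifies.

```python
def jal(pc, offset):
    """Model of jal rd, offset: returns (new_pc, value written to rd)."""
    return pc + offset, pc + 4


def jalr(pc, rs_value, imm):
    """Model of jalr rd, imm(rs): returns (new_pc, value written to rd)."""
    target = (rs_value + imm) & ~1  # lowest bit of the target is cleared
    return target, pc + 4


# A call at address 0x1000 to a procedure 0x1000 bytes ahead:
callee_pc, ra = jal(0x1000, 0x1000)

# The procedure returns with jalr x0, 0(ra); the link value is discarded
return_pc, _ = jalr(callee_pc + 8, ra, 0)
```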
A branch table is an array of words of addresses to branch to. They can be used to efficiently implement switch statements with multiple cases [1, P. 216].
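In a high-level language the same idea looks like indexing into a table of handlers instead of testing each case in turn. The Python sketch below uses function objects where the hardware version would hold code addresses. All names are hypothetical.

```python
def case_zero():
    return "zero"


def case_one():
    return "one"


def case_two():
    return "two"


# One entry per case, indexed by the switch value
BRANCH_TABLE = [case_zero, case_one, case_two]


def dispatch(k):
    """Pick the handler with one bounds check and one table lookup."""
    if 0 <= k < len(BRANCH_TABLE):
        return BRANCH_TABLE[k]()
    return "default"
```

The cost of dispatch stays the same regardless of how many cases the table holds, which is the advantage over a chain of conditional branches.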
Data transfer instructions
Data transfer instructions move data between memory and registers [1, P. 173].
A memory address is used to locate a data element in memory. For now, you can conceive of memory as a large single-dimension array where the address is the index and the value is the data that’s stored [1, P. 175].
RISC-V uses byte-addressing, meaning that hardware supports access to individual bytes [1, P. 177]. As an example, if two doublewords $a$ and $b$ are stored contiguously in memory with $a$ preceding $b$, and $a$ is stored at address $n$, then $b$ is stored at address $n + 8$.
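Because addresses count bytes, the address of an element in an array of doublewords is the base address plus 8 times the index. A minimal sketch, with a hypothetical helper name:

```python
DOUBLEWORD_BYTES = 8  # a 64-bit doubleword occupies 8 byte addresses


def element_address(base, index):
    """Byte address of element `index` in an array of doublewords."""
    return base + index * DOUBLEWORD_BYTES


first = element_address(0x1000, 0)
second = element_address(0x1000, 1)
fourth = element_address(0x1000, 3)
```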
Load instructions copy data from memory to a register. In RISC-V, ld is the load doubleword instruction. The format of a load instruction is op rd, imm(rs1) where rd is the destination register that the data will be stored in. The address is calculated by using the value in rs1 and adding the constant imm [1, P. 176].
Store instructions copy data from a register to memory. In RISC-V, sd is the store doubleword instruction. The format of a store instruction is op rs2, imm(rs1) where rs2 is the register to be stored, rs1 is the base register, and imm is an offset used to select the element [1, P. 179].
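The base-plus-offset address calculation used by loads and stores can be modelled with a byte array. This sketch assumes little-endian byte order and takes register values (not register numbers) as arguments. The memory size and function signatures are choices made for the example.

```python
MASK64 = (1 << 64) - 1
memory = bytearray(64)  # toy memory, one entry per byte address


def sd(value, base, imm):
    """Store doubleword: write 8 bytes at address base + imm."""
    addr = base + imm
    memory[addr:addr + 8] = (value & MASK64).to_bytes(8, "little")


def ld(base, imm):
    """Load doubleword: read 8 bytes from address base + imm."""
    addr = base + imm
    return int.from_bytes(memory[addr:addr + 8], "little")


# Store to element 1 of a doubleword array based at address 16, then reload
sd(0x1122334455667788, base=16, imm=8)
loaded = ld(base=16, imm=8)
```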
Note: an alignment restriction is a requirement that data must be aligned in memory based on some unit, e.g. 32-bit words must start at addresses that are multiples of 4 and 64-bit doublewords must start at addresses that are multiples of 8. RISC-V doesn’t have alignment restrictions, but MIPS does [1, P. 179].
A signed load performs sign extension (filling bits to the left with the left-most bit of the value being loaded) to preserve the value of numbers that are narrower than 64 bits. lb does a signed load and sign extends the byte, whereas lbu (load byte unsigned) treats the byte as unsigned and zero extends it [1, Pp. 189-90].
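The two kinds of extension can be written out directly. In this sketch, results are the 64-bit register contents as unsigned Python integers. The function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value, bits):
    """Widen a `bits`-wide value to 64 bits by copying its sign bit."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return ((value ^ sign_bit) - sign_bit) & MASK64


def zero_extend(value, bits):
    """Widen a `bits`-wide value to 64 bits by filling with 0s."""
    return value & ((1 << bits) - 1)


byte = 0x80                      # -128 as a signed byte, 128 as unsigned
as_lb = sign_extend(byte, 8)     # what lb would leave in the register
as_lbu = zero_extend(byte, 8)    # what lbu would leave in the register
```

Here as_lb is 0xFFFFFFFFFFFFFF80 (still -128 in 64-bit two's complement) while as_lbu is 0x80 (128).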
Memory layout
The specifics of a program’s memory layout depend on the system running the program as well as the architecture.
On Linux, a program’s memory space contains:
- Text segment
- Static data segment
- Stack
- Heap
- A reserved section
The text segment is “the segment of a UNIX object file that contains the machine language code for routines in the source file” [1, P. 230].
The static data segment contains constants and other static variables, which exist for the whole lifetime of the program [1, P. 229].
Additional memory is available in the form of a stack and heap.
Stack
In computer architecture, a stack is a variable-size region of memory. It’s used for storing local variables and for preserving the values of registers during procedure calls.
A stack is a LIFO (Last-In-First-Out) data structure that supports the following abstract operations:
- push(x) adds x to the stack.
- pop() removes the top item from the stack.
- top() returns the top item from the stack.
In RISC-V, stacks grow from higher addresses to lower addresses, so pushing to the stack decreases the stack pointer value and popping from the stack increases the stack pointer value [1, P. 222].
The RISC-V stack pointer is register x2, also known as sp.
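The downward growth can be seen in a toy model where sp is an index into a byte array. The class below is a hypothetical sketch that pushes and pops 8-byte doublewords in little-endian order.

```python
MASK64 = (1 << 64) - 1


class Stack:
    """Toy downward-growing stack of doublewords."""

    def __init__(self, size=64):
        self.memory = bytearray(size)
        self.sp = size  # empty stack: sp starts at the highest address

    def push(self, value):
        self.sp -= 8    # pushing moves sp to a lower address
        self.memory[self.sp:self.sp + 8] = (value & MASK64).to_bytes(8, "little")

    def pop(self):
        value = int.from_bytes(self.memory[self.sp:self.sp + 8], "little")
        self.sp += 8    # popping moves sp back to a higher address
        return value


stack = Stack(size=64)
stack.push(10)
stack.push(20)
sp_after_pushes = stack.sp  # two doublewords below the starting address
```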
The segment of a stack that contains a procedure’s saved registers is called a procedure frame (or activation record) [1, P. 227].
Some RISC-V compilers use a frame pointer register (fp) to point to the first doubleword of the current procedure’s frame. This is generally to make address calculations simpler since the frame pointer value doesn’t change during a procedure’s execution [1, P. 228].
Heap
A heap is a variable-sized region of memory intended for dynamic data structures [1, P. 229].
As opposed to a stack, a heap grows from lower addresses to higher addresses [1, P. 230].
In C, malloc()
allocates space on the heap and returns a pointer to it. free()
releases the space on the heap to which the pointer points [1, P. 230].
Procedures
A procedure is a stored (sometimes parameterized) subroutine that performs a task.
A caller is the program that calls a procedure. A callee is the procedure that is called by the caller [1, P. 220].
To execute a procedure, the program must follow these steps:
- Put parameters somewhere accessible to the procedure.
- Jump to the procedure.
- Acquire local storage resources needed for the procedure (allocated from the stack).
- Perform the desired logic.
- Put result in a place that the caller can access.
- Restore any registers used.
- Return control to the caller.
A calling convention is an agreement for how subroutines receive arguments, how they return a result, and what registers need to be saved between procedure calls.
RISC-V designates x10-x17 as parameter registers used to pass parameters and return values. ra (x1) contains the return address (the next address after the instruction that called the procedure) [1, P. 219].
Registers x5-x7 and x28-x31 are temporary registers that are not preserved by the callee during a procedure call [1, P. 224].
Registers x8-x9 and x18-x27 are saved registers. If a callee uses the registers then they must save and restore them as part of the procedure call [1, P. 224].
It can be useful to break a procedure into three parts:
- A prologue that performs setup.
- A body where the main logic runs.
- An epilogue that performs teardown.
To satisfy the guarantees of the RISC-V calling convention, the prologue will:
- Decrement the stack pointer to allocate required memory on the stack.
- Store any saved registers that are accessed in the procedure.
- Store ra if a function call is made.
The epilogue will:
- Restore any saved registers that were used.
- Reload ra if required.
- Increment sp to its previous value.
- Jump back to the return address.
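The prologue and epilogue steps above can be traced with a toy model in which registers are dictionary entries and the stack is a byte array. Everything here is hypothetical scaffolding for illustration: the procedure body, the 16-byte frame size, and the helper names are assumptions, and a real procedure would be written in assembly.

```python
def store(stack, addr, value):
    stack[addr:addr + 8] = value.to_bytes(8, "little")


def load(stack, addr):
    return int.from_bytes(stack[addr:addr + 8], "little")


def procedure(regs, stack):
    # Prologue: allocate a 16-byte frame, then save ra and x8
    regs["sp"] -= 16
    store(stack, regs["sp"] + 8, regs["ra"])
    store(stack, regs["sp"] + 0, regs["x8"])

    # Body: free to overwrite x8 and ra (e.g. by a nested call)
    regs["x8"] = regs["x10"] * 2
    regs["ra"] = 0xDEAD
    regs["x10"] = regs["x8"] + 1  # result goes back in x10

    # Epilogue: restore saved registers, release the frame, return
    regs["x8"] = load(stack, regs["sp"] + 0)
    regs["ra"] = load(stack, regs["sp"] + 8)
    regs["sp"] += 16
    return regs["ra"]  # address that execution jumps back to


stack = bytearray(64)
regs = {"sp": 64, "ra": 0x1004, "x8": 7, "x10": 5}
return_address = procedure(regs, stack)
```

After the call, the caller sees x8, ra, and sp unchanged and finds the result in x10.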
A tail call is a subroutine call made as the final act of a procedure. Tail call optimization avoids allocating a new stack frame for such a call: instead of performing the jump-and-add-stack-frame and then pop-stack-frame-and-return-to-caller sequences, the callee reuses the existing stack frame, since it’s no longer needed by the caller.
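The effect of the optimization can be shown by hand. CPython does not perform tail call optimization, so the second function below is a manual rewrite of the first that reuses one frame. Both functions are illustrative examples, not taken from the references.

```python
def sum_to_recursive(n, acc=0):
    """Sum 1..n; the recursive call is the final act, i.e. a tail call."""
    if n == 0:
        return acc
    return sum_to_recursive(n - 1, acc + n)  # a new frame per call


def sum_to_loop(n, acc=0):
    """Same computation after tail call elimination, done by hand."""
    while n != 0:
        # Overwrite the arguments in place and jump back to the top,
        # which is what reusing the existing frame amounts to
        n, acc = n - 1, acc + n
    return acc


small = sum_to_recursive(100)
large = sum_to_loop(100_000)  # depth no longer limited by the stack
```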
References
- [1] D. A. Patterson and J. L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, RISC-V Edition. 2018.
- [2] A. Waterman, K. Asanović, et al., “The RISC-V Instruction Set Manual Volume II: Privileged Architecture Version 2.2,” EECS Department, University of California, Berkeley, 2017.
- [3] D. A. Patterson and J. L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, ARM Edition. 2017.
- [4] N. Riasanovsky, “Understanding RISC-V Calling Convention.”