RISC-V Assembler Reference

This document gives an overview of RISC-V assembly language. It begins with an introduction to assembler and linker concepts, followed by sections describing assembler directives, pseudo-instructions, relocation functions, and assembler concepts such as labels, relative and absolute addressing, immediate values, constants and, finally, control and status registers.

The accompanying RISC-V Instruction Set Reference contains a listing of the instructions in the I (Base Integer Instruction Set) and M (Multiply and Divide) extensions. For detailed information on the RISC-V instruction set, refer to the RISC-V ISA Specification.

Concepts

This section briefly covers some high level concepts that are required to understand the process of assembling and linking executable code from source files.

Assembly file

An assembly file contains assembly language directives, macros and instructions. It can be emitted by a compiler or it can be handwritten. An assembly file is the input file to the assembler. The standard extensions for assembly files are .s and .S, with the latter indicating that the assembly file should be pre-processed using the C preprocessor.

Relocatable Object file

A relocatable object file contains compiled object code and data emitted by the assembler. An object file cannot be run; rather, it is used as input to the linker. The standard extension for object files is .o. The most common cross-platform file format for RISC-V object files and executables is ELF (Executable and Linkable Format). The objdump utility can be used to disassemble an object file, objcopy can be used to copy and extract sections from ELF files, and the nm utility can list the symbols in an object file.

ELF Header

An ELF file has an ELF header that contains a magic number indicating that the file is ELF formatted, the architecture of the binary, the endianness of the binary (little-endian for RISC-V), the ELF file type (Relocatable Object File, Executable File, Shared Library), the number of program headers and their offset in the file, the number of section headers and their offset in the file, fields indicating the ELF version and ABI (Application Binary Interface) version of the file, and finally flags indicating various ABI options such as RVC compression and the floating-point ABI that the executable code in the binary conforms to.

Program Header

Program Headers provide the size and offsets of loadable segments within an executable file or shared object, along with protection attributes used by the operating system (read, write and exec). Program headers are not present in relocatable object files and are primarily for use by the operating system and dynamic linker to map code and data into memory.

Section Header

Section Headers provide size, offset, type, alignment and flags of the sections contained within the ELF file. Section headers are not required to execute a static binary but are necessary for dynamic linking as well as program linking. Various section types refer to the location of the symbol table, relocations and dynamic symbols in the ELF binary file.

Sections

An object file is made up of multiple sections, with each section corresponding to distinct types of executable code or data. There are a variety of different section types. This list shows the four most common sections:

.text is a read-only section containing executable code
.data is a read-write section containing global or static variables
.rodata is a read-only section containing const variables
.bss is a read-write section containing uninitialized data
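
As a minimal sketch of how these four sections might appear in one source file (the symbol names counter, limit and buffer and their values are hypothetical, chosen only for illustration):

```asm
        .section .text          # executable code
        .globl  _start
_start:
        la      a0, counter     # address of an initialized variable
        lw      a1, 0(a0)       # read its current value
1:      j       1b              # spin forever

        .section .data          # initialized read-write data
counter:
        .word   1

        .section .rodata        # read-only constants
limit:
        .word   100

        .section .bss           # zero-initialized storage
buffer:
        .zero   64
```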

Program linking

Program linking is the process of reading multiple relocatable object files, merging the sections from each of the source files, calculating the new addresses for symbols and applying relocation fixups to text or data that is pointed to in relocation entries.

Linker Script

A linker script is a text source file that is optionally input to the linker and it contains rules for the linker to use when calculating the load address and alignment of the various sections when creating an executable output file. The standard extension for linker scripts is .ld.

Assembler Directives

The assembler implements a number of directives that control the assembly of instructions into an object file. These directives give the ability to include arbitrary data in the object file, control exporting of symbols, selection of sections, alignment of data, assembly options for compression, position dependent and position independent code.

The following are assembler directives for emitting data:

Directive	Arguments	Description
`.2byte`		16-bit comma separated words (unaligned)
`.4byte`		32-bit comma separated words (unaligned)
`.8byte`		64-bit comma separated words (unaligned)
`.half`		16-bit comma separated words (naturally aligned)
`.word`		32-bit comma separated words (naturally aligned)
`.dword`		64-bit comma separated words (naturally aligned)
`.byte`		8-bit comma separated words
`.dtpreldword`		64-bit thread local word
`.dtprelword`		32-bit thread local word
`.sleb128`	expression	signed little endian base 128, DWARF
`.uleb128`	expression	unsigned little endian base 128, DWARF
`.asciz`	"string"	emit string (alias for .string)
`.string`	"string"	emit string
`.incbin`	"filename"	emit the included file as a binary sequence of octets
`.zero`	integer	zero bytes
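
The following sketch shows several of the data directives in use. The label names are hypothetical, and the explicit .balign lines are an assumption made so that the example does not rely on any implicit alignment behavior of the assembler:

```asm
        .section .rodata
bytes:  .byte   0x01, 0x02, 0x03        # three 8-bit values
        .balign 2
halves: .half   0x1234, 0x5678          # two 16-bit values
        .balign 4
words:  .word   0xdeadbeef              # one 32-bit value
        .balign 8
dwords: .dword  0x0123456789abcdef      # one 64-bit value
str:    .asciz  "ok"                    # bytes 'o', 'k' and a terminating 0
pad:    .zero   16                      # sixteen zero bytes
```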

The following are assembler directives for control of alignment:

Directive	Arguments	Description
`.align`	integer	align to power of 2 (alias for .p2align)
`.balign`	b,[pad_val=0]	byte align
`.p2align`	p2,[pad_val=0],max	align to power of 2
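
A short sketch of the difference between the two alignment styles: .balign takes a byte count, while .p2align takes a power of two. The label names are hypothetical:

```asm
        .section .data
flag:   .byte   1               # location counter is now odd
        .balign 4               # pad to a multiple of 4 bytes
value:  .4byte  0x11223344      # value starts on a 4-byte boundary
        .p2align 3              # pad to a multiple of 2^3 = 8 bytes
wide:   .8byte  0               # wide starts on an 8-byte boundary
```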

The following are assembler directives for definition and exporting of symbols:

Directive	Arguments	Description
`.globl`	symbol_name	emit symbol_name to symbol table (scope GLOBAL)
`.local`	symbol_name	emit symbol_name to symbol table (scope LOCAL)
`.equ`	name, value	constant definition
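
As an illustration, the sketch below exports one symbol, keeps another local to the file, and defines a constant. The names helper and STACK_SIZE are hypothetical:

```asm
        .equ    STACK_SIZE, 4096        # constant, occupies no storage

        .globl  _start                  # visible to other object files
        .local  helper                  # visible only within this file

        .section .text
_start:
        li      a0, STACK_SIZE          # use the constant as an immediate
        jal     ra, helper
1:      j       1b

helper:
        ret
```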

The following directives are for selection of sections:

Directive	Arguments	Description
`.text`		emit .text section (if not present) and make current
`.data`		emit .data section (if not present) and make current
`.rodata`		emit .rodata section (if not present) and make current
`.bss`		emit .bss section (if not present) and make current
`.comm`	symbol_name,size,align	emit common object to .bss section
`.common`	symbol_name,size,align	emit common object to .bss section
`.section`	[{.text,.data,.rodata,.bss}]	emit section (if not present, default .text) and make current

The following directives include options, macros and other miscellaneous functions:

Directive	Arguments	Description
`.option`	{rvc,norvc,pic,nopic,push,pop}	RISC-V options
`.macro`	name arg1 [, argn]	begin macro definition; use \argname to substitute an argument
`.endm`		end macro definition
`.file`	"filename"	emit filename as a FILE LOCAL symbol in the symbol table
`.ident`	"string"	accepted for source compatibility
`.size`	symbol, symbol	accepted for source compatibility
`.type`	symbol, @function	accepted for source compatibility
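
The sketch below combines a macro definition with a scoped option change. It assumes an RV64 target (for sd and ld); push_reg and save_ra are hypothetical names introduced only for this example:

```asm
        # push_reg: hypothetical macro that stores \reg on the stack
        .macro  push_reg reg
        addi    sp, sp, -8
        sd      \reg, 0(sp)
        .endm

        .section .text
        .globl  save_ra
save_ra:
        push_reg ra             # expands to the addi/sd pair above
        .option push            # save the current option state
        .option norvc           # do not emit compressed instructions
        nop                     # assembled as the 4-byte addi x0, x0, 0
        .option pop             # restore the saved option state
        ld      ra, 0(sp)
        addi    sp, sp, 8
        ret
```

Wrapping the change in .option push and .option pop keeps it from leaking into the rest of the file.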

Assembler Pseudo-instructions

The assembler implements a number of convenience pseudo-instructions that are formed from instructions in the base ISA but have implicit arguments or, in some cases, reversed arguments that result in distinct semantics.

The following table lists RISC-V assembler pseudo instructions:

Pseudo-instruction	Expansion	Description
`nop`	`addi zero,zero,0`	No operation
`li rd, expression`	(several expansions)	Load immediate
`la rd, symbol`	(several expansions)	Load address
`mv rd, rs`	`addi rd, rs, 0`	Copy register
`not rd, rs`	`xori rd, rs, -1`	One’s complement
`neg rd, rs`	`sub rd, x0, rs`	Two’s complement
`negw rd, rs`	`subw rd, x0, rs`	Two’s complement Word
`sext.w rd, rs`	`addiw rd, rs, 0`	Sign extend Word
`seqz rd, rs`	`sltiu rd, rs, 1`	Set if = zero
`snez rd, rs`	`sltu rd, x0, rs`	Set if ≠ zero
`sltz rd, rs`	`slt rd, rs, x0`	Set if < zero
`sgtz rd, rs`	`slt rd, x0, rs`	Set if > zero
`fmv.s frd, frs`	`fsgnj.s frd, frs, frs`	Single-precision move
`fabs.s frd, frs`	`fsgnjx.s frd, frs, frs`	Single-precision absolute value
`fneg.s frd, frs`	`fsgnjn.s frd, frs, frs`	Single-precision negate
`fmv.d frd, frs`	`fsgnj.d frd, frs, frs`	Double-precision move
`fabs.d frd, frs`	`fsgnjx.d frd, frs, frs`	Double-precision absolute value
`fneg.d frd, frs`	`fsgnjn.d frd, frs, frs`	Double-precision negate
`beqz rs, offset`	`beq rs, x0, offset`	Branch if = zero
`bnez rs, offset`	`bne rs, x0, offset`	Branch if ≠ zero
`blez rs, offset`	`bge x0, rs, offset`	Branch if ≤ zero
`bgez rs, offset`	`bge rs, x0, offset`	Branch if ≥ zero
`bltz rs, offset`	`blt rs, x0, offset`	Branch if < zero
`bgtz rs, offset`	`blt x0, rs, offset`	Branch if > zero
`bgt rs, rt, offset`	`blt rt, rs, offset`	Branch if >
`ble rs, rt, offset`	`bge rt, rs, offset`	Branch if ≤
`bgtu rs, rt, offset`	`bltu rt, rs, offset`	Branch if >, unsigned
`bleu rs, rt, offset`	`bgeu rt, rs, offset`	Branch if ≤, unsigned
`j offset`	`jal x0, offset`	Jump
`jr rs`	`jalr x0, rs, 0`	Jump register
`ret`	`jalr x0, x1, 0`	Return from subroutine
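
As a small illustration, the routine below is written entirely with pseudo-instructions, with the base-ISA expansion of each shown in the comment. The function name abs_val is hypothetical:

```asm
        .section .text
        .globl  abs_val
abs_val:                        # a0 = |a0|
        bgez    a0, 1f          # bge  a0, x0, 1f
        neg     a0, a0          # sub  a0, x0, a0
1:      ret                     # jalr x0, x1, 0
```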

Relocation Functions

The relocation functions create synthesized operand values that are resolved at program link time and are used as immediate parameters to specific instructions. The sections on absolute and relative addressing give examples of using the relocation functions.

The following table lists assembler functions used to generate relocations:

Assembler Notation	Description	Instructions
`%hi(symbol)`	Absolute (HI20)	lui
`%lo(symbol)`	Absolute (LO12)	loads, stores, adds
`%pcrel_hi(symbol)`	PC-relative (HI20)	auipc
`%pcrel_lo(label)`	PC-relative (LO12)	loads, stores, adds
`%tprel_hi(symbol)`	TLS LE (Local Exec)	lui
`%tprel_lo(symbol)`	TLS LE (Local Exec)	loads, stores, adds
`%tprel_add(symbol)`	TLS LE (Local Exec)	add
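
The three TLS Local Exec functions are normally used together. The sketch below follows the sequence described in the RISC-V ELF psABI as best understood here; tls_counter and read_tls_counter are hypothetical names, and the .tbss section flags are an assumption about the GNU assembler's ELF section syntax:

```asm
        .section .tbss,"awT",@nobits
        .balign 4
tls_counter:                    # hypothetical thread-local variable
        .zero   4

        .section .text
        .globl  read_tls_counter
read_tls_counter:
        lui     a5, %tprel_hi(tls_counter)          # upper part of the offset
        add     a5, a5, tp, %tprel_add(tls_counter) # add the thread pointer
        lw      a0, %tprel_lo(tls_counter)(a5)      # low part as load offset
        ret
```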

Labels

Text labels are used as branch targets, unconditional jump targets and symbol offsets. Text labels are added to the symbol table of the compiled module.

loop:
        j loop

Numeric labels are used for local references. References to local labels are suffixed with ‘f’ for a forward reference or ‘b’ for a backwards reference.

1:
        j 1b
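
Because numeric labels can be redefined, the same number may appear many times in one file. A minimal sketch combining a forward and a backward reference in a countdown loop:

```asm
        li      a0, 3
1:      beqz    a0, 2f          # forward: the next "2:" below
        addi    a0, a0, -1
        j       1b              # backward: the previous "1:" above
2:      ret
```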

Absolute Addressing

Absolute addresses are used in position dependent code. An absolute address is formed using two instructions: the U-Type lui (Load Upper Immediate) instruction to load bits [31:12], and an I-Type or S-Type instruction such as addi (add immediate), lw (load word) or sw (store word) that supplies the low 12 bits as a signed offset from the upper immediate.

The following example shows how to load an absolute address:

.section .text
.globl _start
_start:
	    lui  a1,      %hi(msg)       # load msg(hi)
	    addi a1, a1,  %lo(msg)       # load msg(lo)
	    jal  ra, puts
2:	    j    2b

.section .rodata
msg:
	    .string "Hello World\n"

which generates the following assembler output and relocations as seen by objdump:

0000000000000000 <_start>:
000005b7          	lui	a1,0x0
R_RISCV_HI20	msg
00858593          	addi	a1,a1,8 # 8 <.L21>
R_RISCV_LO12_I	msg
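
The low 12 bits can also be folded directly into a load or store offset, which saves the addi. Because that offset is sign-extended, %hi rounds the upper part (it evaluates to (symbol + 0x800) >> 12) so that the pair still sums to the full address. A minimal sketch, with the hypothetical names load_value and value:

```asm
        .section .text
        .globl  load_value
load_value:
        lui     a1, %hi(value)          # upper 20 bits of the address
        lw      a0, %lo(value)(a1)      # low 12 bits as the load offset
        ret

        .section .data
value:
        .word   42
```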

Relative Addressing

Relative addresses are used in position independent code. A PC-relative address is formed using two instructions: the U-Type auipc (Add Upper Immediate to Program Counter) instruction to form bits [31:12] of the offset relative to the program counter of the auipc instruction, followed by an I-Type or S-Type instruction such as addi (add immediate), lw (load word) or sw (store word) that supplies the low 12 bits. The %pcrel_lo operand names the label of the auipc instruction rather than the target symbol.

The following example shows how to load a PC-relative address:

.section .text
.globl _start
_start:
1:	    auipc a1,     %pcrel_hi(msg) # load msg(hi)
	    addi  a1, a1, %pcrel_lo(1b)  # load msg(lo)
	    jal   ra, puts
2:	    j     2b

.section .rodata
msg:
	    .string "Hello World\n"

which generates the following assembler output and relocations as seen by objdump:

0000000000000000 <_start>:
00000597          	auipc	a1,0x0
R_RISCV_PCREL_HI20	msg
00858593          	addi	a1,a1,8 # 8 <.L21>
R_RISCV_PCREL_LO12_I	.L11

Load Immediate

The li (load immediate) instruction is an assembler pseudo instruction that is used to synthesize constants. The li pseudo instruction will emit a sequence starting with lui followed by addi and slli (shift left logical immediate) to construct constants by shifting and adding.

The following example shows the li pseudo-instruction being used to load an immediate value:

.section .text
.globl _start
_start:

.equ CONSTANT, 0xcafebabe

        li a0, CONSTANT

which generates the following assembler output as seen by objdump:

0000000000000000 <_start>:
   0:	00033537          	lui     a0,0x33
   4:	bfb50513          	addi    a0,a0,-1029
   8:	00e51513          	slli    a0,a0,0xe
   c:	abe50513          	addi    a0,a0,-1346
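
Smaller constants need fewer instructions. The expansions in the comments below are what one would expect from the instruction encodings; the exact sequence is chosen by the assembler and may differ (for example, addiw instead of addi on RV64):

```asm
        li      a0, 42          # fits in 12 bits: addi a0, zero, 42
        li      a1, 0x12345000  # low 12 bits are zero: lui a1, 0x12345
        li      a2, 0x12345678  # lui a2, 0x12345 then addi/addiw a2, a2, 0x678
```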

Load Address

The la (load address) instruction is an assembler pseudo-instruction used to load the address of a symbol or label. The instruction can emit absolute or relative addresses depending on the -fpic or -fno-pic assembler command line options or a .option pic or .option nopic assembler directive. The pseudo-instruction emits a relocation so that the address of the symbol can be fixed up during program linking.

The following example uses the la pseudo-instruction to load a symbol address:

.section .text
.globl _start
_start:

        la a0, msg

.section .rodata
msg:
	    .string "Hello World\n"

which generates the following assembler output and relocations as seen by objdump:

0000000000000000 <_start>:
00000517          	auipc	a0,0x0
R_RISCV_PCREL_HI20	msg
00850513          	addi	a0,a0,8 # 8 <_start+0x8>
R_RISCV_PCREL_LO12_I	.L11

Constants

Constants are emitted to the symbol table of the object but they do not take any space in the code or data sections. Constants can be referenced in expressions which emit relocations.

The following example shows loading a constant using the %hi and %lo assembler functions.

.equ UART_BASE, 0x40003000

        lui a0,      %hi(UART_BASE)
        addi a0, a0, %lo(UART_BASE)

This example uses the li pseudo instruction to load a constant and writes a string using polled IO to a UART:

.equ UART_BASE,  0x40003000
.equ REG_RBR,    0
.equ REG_TBR,    0
.equ REG_IIR,    2
.equ IIR_TX_RDY, 2
.equ IIR_RX_RDY, 4

.section .text
.globl _start
_start:
1:      auipc a0, %pcrel_hi(msg)    # load msg(hi)
        addi  a0, a0, %pcrel_lo(1b)  # load msg(lo)
2:      jal   ra, puts
3:      j     3b

puts:
        li    a2, UART_BASE
1:      lbu   a1, (a0)
        beqz  a1, 3f
2:      lbu   a3, REG_IIR(a2)
        andi  a3, a3, IIR_TX_RDY
        beqz  a3, 2b
        sb    a1, REG_TBR(a2)
        addi  a0, a0, 1
        j     1b
3:      ret

.section .rodata
msg:
	    .string "Hello World\n"

Control and Status Registers

Control and status registers are typically used to update privileged processor state. However, a few non-privileged instructions also access control and status registers: the CSR pseudo-instructions rdcycle, rdtime and rdinstret for access to counters, and frcsr, frrm, frflags, fscsr, fsrm, fsflags, fsrmi and fsflagsi for controlling the rounding mode and accessing the floating-point accrued exception state.

The following instructions allow reading, writing, setting and clearing bits in CSRs (control and status registers):

CSR Operation	Description
`CSRRW rd, csr, rs1`	Control and Status Register Atomic Read and Write
`CSRRS rd, csr, rs1`	Control and Status Register Atomic Read and Set Bits
`CSRRC rd, csr, rs1`	Control and Status Register Atomic Read and Clear Bits
`CSRRWI rd, csr, imm5`	Control and Status Register Atomic Read and Write Immediate
`CSRRSI rd, csr, imm5`	Control and Status Register Atomic Read and Set Bits Immediate
`CSRRCI rd, csr, imm5`	Control and Status Register Atomic Read and Clear Bits Immediate
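
In practice these instructions are often written through shorter pseudo-instructions that fix rd or rs1 to x0. The sketch below lists a few of the standard ones with their expansions; the machine-mode CSRs shown are accessible only from M mode:

```asm
        csrr    t0, mstatus     # read:       csrrs  t0, mstatus, x0
        csrw    mscratch, t0    # write:      csrrw  x0, mscratch, t0
        csrsi   mstatus, 8      # set bits:   csrrsi x0, mstatus, 8
        csrci   mstatus, 8      # clear bits: csrrci x0, mstatus, 8
        rdcycle t1              # counter:    csrrs  t1, cycle, x0
```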

The following code sample shows how to enable interrupts, enable timer interrupts, and then set and wait for a timer interrupt to occur. The example uses CSR instructions and access to a platform specific MMIO (memory mapped input output) region:

.equ RTC_BASE,      0x40000000
.equ TIMER_BASE,    0x40004000

# setup machine trap vector
1:      auipc   t0, %pcrel_hi(mtvec)        # load mtvec(hi)
        addi    t0, t0, %pcrel_lo(1b)       # load mtvec(lo)
        csrrw   zero, mtvec, t0

# set mstatus.MIE=1 (enable M mode interrupt)
        li      t0, 8
        csrrs   zero, mstatus, t0

# set mie.MTIE=1 (enable M mode timer interrupts)
        li      t0, 128
        csrrs   zero, mie, t0

# read from mtime
        li      a0, RTC_BASE
        ld      a1, 0(a0)

# write to mtimecmp
        li      a0, TIMER_BASE
        li      t0, 1000000000
        add     a1, a1, t0
        sd      a1, 0(a0)

# loop
loop:
        wfi
        j loop

# break on interrupt
mtvec:
        csrrc   t0, mcause, zero
        bgez    t0, fail       # interrupt causes are less than zero
        slli    t0, t0, 1      # shift off high bit
        srli    t0, t0, 1
        li      t1, 7          # check this is an m_timer interrupt
        bne     t0, t1, fail
        j       pass

pass:
        la      a0, pass_msg
        jal     puts
        j       shutdown

fail:
        la      a0, fail_msg
        jal     puts
        j       shutdown

.section .rodata

pass_msg:
        .string "PASS\n"

fail_msg:
        .string "FAIL\n"