Generate RISC-V instruction decoder from ISA descriptor #103

Open
jserv opened this issue Jan 2, 2023 · 6 comments
Labels
help wanted: Extra attention is needed

Comments

@jserv
Contributor

jserv commented Jan 2, 2023

The current RISC-V instruction decoding implementation comes with some relevant documentation, but maintaining and verifying it is not straightforward. Instead, we could describe how RISC-V instructions are encoded in a human-readable form and let a code generator convert that description into C code.
See make_decoder.py from arviss and HiSimu for reference.

Expected output:

  1. Create src/instructions.in which contains the following:
# format of a line in this file:
# <instruction name> <args> <opcode>
#
# <opcode> is given by specifying one or more range/value pairs:
# hi..lo=value or bit=value or arg=value (e.g. 6..2=0x45 10=1 rd=0)
#
# <args> is one of rd, rs1, rs2, rs3, imm20, imm12, imm12lo, imm12hi,
# shamtw, shamt, rm
# rv32i
beq     bimm12hi rs1 rs2 bimm12lo 14..12=0 6..2=0x18 1..0=3
bne     bimm12hi rs1 rs2 bimm12lo 14..12=1 6..2=0x18 1..0=3
blt     bimm12hi rs1 rs2 bimm12lo 14..12=4 6..2=0x18 1..0=3
bge     bimm12hi rs1 rs2 bimm12lo 14..12=5 6..2=0x18 1..0=3
bltu    bimm12hi rs1 rs2 bimm12lo 14..12=6 6..2=0x18 1..0=3
bgeu    bimm12hi rs1 rs2 bimm12lo 14..12=7 6..2=0x18 1..0=3
  2. Prepare scripts/gen-decoder.py (other scripting languages are acceptable), which converts the above into the corresponding C implementation.
  3. Modify the build system and src/decode.c to be aware of the above changes.
  4. Create an entry in the docs directory that describes the high-level idea and how to describe additional extensions.
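
As a starting point for step 2, here is a minimal sketch of how such a script might turn one line of src/instructions.in into a mask/match pair. The function names `parse_line` and `emit_defines` and the `MATCH_`/`MASK_` macro naming are assumptions for illustration, not part of the existing code base. The `arg=value` form (e.g. `rd=0`) is deliberately left unhandled.

```python
import re

def parse_line(line):
    """Parse one descriptor line into (name, args, mask, match).

    Returns None for blank lines and comments. Hypothetical helper.
    """
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    name, *tokens = line.split()
    args, mask, match = [], 0, 0
    for tok in tokens:
        if "=" not in tok:
            args.append(tok)          # operand name such as rs1 or bimm12hi
            continue
        field, value = tok.split("=", 1)
        value = int(value, 0)         # accepts decimal and 0x-prefixed hex
        rng = re.fullmatch(r"(\d+)\.\.(\d+)", field)
        if rng:
            hi, lo = int(rng.group(1)), int(rng.group(2))
        elif field.isdigit():
            hi = lo = int(field)      # single-bit form, e.g. 10=1
        else:
            # Assumption: arg=value constraints are out of scope for this sketch.
            raise ValueError(f"unsupported constraint: {tok}")
        width = hi - lo + 1
        if width <= 0 or value >> width:
            raise ValueError(f"value does not fit in {field}: {tok}")
        mask |= ((1 << width) - 1) << lo
        match |= value << lo
    return name, args, mask, match

def emit_defines(entries):
    """Render C macros for each parsed entry (macro names are illustrative)."""
    out = []
    for name, _args, mask, match in entries:
        out.append(f"#define MATCH_{name.upper()} {match:#x}")
        out.append(f"#define MASK_{name.upper()} {mask:#x}")
    return "\n".join(out)
```

With these definitions, an instruction word `insn` decodes as `beq` when `(insn & MASK_BEQ) == MATCH_BEQ`; for the `beq` line above the sketch yields a mask of 0x707f and a match of 0x63.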
@jserv
Contributor Author

jserv commented Jan 5, 2023

rvjit provides similar code generation. See its scripts.

@jserv
Contributor Author

jserv commented May 26, 2023

Google's mpact-riscv offers an ISA description for the RV32/RV64 architectures. See riscv/*.isa for details.

@jserv
Contributor Author

jserv commented Jun 22, 2023

riscvhpp is a user-level C++17 header-only RISC-V emulator generator using riscv-opcodes.

@jserv
Contributor Author

jserv commented Jul 21, 2023

Google's mpact-riscv offers an ISA description for the RV32/RV64 architectures. See riscv/*.isa for details.

MPACT-Sim provides a set of tools and C++ classes that make it easier to write instruction-level simulators for a wide range of architectures.

Build instructions:

$ git clone https://github.com/google/mpact-riscv
$ cd mpact-riscv
$ bazel build //...

@jserv
Contributor Author

jserv commented Aug 6, 2023

Cavatools simulates a multi-core RISC-V machine. It provides "uspike," which is a RISC-V instruction set interpreter. Python scripts extract instruction bit encoding and execution semantics from the official GitHub repository.

jserv added the "help wanted" label on Dec 25, 2023
@risc26z

risc26z commented May 25, 2025

Hello,

I know of a few other decoder generators that are worth evaluating. There is opcode-decoder-generator, PIE, and Edigen (as well as my own creation from a few years ago, decgen). There is also a Rust tool named disarm64_gen. Academic research has produced quite a few descriptions of interesting-sounding decoder generators, including Vienna, Isildur, and several others; however, I've found it difficult to locate source code for these programs.

I think the nicest syntax is that used in PIE. It defines the ARM add instruction using the line:
add aaaa00b0 100cdddd eeeeffff ffffffff, a:auto_cond, b:immediate, c:set_condition, e:rd, d:rn, f:operand2
This has the convenient property that a non-proportional font is all that's needed for debugging by eye.
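
To illustrate why this kind of bit-pattern syntax is easy to process, here is a rough sketch of a parser for such a pattern string. It is not PIE's actual implementation; `parse_pattern` and `extract` are hypothetical helpers, and the sketch assumes that `0`/`1` are fixed bits, letters are field bits, and the leftmost character is the most significant bit.

```python
def parse_pattern(pattern):
    """Return (mask, match, fields) for a pattern like 'aaaa00b0 100cdddd ...'.

    fields maps each letter to its bit positions, most significant first.
    """
    bits = pattern.replace(" ", "")
    width = len(bits)
    mask = match = 0
    fields = {}
    for i, ch in enumerate(bits):
        pos = width - 1 - i           # leftmost character is the top bit
        if ch in "01":
            mask |= 1 << pos          # fixed bit: takes part in matching
            match |= int(ch) << pos
        else:
            fields.setdefault(ch, []).append(pos)
    return mask, match, fields

def extract(word, positions):
    """Gather the listed bits of word into a compact value."""
    value = 0
    for pos in positions:
        value = (value << 1) | ((word >> pos) & 1)
    return value

ADD_PATTERN = "aaaa00b0 100cdddd eeeeffff ffffffff"
mask, match, fields = parse_pattern(ADD_PATTERN)

word = 0xE0810002                     # ARM encoding of: add r0, r1, r2
if (word & mask) == match:
    rn = extract(word, fields["d"])   # d:rn in the PIE line above
    rd = extract(word, fields["e"])   # e:rd
```

Because each character maps to exactly one bit, the mask, the match value, and every operand field fall out of a single pass over the string.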

I've noticed that several decoder generators produce relatively inefficient code, typically a massive tree of if-else statements that test one bit at a time. The result is very large code (with a heavy I-cache footprint) that causes large numbers of branch prediction misses. My earlier investigations showed that shallower decode trees employing multi-way switch statements were much more efficient. This is especially true when the ISA has a contiguous group of decode bits, or when two or more groups can be concatenated together. (I'm keen to see what can be done with newer instructions such as the x86 PEXT instruction.) Finding the 'best' decoder is a challenge: the difficulty lies in creating a mathematical model of the cost of various approaches in a way that's consistent with current hardware.
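
As a rough illustration of the multi-way approach (not taken from any of the tools mentioned), the sketch below emits a C `switch` over all decode bits that a group of instructions has in common, and recurses only when several instructions share a case. The names `emit_switch`, `OP_*`, and `OP_INVALID` are hypothetical, and the mask/match values are the ones implied by the six RV32I branch lines earlier in this thread.

```python
from functools import reduce

# (name, mask, match) for the RV32I branches: bits 14..12, 6..2 and 1..0 are fixed.
BRANCHES = [
    ("beq", 0x707f, 0x0063), ("bne", 0x707f, 0x1063),
    ("blt", 0x707f, 0x4063), ("bge", 0x707f, 0x5063),
    ("bltu", 0x707f, 0x6063), ("bgeu", 0x707f, 0x7063),
]

def emit_switch(insns, tested=0, depth=1):
    """Return C source lines that dispatch on shared fixed bits of `insn`."""
    pad = "    " * depth
    if len(insns) == 1:
        name, mask, match = insns[0]
        rest = mask & ~tested         # fixed bits not yet checked on this path
        if rest == 0:
            return [f"{pad}return OP_{name.upper()};"]
        return [f"{pad}return (insn & {rest:#x}) == {match & rest:#x}"
                f" ? OP_{name.upper()} : OP_INVALID;"]
    common = reduce(lambda a, b: a & b, (m for _, m, _ in insns)) & ~tested
    if common == 0:
        # Assumption: a real generator would need a fallback strategy here.
        raise ValueError("no shared decode bits left to switch on")
    groups = {}
    for insn in insns:
        groups.setdefault(insn[2] & common, []).append(insn)
    lines = [f"{pad}switch (insn & {common:#x}) {{"]
    for key in sorted(groups):
        lines.append(f"{pad}case {key:#x}:")
        lines += emit_switch(groups[key], tested | common, depth + 1)
    lines.append(f"{pad}default:")
    lines.append(f"{pad}    return OP_INVALID;")
    lines.append(f"{pad}}}")
    return lines

c_source = "\n".join(emit_switch(BRANCHES))
```

For the six branches this produces a single six-way `switch (insn & 0x707f)` rather than a chain of single-bit tests. Whether the compiler lowers a sparse switch like this to a jump table is not guaranteed, which is part of why the cost model is hard to pin down.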

Jacob
