Implement format-specific encoding mechanism #546
The current instruction encoding mechanism for RISC-V in Sail uses a simple mapping from “ast” to a 32-bit opcode value (ignoring compressed instructions for now). For example:

```
union clause ast = RISCV_JALR : (bits(12), regidx, regidx)

mapping clause encdec = RISCV_JALR(imm, rs1, rd)
  <-> imm @ rs1 @ 0b000 @ rd @ 0b1100111
```

This is certainly functional. However, what’s missing from this representation is:

- Any notion of the opcode format
- Any enforcement mechanism for the order of the respective bit fields with respect to the opcode format
- Any enforcement mechanism for the sizes of the respective bit fields

A *wrong* encoding representation would readily compile:

```
mapping clause encdec = RISCV_JALR(imm, rs1, rd)
  <-> 0b1100111 @ rd @ rs1 @ 0b000 @ imm /* WRONG */
```

This would (hopefully) be caught in runtime testing, of course.

This (draft) PR proposes an encoding mechanism with enhancements:

- Each opcode is clearly associated with an encoding format
- Each instruction only provides a structured list of encoding-agnostic opcode inputs
- A format-specific, instruction-agnostic encoding mechanism is provided

This forces instructions in the Sail code to clearly identify their formats, and allows them to provide the associated inputs for the encoding without concern for the specifics of the encoding. It also isolates the encoding scheme to a small set of format-specific encodings separate from the instruction definitions.

An example for format-specific encoding:

```
enum Format = { R_Format, U_Format, I_Format, J_Format, S_Format, B_Format, Unknown_Format }

val opcode2format : bits(7) -> Format
scattered function opcode2format

union instruction_input = {
  IFormat: { imm: bits(12), rs1: regidx, funct3: bits(3), rd: regidx, opcode: bits(7) },
  [...]
}

val encdec : ast <-> instruction_input
scattered mapping encdec

mapping fmt2bits : instruction_input <-> bits(32) = {
  IFormat(struct { imm = imm, rs1 = rs1, funct3 = funct3, rd = rd, opcode = opcode }) if opcode2format(opcode) == I_Format
    <-> imm @ rs1 @ funct3 @ rd @ opcode if opcode2format(opcode) == I_Format,
  [...]
}
```

An example of an instruction’s encoding information:

```
function clause opcode2format 0b1100111 = I_Format

mapping clause encdec = RISCV_JALR(imm, rs1, rd)
  <-> IFormat(struct { imm = imm, rs1 = rs1, funct3 = 0b000, rd = rd, opcode = 0b1100111 })
```
Also, please see the commit message in the "[DRAFT]" commit, which explains the rationale for this work.
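For readers less familiar with Sail, the idea in the description can be modeled in a few lines of Python. This is only an illustrative sketch, not the PR's code: the names `field`, `IFormat`, `encode_i`, and `jalr` are hypothetical, and the explicit width checks stand in for what Sail's `bits(N)` types would enforce statically. The point it shows is that the instruction supplies named fields, while field order and width live in exactly one per-format encoder.

```python
from dataclasses import dataclass


def field(value: int, width: int, name: str) -> int:
    """Return `value` unchanged if it fits in `width` bits, else raise."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{name}={value:#x} does not fit in {width} bits")
    return value


@dataclass(frozen=True)
class IFormat:
    """Encoding-agnostic inputs for an I-format instruction (hypothetical model)."""
    imm: int
    rs1: int
    funct3: int
    rd: int
    opcode: int


def encode_i(f: IFormat) -> int:
    """The single place where I-format field order and widths are defined."""
    return (field(f.imm, 12, "imm") << 20
            | field(f.rs1, 5, "rs1") << 15
            | field(f.funct3, 3, "funct3") << 12
            | field(f.rd, 5, "rd") << 7
            | field(f.opcode, 7, "opcode"))


def jalr(imm: int, rs1: int, rd: int) -> IFormat:
    """JALR only names its inputs; it never spells out a bit layout."""
    return IFormat(imm=imm, rs1=rs1, funct3=0b000, rd=rd, opcode=0b1100111)


# jalr x0, 0(x1), i.e. the usual "ret" pseudo-instruction
word = encode_i(jalr(imm=0, rs1=1, rd=0))
print(f"{word:#010x}")
```

Because `jalr` never concatenates bits itself, the reversed-order mistake from the `/* WRONG */` example has no place to occur in this model; only `encode_i` could get the layout wrong, and it would do so for every I-format instruction at once.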
Test Results
230 tests (-482), 57 ✅ (-655), 0s ⏱️ (±0s)
For more details on these failures, see this check.
Results for commit 738896b. ± Comparison against base commit d36ea53.
This pull request removes 482 tests.
♻️ This comment has been updated with latest results.
This is expected. This Draft PR has a working framework, but is far from complete/ready.
Actually, some passes were expected. I had an issue with the first attempt because I'm still using Sail 0.17.1 and the CI uses 0.18, and I didn't properly
I'm not sure I understand the purpose of the
I think I would expect something like:
Names might need to change, but this is following the structure of the ISA manual. It needs to be scattered so extensions could slot into the custom opcode spaces, even though it's very verbose...
Thanks for the comments, @Alasdair. My ignorance is showing, I know, but I don't understand how the above mappings help if I tried a very simple experiment (based on my very limited Sail knowledge), and removed the conditions from one of the
... and then at runtime, the mappings are broken:
How would your suggestions be used to discriminate formats based on opcode?
My concern in general is that opcode formats are a general rule of thumb, not a rigid scheme that every instruction must adhere to in the exact same way (even in the base I extension, shifts aren't really I-type, only if you squint, despite what the manual says). And I'm not sure how much reducing the repetition even matters, nor if it improves readability.
This is easily accommodated, I think, by adding more I would imagine, also, that someone implementing a decoder would prefer well-defined formats.
This is because of the extra "shift amount" bit for RV64, noted as a "specialization of the I-type format"? Yeah.
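To make the shift case concrete, here is a small Python sketch (an illustration with hypothetical helper names `shift_imm12` and `encode_shift_imm`, not code from the PR). It treats the immediate shifts as ordinary I-format instructions whose 12-bit `imm` field is itself split into fixed upper bits and a shift amount that is 5 bits wide on RV32 and 6 bits wide on RV64.

```python
OP_IMM = 0b0010011  # major opcode shared by SLLI and SRAI


def shift_imm12(shamt: int, xlen: int, arithmetic: bool = False) -> int:
    """Build the 12-bit I-format 'imm' field for an immediate shift.

    The shift amount occupies the low 5 bits (RV32) or 6 bits (RV64);
    bit 10 of the field (instruction bit 30) marks an arithmetic shift.
    """
    if xlen not in (32, 64):
        raise ValueError("xlen must be 32 or 64")
    shamt_bits = 5 if xlen == 32 else 6
    if not 0 <= shamt < (1 << shamt_bits):
        raise ValueError(f"shamt={shamt} does not fit in {shamt_bits} bits")
    return (0x400 if arithmetic else 0) | shamt


def encode_shift_imm(rd: int, rs1: int, funct3: int, imm12: int) -> int:
    """Plain I-format layout: imm[11:0] | rs1 | funct3 | rd | opcode."""
    return imm12 << 20 | rs1 << 15 | funct3 << 12 | rd << 7 | OP_IMM


slli = encode_shift_imm(rd=1, rs1=1, funct3=0b001, imm12=shift_imm12(1, 64))
srai = encode_shift_imm(rd=1, rs1=1, funct3=0b101,
                        imm12=shift_imm12(1, 64, arithmetic=True))
print(f"{slli:#010x} {srai:#010x}")
```

In this model the per-format encoder is unchanged; only the construction of the `imm` input differs by XLEN, which is one way to read "specialization of the I-type format".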
Reducing the repetition of what? The bit concatenations? Reducing repetition of that was not a goal. Adding structure to the encoding mechanism was a goal. Refactoring that into a few per-format encoding schemes was an obvious way to do that, with additional benefits. Improving readability was not an explicit goal, although I'd argue the new code is more pleasing to the eye than the old code:
This is a draft, with some extraneous commits included that greatly facilitate testing. The only important commit is marked "[DRAFT]", and reviews should be confined to that.
There's still a lot of content, so if I may direct the reviewers' eyes:
- `model/riscv_insts_begin.sail`: This sets up the new global data content. In particular:
  - `union instruction_input`: A tagged union for all of the inputs for each instruction format.
  - `scattered mapping fmtencdec`: A new mapping from `ast` to `instruction_input`. Intended to replace `scattered mapping encdec`.
  - `enum Format`: one enum value per format.
  - `scattered function opcode2format`: maps 7-bit opcode values to the associated format enum.
  - `mapping fmt2bits`: A new mapping from `instruction_input` to the 32-bit opcode value. This contains one bidirectional mapping for each format. This is a central location for enforcing opcode layouts, including field order and width, as well as important things like clipping low-order bits of field values where those bits have presumed/enforced values.
- `model/riscv_insts_base.sail`, `model/riscv_insts_next.sail`, `model/riscv_insts_zicsr.sail`:
  - `opcode2format`: to explicitly bind an instruction to its respective format.
  - `fmtencdec`: to map `ast` inputs and constant values to the appropriate `instruction_input` tagged union member.
  - `encdec` (renamed `oldencdec`) is obsoleted, and can be ignored, but I left it in for comparison/review purposes.
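The "clipping low-order bits" point is easiest to see with the B format, where the branch offset's bit 0 is always zero and is not stored. The following Python sketch is an illustration under assumptions, not the PR's Sail: `OPCODE2FORMAT`, `encode_b`, and `decode_b` are hypothetical names, the offset is treated as a raw 13-bit pattern, and register and `funct3` range checks are omitted for brevity.

```python
B_OPCODE = 0b1100011  # BEQ, BNE, BLT, ... share this major opcode

# Hypothetical stand-in for the scattered opcode2format function.
OPCODE2FORMAT = {0b1100111: "I_Format", B_OPCODE: "B_Format"}


def opcode2format(opcode: int) -> str:
    return OPCODE2FORMAT.get(opcode, "Unknown_Format")


def encode_b(imm13: int, rs2: int, rs1: int, funct3: int,
             opcode: int = B_OPCODE) -> int:
    """B-format layout: imm[12|10:5] rs2 rs1 funct3 imm[4:1|11] opcode."""
    if opcode2format(opcode) != "B_Format":
        raise ValueError(f"opcode {opcode:#09b} is not B-format")
    if not 0 <= imm13 < (1 << 13):
        raise ValueError("offset does not fit in 13 bits")
    if imm13 & 1:
        raise ValueError("branch offset must be even; bit 0 is not encoded")
    return (((imm13 >> 12) & 0x1) << 31
            | ((imm13 >> 5) & 0x3F) << 25
            | rs2 << 20
            | rs1 << 15
            | funct3 << 12
            | ((imm13 >> 1) & 0xF) << 8
            | ((imm13 >> 11) & 0x1) << 7
            | opcode)


def decode_b(word: int) -> dict:
    """Inverse of encode_b; the implicit zero in bit 0 is restored."""
    opcode = word & 0x7F
    if opcode2format(opcode) != "B_Format":
        raise ValueError(f"opcode {opcode:#09b} is not B-format")
    imm13 = (((word >> 31) & 0x1) << 12
             | ((word >> 7) & 0x1) << 11
             | ((word >> 25) & 0x3F) << 5
             | ((word >> 8) & 0xF) << 1)
    return {"imm13": imm13, "rs2": (word >> 20) & 0x1F,
            "rs1": (word >> 15) & 0x1F, "funct3": (word >> 12) & 0x7,
            "opcode": opcode}


beq = encode_b(imm13=8, rs2=0, rs1=0, funct3=0b000)  # beq x0, x0, 8
print(f"{beq:#010x}", decode_b(beq))
```

Keeping the scrambled immediate layout and the even-offset check in one encoder/decoder pair mirrors the role described for `fmt2bits`: individual branch instructions would only supply `imm13`, `rs2`, `rs1`, and `funct3`.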