Skip to content

Conversation

domenicw
Copy link

@domenicw domenicw commented May 7, 2025

This PR upgrades the existing vector support from ~v0.7 to the ratified vector spec v1.0. A lot has changed since v0.7, so this is basically a complete rewrite of the vector setup. I think basically every single vector relevant line has been touched.

I am very sorry for the massive PR. But at this point it is too late to properly split it up into smaller chunks. And I think most (if not all) changes need to go in at the same time to make sense and not break the existing tool. I hope we can get it reviewed regardless and get this PR merged!

This PR includes the following changes:

  • Upgrade RVV support from v0.7 to v1.0
    • Add missing instructions
    • Remove obsolete instructions
    • Upgrade to redefined vector configuration (vtype)
    • Add missing vector CSRs
    • Properly constrain all vector instructions
    • Add support for vset[i]vl[i] interleaving into instruction stream (does not work with simultaneous directed scenarios)
    • Add support for vector embedded, Zvl<vl>b, Zvfhmin, and Zvfh extensions.
  • Remove support for obsolete vector AMO extension
  • Update directed vector load and store scenarios to support all flavours of vector load and store instructions
    • This also includes vector indexed load/stores with fully randomised indexes (can be initialised using vector slides instructions or randomised through LSFR)
    • All access will go into defined memory region and be valid
  • Fix randomisation under Questa/Model sim by setting RNG into unique state when doing shallow copy of instruction
  • Fix stack pointer increment/decrement in trap handler
  • Use CSR name instead of hex address for CSR instructions

The vector support has been designed such that all vector instructions will be valid. Meaning they are constrained that no instruction will throw an illegal instruction exception and no vector load/store instruction will throw a page/access/unaligned exception fault.

Regarding linting, I have run variable-lint and there are a few remaining line length warnings. How strongly is this enforced? There were also other warnings that were unrelated to my changes (some of which I have fixed), so does it need to be variable-lint clean to merge this PR?

I hope we can get this PR merged. We have had very good experience with the changes and the tool in general. It has uncovered a huge amount of bugs during our verification efforts, both in our custom IP and third party IPs. A lot of issues we would never had uncovered if we had to write custom test cases. So I still see a huge amount of value in RISCV-DV. And these changes hope to improve the tool even further.

domenicw added 30 commits May 7, 2025 07:06
@domenicw
Copy link
Author

domenicw commented May 7, 2025

@kgugala not sure who is the main maintainer these days. But you recently had some activity in this repo, so I will tag you here.

@domenicw domenicw force-pushed the domenicw/upstream/cleanup branch 2 times, most recently from 065458a to f4ec9b8 Compare May 7, 2025 11:10
@domenicw domenicw force-pushed the domenicw/upstream/cleanup branch from f4ec9b8 to a9e723b Compare May 7, 2025 11:11
@domenicw
Copy link
Author

domenicw commented May 7, 2025

So the CI is still failing. The issue is the cache actions on the self hosted runners. The generate-code jobs fail to actually cache the assemblies, therefore the dependent run-tests jobs will fail.

From the generate-code log:

12:06:32 | Failed to save: Unable to reserve cache with key cache_riscv_arithmetic_basic_test_uvm_d74469358f3c039425e56f6b637465406b1e3cd95d772afbfbd6e1c9f9b38c36, another job may be creating this cache. More details: This legacy service is shutting down, effective April 15, 2025. Migrate to the new service ASAP. For more information: https://gh.io/gha-cache-sunset

I have upgraded the cache actions to the latest version. This seems to work for the Spike jobs that is run on the GitHub hosted runner, but not on the self-hosted runner. Not sure why exactly, in the logs I can even see that the new version is used.
I have found this GitHub blog post. From there, it might be related with an out of date runner image or if one of the following ENV variable is used:

  • ACTIONS_CACHE_URL
  • ACTIONS_RESULTS_URL
  • ACTIONS_RUNTIME_TOKEN
  • ACTIONS_CACHE_SERVICE_V2

I have no view into the self hosted runner container, so I am unable to debug this further. I would appreciate it if one of the maintainers could have a look at it. My best guess is that it is an issue with the out of date CentOS 8 container.

@zeeshanrafique23
Copy link

zeeshanrafique23 commented Jun 23, 2025

Is someone reviewing this?

@Floadv298
Copy link

Floadv298 commented Jun 27, 2025

wrong PR, please ignore

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants