-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[libc] Add support for MVE to Arm startup code #167338
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@llvm/pr-subscribers-libc Author: Victor Campos (vhscampos) ChangesIn order to have MVE support, the same bits of the CPACR register that enable the floating-point extension must be set. Full diff: https://github.com/llvm/llvm-project/pull/167338.diff 1 Files Affected:
diff --git a/libc/startup/baremetal/arm/start.cpp b/libc/startup/baremetal/arm/start.cpp
index 4740067722022..b22529f214216 100644
--- a/libc/startup/baremetal/arm/start.cpp
+++ b/libc/startup/baremetal/arm/start.cpp
@@ -131,20 +131,23 @@ namespace LIBC_NAMESPACE_DECL {
__arm_wsr("CPSR_c", 0x13); // SVC
#endif
-#ifdef __ARM_FP
-// Enable FPU
-#if __ARM_ARCH_PROFILE == 'M'
+#if __ARM_ARCH_PROFILE == 'M' && \
+ (defined(__ARM_FP) || defined(__ARM_FEATURE_MVE))
+ // Enable FPU and MVE. They can't be enabled independently: the two are
+ // governed by the same bits in CPACR.
// Based on
// https://developer.arm.com/documentation/dui0646/c/Cortex-M7-Peripherals/Floating-Point-Unit/Enabling-the-FPU
- // Set CPACR cp10 and cp11
+ // Set CPACR cp10 and cp11.
auto cpacr = (volatile uint32_t *const)0xE000ED88;
*cpacr |= (0xF << 20);
__dsb(0xF);
__isb(0xF);
-#elif __ARM_ARCH_PROFILE == 'A' || __ARM_ARCH_PROFILE == 'R'
+#elif (__ARM_ARCH_PROFILE == 'A' || __ARM_ARCH_PROFILE == 'R') && \
+ defined(__ARM_FP)
+ // Enable FPU.
// Based on
// https://developer.arm.com/documentation/dui0472/m/Compiler-Coding-Practices/Enabling-NEON-and-FPU-for-bare-metal
- // Set CPACR cp10 and cp11
+ // Set CPACR cp10 and cp11.
uint32_t cpacr = __arm_rsr("p15:0:c1:c0:2");
cpacr |= (0xF << 20);
__arm_wsr("p15:0:c1:c0:2", cpacr);
@@ -154,7 +157,6 @@ namespace LIBC_NAMESPACE_DECL {
__asm__ __volatile__("vmrs %0, FPEXC" : "=r"(fpexc) : :);
fpexc |= (1 << 30);
__asm__ __volatile__("vmsr FPEXC, %0" : : "r"(fpexc) :);
-#endif
#endif
// Perform the equivalent of scatterloading
|
libc/startup/baremetal/arm/start.cpp
Outdated
| // https://developer.arm.com/documentation/dui0646/c/Cortex-M7-Peripherals/Floating-Point-Unit/Enabling-the-FPU | ||
| // Set CPACR cp10 and cp11 | ||
| // Set CPACR cp10 and cp11. | ||
| auto cpacr = (volatile uint32_t *const)0xE000ED88; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Do you mind changing this to a C++ reinterpret_cast?
lntue
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a nit.
statham-arm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In MVE, it's also important to initialize the LTPSIZE field in FPSCR, because on at least some platforms it starts off at 0, but 0 means tail predication is currently active. I think it would be a good idea to set it to 4 ("no tail predication") at startup, if there's a piece of code that knows it's enabling MVE at all.
In order to have MVE support, the same bits of the CPACR register that enable the floating-point extension must be set.
3ca0f6b to
77b94bd
Compare
statham-arm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one remaining comment nit, but otherwise LGTM.
libc/startup/baremetal/arm/start.cpp
Outdated
| __isb(0xF); | ||
| #elif __ARM_ARCH_PROFILE == 'A' || __ARM_ARCH_PROFILE == 'R' | ||
| #if defined(__ARM_FEATURE_MVE) | ||
| // Set FPSCR's LTPSIZE field to 4 to disable low-overhead-loop tail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry to nitpick, but this comment could be misleading. "Disabling" tail predication sounds like making it impossible for people to use it, and we're not doing that. We're only ensuring that it doesn't accidentally trigger at startup.
Perhaps an alternative wording along the lines of "initialize low-overhead-loop tail predication to its neutral state"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I used your suggestion verbatim as I can't come up with something better anyway
In order to have MVE support, the same bits of the CPACR register that enable the floating-point extension must be set.