-
-
Notifications
You must be signed in to change notification settings - Fork 63
Description
Nimbus EL is compiling with the flag -Werror=stack-usage= and this errors on allocas.
In particular this one https://github.com/status-im/nimbus-eth1/actions/runs/15843672213/job/44660951226?pr=3389#step:12:102 from status-im/nimbus-eth1#3389 is about this batch curve coordinates conversion:
In practice, batchAffine in the EVM would only be used for KZG_POINT_PRECOMPILE which is always compiled in (no dead-code elimination) as once you import the eth_evm_precompile file, everything is tagged exportc. And in-practice it is bounded to 16 blobs (or 64 with PeerDAS).
In any case, an optimization, using a destination buffer as temporary storage, is mentioned in the comment to remove the stack allocation which is not materialized in the code:
constantine/constantine/math/elliptic/ec_shortweierstrass_batch_ops.nim
Lines 84 to 108 in d2df757
| func batchAffine*[N: static int, F, G]( | |
| affs: var array[N, EC_ShortW_Aff[F, G]], | |
| projs: array[N, EC_ShortW_Prj[F, G]]) {.inline.} = | |
| batchAffine(affs.asUnchecked(), projs.asUnchecked(), N) | |
| func batchAffine*[F, G]( | |
| affs: ptr UncheckedArray[EC_ShortW_Aff[F, G]], | |
| jacs: ptr UncheckedArray[EC_ShortW_Jac[F, G]], | |
| N: int) {.noInline, tags:[Alloca], meter.} = | |
| # Algorithm: Montgomery's batch inversion | |
| # - Speeding the Pollard and Elliptic Curve Methods of Factorization | |
| # Section 10.3.1 | |
| # Peter L. Montgomery | |
| # https://www.ams.org/journals/mcom/1987-48-177/S0025-5718-1987-0866113-7/S0025-5718-1987-0866113-7.pdf | |
| # - Modern Computer Arithmetic | |
| # Section 2.5.1 Several inversions at once | |
| # Richard P. Brent and Paul Zimmermann | |
| # https://members.loria.fr/PZimmermann/mca/mca-cup-0.5.9.pdf | |
| # To avoid temporaries, we store partial accumulations | |
| # in affs[i].x and whether z == 0 in affs[i].y | |
| var zeroes = allocStackArray(SecretBool, N) | |
| affs[0].x = jacs[0].z | |
| zeroes[0] = affs[0].x.isZero() | |
| affs[0].x.csetOne(zeroes[0]) |
Note that while we can solve the batchAffine mentioned, we cannot do the same for batchInversion:
constantine/constantine/math/arithmetic/finite_fields.nim
Lines 911 to 945 in d2df757
| func batchInv*[F]( | |
| dst: ptr UncheckedArray[F], | |
| elements: ptr UncheckedArray[F], | |
| N: int) {.noInline.} = | |
| ## Batch inversion | |
| ## If an element is 0, the inverse stored will be 0. | |
| var zeros = allocStackArray(SecretBool, N) | |
| zeroMem(zeros, N) | |
| var acc: F | |
| acc.setOne() | |
| for i in 0 ..< N: | |
| # Skip zeros | |
| zeros[i] = elements[i].isZero() | |
| var z = elements[i] | |
| z.csetOne(zeros[i]) | |
| dst[i] = acc | |
| if i != N-1: | |
| acc.prod(acc, z, lazyReduce = true) | |
| else: | |
| acc.prod(acc, z, lazyReduce = false) | |
| acc.inv() | |
| for i in countdown(N-1, 0): | |
| # Extract 1/elemᵢ | |
| dst[i] *= acc | |
| dst[i].csetZero(zeros[i]) | |
| # next iteration | |
| var eli = elements[i] | |
| eli.csetOne(zeros[i]) | |
| acc.prod(acc, eli, lazyReduce = true) |
but I don't think it's called for any EVM precompile.