Skip to content

Commit efad350

Browse files
Merge #906: Use modified divsteps with initial delta=1/2 for constant-time
be0609f Add unit tests for edge cases with delta=1/2 variant of divsteps (Pieter Wuille) cd393ce Optimization: only do 59 hddivsteps per iteration instead of 62 (Pieter Wuille) 277b224 Use modified divsteps with initial delta=1/2 for constant-time (Pieter Wuille) 376ca36 Fix typo in explanation (Pieter Wuille) Pull request description: This updates the divsteps-based modular inverse code to use the modified version which starts with delta=1/2. For variable time, the delta=1 variant is still used as it appears to be faster. See https://github.com/sipa/safegcd-bounds/tree/master/coq and https://medium.com/blockstream/a-formal-proof-of-safegcd-bounds-695e1735a348 for a proof of correctness of this variant. TODO: * [x] Update unit tests to include edge cases specific to this variant I'm still running the Coq proof verification for the 590 bound in non-native mode. It's unclear how long this will take. ACKs for top commit: gmaxwell: ACK be0609f sanket1729: crACK be0609f real-or-random: ACK be0609f careful code review and some testing Tree-SHA512: 2f8f400ba3ac8dbd08622d564c3b3e5ff30768bd0eb559f2c4279c6c813e17cdde71b1c16f05742c5657b5238b4d592b48306f9f47d7dbdb57907e58dd99b47a
2 parents cc2c09e + be0609f commit efad350

File tree

4 files changed

+745
-101
lines changed

4 files changed

+745
-101
lines changed

doc/safegcd_implementation.md

+28-13
Original file line numberDiff line numberDiff line change
@@ -244,8 +244,8 @@ def modinv(M, Mi, x):
244244

245245
This means that in practice we'll always perform a multiple of *N* divsteps. This is not a problem
246246
because once *g=0*, further divsteps do not affect *f*, *g*, *d*, or *e* anymore (only *δ* keeps
247-
increasing). For variable time code such excess iterations will be mostly optimized away in
248-
section 6.
247+
increasing). For variable time code such excess iterations will be mostly optimized away in later
248+
sections.
249249

250250

251251
## 4. Avoiding modulus operations
@@ -519,6 +519,20 @@ computation:
519519
g >>= 1
520520
```
521521

522+
A variant of divsteps with better worst-case performance can be used instead: starting *δ* at
523+
*1/2* instead of *1*. This reduces the worst case number of iterations to *590* for *256*-bit inputs
524+
(which can be shown using convex hull analysis). In this case, the substitution *ζ=-(δ+1/2)*
525+
is used instead to keep the variable integral. Incrementing *δ* by *1* still translates to
526+
decrementing *ζ* by *1*, but negating *δ* now corresponds to going from *ζ* to *-(ζ+1)*, or
527+
*~ζ*. Doing that conditionally based on *c3* is simply:
528+
529+
```python
530+
...
531+
c3 = c1 & c2
532+
zeta ^= c3
533+
...
534+
```
535+
522536
By replacing the loop in `divsteps_n_matrix` with a variant of the divstep code above (extended to
523537
also apply all *f* operations to *u*, *v* and all *g* operations to *q*, *r*), a constant-time version of
524538
`divsteps_n_matrix` is obtained. The full code will be in section 7.
@@ -535,7 +549,8 @@ other cases, it slows down calculations unnecessarily. In this section, we will
535549
faster non-constant time `divsteps_n_matrix` function.
536550

537551
To do so, first consider yet another way of writing the inner loop of divstep operations in
538-
`gcd` from section 1. This decomposition is also explained in the paper in section 8.2.
552+
`gcd` from section 1. This decomposition is also explained in the paper in section 8.2. We use
553+
the original version with initial *δ=1* and *η=-δ* here.
539554

540555
```python
541556
for _ in range(N):
@@ -643,24 +658,24 @@ All together we need the following functions:
643658
section 5, extended to handle *u*, *v*, *q*, *r*:
644659

645660
```python
646-
def divsteps_n_matrix(eta, f, g):
647-
"""Compute eta and transition matrix t after N divsteps (multiplied by 2^N)."""
661+
def divsteps_n_matrix(zeta, f, g):
662+
"""Compute zeta and transition matrix t after N divsteps (multiplied by 2^N)."""
648663
u, v, q, r = 1, 0, 0, 1 # start with identity matrix
649664
for _ in range(N):
650-
c1 = eta >> 63
665+
c1 = zeta >> 63
651666
# Compute x, y, z as conditionally-negated versions of f, u, v.
652667
x, y, z = (f ^ c1) - c1, (u ^ c1) - c1, (v ^ c1) - c1
653668
c2 = -(g & 1)
654669
# Conditionally add x, y, z to g, q, r.
655670
g, q, r = g + (x & c2), q + (y & c2), r + (z & c2)
656671
c1 &= c2 # reusing c1 here for the earlier c3 variable
657-
eta = (eta ^ c1) - (c1 + 1) # inlining the unconditional eta decrement here
672+
zeta = (zeta ^ c1) - 1 # inlining the unconditional zeta decrement here
658673
# Conditionally add g, q, r to f, u, v.
659674
f, u, v = f + (g & c1), u + (q & c1), v + (r & c1)
660675
# When shifting g down, don't shift q, r, as we construct a transition matrix multiplied
661676
# by 2^N. Instead, shift f's coefficients u and v up.
662677
g, u, v = g >> 1, u << 1, v << 1
663-
return eta, (u, v, q, r)
678+
return zeta, (u, v, q, r)
664679
```
665680

666681
- The functions to update *f* and *g*, and *d* and *e*, from section 2 and section 4, with the constant-time
@@ -681,7 +696,7 @@ def update_de(d, e, t, M, Mi):
681696
cd, ce = (u*d + v*e) % 2**N, (q*d + r*e) % 2**N
682697
md -= (Mi*cd + md) % 2**N
683698
me -= (Mi*ce + me) % 2**N
684-
cd, ce = u*d + v*e + Mi*md, q*d + r*e + Mi*me
699+
cd, ce = u*d + v*e + M*md, q*d + r*e + M*me
685700
return cd >> N, ce >> N
686701
```
687702

@@ -702,15 +717,15 @@ def normalize(sign, v, M):
702717
return v
703718
```
704719

705-
- And finally the `modinv` function too, adapted to use *&eta;* instead of *&delta;*, and using the fixed
720+
- And finally the `modinv` function too, adapted to use *&zeta;* instead of *&delta;*, and using the fixed
706721
iteration count from section 5:
707722

708723
```python
709724
def modinv(M, Mi, x):
710725
"""Compute the modular inverse of x mod M, given Mi=1/M mod 2^N."""
711-
eta, f, g, d, e = -1, M, x, 0, 1
712-
for _ in range((724 + N - 1) // N):
713-
eta, t = divsteps_n_matrix(-eta, f % 2**N, g % 2**N)
726+
zeta, f, g, d, e = -1, M, x, 0, 1
727+
for _ in range((590 + N - 1) // N):
728+
zeta, t = divsteps_n_matrix(zeta, f % 2**N, g % 2**N)
714729
f, g = update_fg(f, g, t)
715730
d, e = update_de(d, e, t, M, Mi)
716731
return normalize(f, d, M)

src/modinv32_impl.h

+21-21
Original file line numberDiff line numberDiff line change
@@ -168,17 +168,17 @@ typedef struct {
168168
int32_t u, v, q, r;
169169
} secp256k1_modinv32_trans2x2;
170170

171-
/* Compute the transition matrix and eta for 30 divsteps.
171+
/* Compute the transition matrix and zeta for 30 divsteps.
172172
*
173-
* Input: eta: initial eta
174-
* f0: bottom limb of initial f
175-
* g0: bottom limb of initial g
173+
* Input: zeta: initial zeta
174+
* f0: bottom limb of initial f
175+
* g0: bottom limb of initial g
176176
* Output: t: transition matrix
177-
* Return: final eta
177+
* Return: final zeta
178178
*
179179
* Implements the divsteps_n_matrix function from the explanation.
180180
*/
181-
static int32_t secp256k1_modinv32_divsteps_30(int32_t eta, uint32_t f0, uint32_t g0, secp256k1_modinv32_trans2x2 *t) {
181+
static int32_t secp256k1_modinv32_divsteps_30(int32_t zeta, uint32_t f0, uint32_t g0, secp256k1_modinv32_trans2x2 *t) {
182182
/* u,v,q,r are the elements of the transformation matrix being built up,
183183
* starting with the identity matrix. Semantically they are signed integers
184184
* in range [-2^30,2^30], but here represented as unsigned mod 2^32. This
@@ -193,8 +193,8 @@ static int32_t secp256k1_modinv32_divsteps_30(int32_t eta, uint32_t f0, uint32_t
193193
VERIFY_CHECK((f & 1) == 1); /* f must always be odd */
194194
VERIFY_CHECK((u * f0 + v * g0) == f << i);
195195
VERIFY_CHECK((q * f0 + r * g0) == g << i);
196-
/* Compute conditional masks for (eta < 0) and for (g & 1). */
197-
c1 = eta >> 31;
196+
/* Compute conditional masks for (zeta < 0) and for (g & 1). */
197+
c1 = zeta >> 31;
198198
c2 = -(g & 1);
199199
/* Compute x,y,z, conditionally negated versions of f,u,v. */
200200
x = (f ^ c1) - c1;
@@ -204,10 +204,10 @@ static int32_t secp256k1_modinv32_divsteps_30(int32_t eta, uint32_t f0, uint32_t
204204
g += x & c2;
205205
q += y & c2;
206206
r += z & c2;
207-
/* In what follows, c1 is a condition mask for (eta < 0) and (g & 1). */
207+
/* In what follows, c1 is a condition mask for (zeta < 0) and (g & 1). */
208208
c1 &= c2;
209-
/* Conditionally negate eta, and unconditionally subtract 1. */
210-
eta = (eta ^ c1) - (c1 + 1);
209+
/* Conditionally change zeta into -zeta-2 or zeta-1. */
210+
zeta = (zeta ^ c1) - 1;
211211
/* Conditionally add g,q,r to f,u,v. */
212212
f += g & c1;
213213
u += q & c1;
@@ -216,8 +216,8 @@ static int32_t secp256k1_modinv32_divsteps_30(int32_t eta, uint32_t f0, uint32_t
216216
g >>= 1;
217217
u <<= 1;
218218
v <<= 1;
219-
/* Bounds on eta that follow from the bounds on iteration count (max 25*30 divsteps). */
220-
VERIFY_CHECK(eta >= -751 && eta <= 751);
219+
/* Bounds on zeta that follow from the bounds on iteration count (max 20*30 divsteps). */
220+
VERIFY_CHECK(zeta >= -601 && zeta <= 601);
221221
}
222222
/* Return data in t and return value. */
223223
t->u = (int32_t)u;
@@ -229,7 +229,7 @@ static int32_t secp256k1_modinv32_divsteps_30(int32_t eta, uint32_t f0, uint32_t
229229
* will be divided out again). As each divstep's individual matrix has determinant 2, the
230230
* aggregate of 30 of them will have determinant 2^30. */
231231
VERIFY_CHECK((int64_t)t->u * t->r - (int64_t)t->v * t->q == ((int64_t)1) << 30);
232-
return eta;
232+
return zeta;
233233
}
234234

235235
/* Compute the transition matrix and eta for 30 divsteps (variable time).
@@ -453,19 +453,19 @@ static void secp256k1_modinv32_update_fg_30_var(int len, secp256k1_modinv32_sign
453453

454454
/* Compute the inverse of x modulo modinfo->modulus, and replace x with it (constant time in x). */
455455
static void secp256k1_modinv32(secp256k1_modinv32_signed30 *x, const secp256k1_modinv32_modinfo *modinfo) {
456-
/* Start with d=0, e=1, f=modulus, g=x, eta=-1. */
456+
/* Start with d=0, e=1, f=modulus, g=x, zeta=-1. */
457457
secp256k1_modinv32_signed30 d = {{0}};
458458
secp256k1_modinv32_signed30 e = {{1}};
459459
secp256k1_modinv32_signed30 f = modinfo->modulus;
460460
secp256k1_modinv32_signed30 g = *x;
461461
int i;
462-
int32_t eta = -1;
462+
int32_t zeta = -1; /* zeta = -(delta+1/2); delta is initially 1/2. */
463463

464-
/* Do 25 iterations of 30 divsteps each = 750 divsteps. 724 suffices for 256-bit inputs. */
465-
for (i = 0; i < 25; ++i) {
466-
/* Compute transition matrix and new eta after 30 divsteps. */
464+
/* Do 20 iterations of 30 divsteps each = 600 divsteps. 590 suffices for 256-bit inputs. */
465+
for (i = 0; i < 20; ++i) {
466+
/* Compute transition matrix and new zeta after 30 divsteps. */
467467
secp256k1_modinv32_trans2x2 t;
468-
eta = secp256k1_modinv32_divsteps_30(eta, f.v[0], g.v[0], &t);
468+
zeta = secp256k1_modinv32_divsteps_30(zeta, f.v[0], g.v[0], &t);
469469
/* Update d,e using that transition matrix. */
470470
secp256k1_modinv32_update_de_30(&d, &e, &t, modinfo);
471471
/* Update f,g using that transition matrix. */
@@ -515,7 +515,7 @@ static void secp256k1_modinv32_var(secp256k1_modinv32_signed30 *x, const secp256
515515
int i = 0;
516516
#endif
517517
int j, len = 9;
518-
int32_t eta = -1;
518+
int32_t eta = -1; /* eta = -delta; delta is initially 1 (faster for the variable-time code) */
519519
int32_t cond, fn, gn;
520520

521521
/* Do iterations of 30 divsteps each until g=0. */

src/modinv64_impl.h

+33-29
Original file line numberDiff line numberDiff line change
@@ -145,33 +145,35 @@ typedef struct {
145145
int64_t u, v, q, r;
146146
} secp256k1_modinv64_trans2x2;
147147

148-
/* Compute the transition matrix and eta for 62 divsteps.
148+
/* Compute the transition matrix and eta for 59 divsteps (where zeta=-(delta+1/2)).
149+
* Note that the transformation matrix is scaled by 2^62 and not 2^59.
149150
*
150-
* Input: eta: initial eta
151-
* f0: bottom limb of initial f
152-
* g0: bottom limb of initial g
151+
* Input: zeta: initial zeta
152+
* f0: bottom limb of initial f
153+
* g0: bottom limb of initial g
153154
* Output: t: transition matrix
154-
* Return: final eta
155+
* Return: final zeta
155156
*
156157
* Implements the divsteps_n_matrix function from the explanation.
157158
*/
158-
static int64_t secp256k1_modinv64_divsteps_62(int64_t eta, uint64_t f0, uint64_t g0, secp256k1_modinv64_trans2x2 *t) {
159+
static int64_t secp256k1_modinv64_divsteps_59(int64_t zeta, uint64_t f0, uint64_t g0, secp256k1_modinv64_trans2x2 *t) {
159160
/* u,v,q,r are the elements of the transformation matrix being built up,
160-
* starting with the identity matrix. Semantically they are signed integers
161+
* starting with the identity matrix times 8 (because the caller expects
162+
* a result scaled by 2^62). Semantically they are signed integers
161163
* in range [-2^62,2^62], but here represented as unsigned mod 2^64. This
162164
* permits left shifting (which is UB for negative numbers). The range
163165
* being inside [-2^63,2^63) means that casting to signed works correctly.
164166
*/
165-
uint64_t u = 1, v = 0, q = 0, r = 1;
167+
uint64_t u = 8, v = 0, q = 0, r = 8;
166168
uint64_t c1, c2, f = f0, g = g0, x, y, z;
167169
int i;
168170

169-
for (i = 0; i < 62; ++i) {
171+
for (i = 3; i < 62; ++i) {
170172
VERIFY_CHECK((f & 1) == 1); /* f must always be odd */
171173
VERIFY_CHECK((u * f0 + v * g0) == f << i);
172174
VERIFY_CHECK((q * f0 + r * g0) == g << i);
173-
/* Compute conditional masks for (eta < 0) and for (g & 1). */
174-
c1 = eta >> 63;
175+
/* Compute conditional masks for (zeta < 0) and for (g & 1). */
176+
c1 = zeta >> 63;
175177
c2 = -(g & 1);
176178
/* Compute x,y,z, conditionally negated versions of f,u,v. */
177179
x = (f ^ c1) - c1;
@@ -181,10 +183,10 @@ static int64_t secp256k1_modinv64_divsteps_62(int64_t eta, uint64_t f0, uint64_t
181183
g += x & c2;
182184
q += y & c2;
183185
r += z & c2;
184-
/* In what follows, c1 is a condition mask for (eta < 0) and (g & 1). */
186+
/* In what follows, c1 is a condition mask for (zeta < 0) and (g & 1). */
185187
c1 &= c2;
186-
/* Conditionally negate eta, and unconditionally subtract 1. */
187-
eta = (eta ^ c1) - (c1 + 1);
188+
/* Conditionally change zeta into -zeta-2 or zeta-1. */
189+
zeta = (zeta ^ c1) - 1;
188190
/* Conditionally add g,q,r to f,u,v. */
189191
f += g & c1;
190192
u += q & c1;
@@ -193,8 +195,8 @@ static int64_t secp256k1_modinv64_divsteps_62(int64_t eta, uint64_t f0, uint64_t
193195
g >>= 1;
194196
u <<= 1;
195197
v <<= 1;
196-
/* Bounds on eta that follow from the bounds on iteration count (max 12*62 divsteps). */
197-
VERIFY_CHECK(eta >= -745 && eta <= 745);
198+
/* Bounds on zeta that follow from the bounds on iteration count (max 10*59 divsteps). */
199+
VERIFY_CHECK(zeta >= -591 && zeta <= 591);
198200
}
199201
/* Return data in t and return value. */
200202
t->u = (int64_t)u;
@@ -204,12 +206,14 @@ static int64_t secp256k1_modinv64_divsteps_62(int64_t eta, uint64_t f0, uint64_t
204206
/* The determinant of t must be a power of two. This guarantees that multiplication with t
205207
* does not change the gcd of f and g, apart from adding a power-of-2 factor to it (which
206208
* will be divided out again). As each divstep's individual matrix has determinant 2, the
207-
* aggregate of 62 of them will have determinant 2^62. */
208-
VERIFY_CHECK((int128_t)t->u * t->r - (int128_t)t->v * t->q == ((int128_t)1) << 62);
209-
return eta;
209+
* aggregate of 59 of them will have determinant 2^59. Multiplying with the initial
210+
* 8*identity (which has determinant 2^6) means the overall outputs has determinant
211+
* 2^65. */
212+
VERIFY_CHECK((int128_t)t->u * t->r - (int128_t)t->v * t->q == ((int128_t)1) << 65);
213+
return zeta;
210214
}
211215

212-
/* Compute the transition matrix and eta for 62 divsteps (variable time).
216+
/* Compute the transition matrix and eta for 62 divsteps (variable time, eta=-delta).
213217
*
214218
* Input: eta: initial eta
215219
* f0: bottom limb of initial f
@@ -290,7 +294,7 @@ static int64_t secp256k1_modinv64_divsteps_62_var(int64_t eta, uint64_t f0, uint
290294
return eta;
291295
}
292296

293-
/* Compute (t/2^62) * [d, e] mod modulus, where t is a transition matrix for 62 divsteps.
297+
/* Compute (t/2^62) * [d, e] mod modulus, where t is a transition matrix scaled by 2^62.
294298
*
295299
* On input and output, d and e are in range (-2*modulus,modulus). All output limbs will be in range
296300
* (-2^62,2^62).
@@ -376,7 +380,7 @@ static void secp256k1_modinv64_update_de_62(secp256k1_modinv64_signed62 *d, secp
376380
#endif
377381
}
378382

379-
/* Compute (t/2^62) * [f, g], where t is a transition matrix for 62 divsteps.
383+
/* Compute (t/2^62) * [f, g], where t is a transition matrix scaled by 2^62.
380384
*
381385
* This implements the update_fg function from the explanation.
382386
*/
@@ -455,19 +459,19 @@ static void secp256k1_modinv64_update_fg_62_var(int len, secp256k1_modinv64_sign
455459

456460
/* Compute the inverse of x modulo modinfo->modulus, and replace x with it (constant time in x). */
457461
static void secp256k1_modinv64(secp256k1_modinv64_signed62 *x, const secp256k1_modinv64_modinfo *modinfo) {
458-
/* Start with d=0, e=1, f=modulus, g=x, eta=-1. */
462+
/* Start with d=0, e=1, f=modulus, g=x, zeta=-1. */
459463
secp256k1_modinv64_signed62 d = {{0, 0, 0, 0, 0}};
460464
secp256k1_modinv64_signed62 e = {{1, 0, 0, 0, 0}};
461465
secp256k1_modinv64_signed62 f = modinfo->modulus;
462466
secp256k1_modinv64_signed62 g = *x;
463467
int i;
464-
int64_t eta = -1;
468+
int64_t zeta = -1; /* zeta = -(delta+1/2); delta starts at 1/2. */
465469

466-
/* Do 12 iterations of 62 divsteps each = 744 divsteps. 724 suffices for 256-bit inputs. */
467-
for (i = 0; i < 12; ++i) {
468-
/* Compute transition matrix and new eta after 62 divsteps. */
470+
/* Do 10 iterations of 59 divsteps each = 590 divsteps. This suffices for 256-bit inputs. */
471+
for (i = 0; i < 10; ++i) {
472+
/* Compute transition matrix and new zeta after 59 divsteps. */
469473
secp256k1_modinv64_trans2x2 t;
470-
eta = secp256k1_modinv64_divsteps_62(eta, f.v[0], g.v[0], &t);
474+
zeta = secp256k1_modinv64_divsteps_59(zeta, f.v[0], g.v[0], &t);
471475
/* Update d,e using that transition matrix. */
472476
secp256k1_modinv64_update_de_62(&d, &e, &t, modinfo);
473477
/* Update f,g using that transition matrix. */
@@ -517,7 +521,7 @@ static void secp256k1_modinv64_var(secp256k1_modinv64_signed62 *x, const secp256
517521
int i = 0;
518522
#endif
519523
int j, len = 5;
520-
int64_t eta = -1;
524+
int64_t eta = -1; /* eta = -delta; delta is initially 1 */
521525
int64_t cond, fn, gn;
522526

523527
/* Do iterations of 62 divsteps each until g=0. */

0 commit comments

Comments
 (0)