Generate better start values for nth-root by czurnieden · Pull Request #598 · libtom/libtommath

czurnieden · 2026-04-24T00:31:53Z

This is the implementation I talked about in DCIT/perl-CryptX#123 with @danaj
The runtime is up to standard now.
Needs some work (there is a TODO list at the top of mp_root_n.c, but mostly chores).

danaj · 2026-04-24T04:34:59Z

I'm getting a failure with a = 2147483645^23 b = 23. The start value t2 = 2147483634 is too low.
[MacOS 26, M1, 64-bit, gcc-15]

czurnieden · 2026-04-24T13:44:14Z

I had the angst-allowance at the wrong place. Should be added to the log(a).
Worst thing: had it right before ;-)

The number 2147483645^23 is well chosen: log2(2147483645^23) ~ 712.99999995364 so we got 712.99999995364/23 ~ 30.99999999798, which gets got caught in all of the rounding errors. Adding 1 (one) to o the log gives 713.99999995364/23 ~ 31.043478258852, resulting in a roughly 10% higher start-value in theory and is also larger than the actual root. 1/2should do it, too

I also found an nasty little bug if root^base == input, see #599

Thanks for taking a look!

danaj · 2026-04-24T15:53:38Z

9999999999900000000000000 ^ 1/53 is coming into the loop with t2 = 2. We used to have the comment "Start value must be larger than root" and this isn't true since 2 is the result we're looking for. It doesn't seem to be working right, but with t2=3 it's fine. Sorry for not triaging past that.

czurnieden · 2026-04-24T16:11:51Z

@danaj

Sorry for not triaging past that

Nah, no reason, I wrote most of it, so it's all on me ;-)

That seems to have been fixed by #599. Will put it in this PR now.

czurnieden · 2026-04-24T22:08:21Z

Should work now. (famous last words)
Which leads to the question: worth it?

- Implementation of a fixed point exp2 function to get better start-values - fixed treatment of perfect powers - fixed 0^(1/x) == 0 with x != 0 - added test for b == 0 (division by zero)

czurnieden · 2026-04-25T20:37:58Z

Could it be that I was a little bit behind? ;-)

czurnieden · 2026-05-22T00:22:13Z

This branch was a bit of a quick hack. Seemed as if the fixed point exp2 function was not up to par. Fixed that with a thoroughly tested new one.

About 8 decimal digits over the whole domain [0,1[ , even near one.

You don't not have to believe me, you can try for yourself, put some code below for your convenience.

A problem^Wnuisance is the variable precision of s_mp_fp_log which is the reason for some oddities in the actual code. Advantage: it is a Q0.32 using uint32_t which is always available since 8-bit is not supported anymore.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define MP_FRACBITSQ032 (sizeof(uint32_t) *(size_t)CHAR_BIT)

static uint32_t mulhi_u32(uint32_t a, uint32_t b)
{
   uint32_t a_high, a_low, b_high, b_low, r_high, r_mida, r_midb, r_low, carry;
   /* We have uint64_t */
#if ( MP_DIGIT_BIT > 16)
   return (((uint32_t)((uint64_t)a*b)) >> MP_FRACBITSQ032);
#else
   a_high = a >> 16;
   a_low = a & 0xFFFF;
   b_high = b >> 16;
   b_low = b & 0xFFFF;

   r_high = a_high * b_high;
   r_mida = a_high * b_low;
   r_midb = a_low * b_high;
   r_low = a_low * b_low;

   carry = ((r_mida & 0xFFFF) + (r_midb & 0xFFFF) + (r_low >> 16)) >> 16;
   r_low = r_high + (r_mida >> 16) + (r_midb >> 16) + carry;
   return r_low;
 #endif
}


static uint32_t fixed_exp2_u32(uint32_t x)
{
   /* Remez coefficients for 2^x - 1 over [0, 1[ scaled to Q32 */
   uint32_t r;
   if (x == 0u) {
      return 0u;
   }

   r = 0xe5867;
   r = mulhi_u32(r, x) + 0x512969;
   r = mulhi_u32(r, x) + 0x27ab6f0;
   r = mulhi_u32(r, x) + 0xe33f3a6;
   r = mulhi_u32(r, x) + 0x3d7fbfd4;
   r = mulhi_u32(r, x) + 0xb17213ac;
   r = mulhi_u32(r, x) + 0xb;
   return r;
}


static double f2f(uint32_t a)
{
   /* 2^31 for Q1.31 4294967296*/
   return a / 4294967296.;
}

#include <math.h>
int main(void)
{
   double a, res, ref, aerr, rerr, maxabserr = 0.0, maxrelerr = 0.0;
   uint32_t x, start, end;

   start = 0x00000000;
   end   = 0xFFFFFFFF;
   fprintf(stderr,"Testing fixed_exp2 with inputs in [%.20f, %.20f)\n",
           f2f(start), f2f(end));

   printf("# input                abs. err.              rel. err.\n");
#ifdef FORGNUPLOT
   for (x = start; x < end; x+=0xffff) {
#else
   for (x = start; x < end; x++) {
#endif
      a = f2f(x);
      ref = exp2(a);
      res = f2f(fixed_exp2_u32(x)) + 1.0;
      aerr =  res - ref;
      rerr = fabs(aerr)/ref;
#ifdef FULLOUTPUT
      printf("%.15f %.15f  %.15f %.15f %.15f\n", a, res, ref, aerr, rerr);
#elif ((defined ERRORSONLY) || (defined FORGNUPLOT))
      printf("%.15f  %.15f %.15f\n", a, aerr, rerr);
#endif
      if (fabs(aerr) > fabs(maxabserr)) {
         maxabserr = aerr;
      }
      if (rerr > maxrelerr) {
         maxrelerr = rerr;
      }
   }
   fprintf(stderr, "max. abs. err = %g, max. rel. err. %g\n", maxabserr, maxrelerr);
   return EXIT_SUCCESS;
}

czurnieden added the feedback required label Apr 24, 2026

czurnieden force-pushed the faster_nroot branch from c4a2ee1 to 02af6fb Compare April 25, 2026 20:02

Refinement of and bugfixes in mp_root_n:

2a5d8ed

- Implementation of a fixed point exp2 function to get better start-values - fixed treatment of perfect powers - fixed 0^(1/x) == 0 with x != 0 - added test for b == 0 (division by zero)

czurnieden force-pushed the faster_nroot branch from 02af6fb to 2a5d8ed Compare April 25, 2026 20:27

Fixed rebase

a34e771

Christoph Zurnieden added 2 commits May 22, 2026 00:39

replaced exp2 function with a working one

da442d9

Cleaned up

8efa54a

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generate better start values for nth-root#598

Generate better start values for nth-root#598
czurnieden wants to merge 4 commits into
libtom:developfrom
czurnieden:faster_nroot

czurnieden commented Apr 24, 2026

Uh oh!

danaj commented Apr 24, 2026 •

edited

Loading

Uh oh!

czurnieden commented Apr 24, 2026

Uh oh!

danaj commented Apr 24, 2026

Uh oh!

czurnieden commented Apr 24, 2026

Uh oh!

czurnieden commented Apr 24, 2026

Uh oh!

czurnieden commented Apr 25, 2026

Uh oh!

czurnieden commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

czurnieden commented Apr 24, 2026

Uh oh!

danaj commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

czurnieden commented Apr 24, 2026

Uh oh!

danaj commented Apr 24, 2026

Uh oh!

czurnieden commented Apr 24, 2026

Uh oh!

czurnieden commented Apr 24, 2026

Uh oh!

czurnieden commented Apr 25, 2026

Uh oh!

czurnieden commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

danaj commented Apr 24, 2026 •

edited

Loading