divmod10() : a fast replacement for /10 and %10 (unsigned)

Hi robtillaart,

I wondered why the correction term (and therefore if-then-else) is needed. (This slows down performance.) I tried to find a precise approximation and came up with this:

// for in between 0x0 .. 0x7fffffff
void divmod10(uint32_t in, uint32_t &div, uint32_t &mod)
{
  uint32_t x=in>>1; // div = in/10 <==> div = ((in/2)/5) 
  uint32_t q=x;
  q = x + (q >> 8);
  q = x + (q >> 8);
  x++;
  q = x + (q >> 8);
  q = x + (q >> 8);
  q = 3*q;
  div = (q + (q >> 4)) >> 4;
  mod = in - ((div << 3) + (div << 1));
}

On my Duemilanove it's faster then the divu10() code from the book Hackers Delight you gave. Maybe because no if-the-else is needed or shift by 8 is good for the optimizer. But I have no insight into the resulting assembler code.

Its a somewhat "weird" solution, again.