Suggested replacement for Print::printNumber()

The following function is smaller and faster than the current version of Print::printNumber(). It uses one more byte of RAM buffer, 33 bytes vs 32 bytes for the current version. It uses about 100 fewer bytes of flash.

It is faster since it replaces the mod operator with a multiply operator. It uses one write for the entire formatted string. This speeds up the SD library a great deal.

I have achieved an SD formatted write rate of over 18,000 bytes/second for println with four digit numbers. I replaced FILE_WRITE with (O_WRITE | O_CREAT) in the SD.open() call.

The current SD library and Print only achieve 77 bytes/second when files are opened with FILE_WRITE. The current Print/SD library achieves about half the above rate when (O_WRITE | O_CREAT) is used in the open call.

Here is the suggested replacement:

void Print::printNumber(unsigned long n, uint8_t base) {
  char buf[8 * sizeof(long) + 1]; // Assumes 8-bit chars.
  char *str = &buf[sizeof(buf) - 1];

  *str = '\0';

  do {
    uint32_t m = n;
    n /= base;
    char c = m - base * n;
    *--str = c < 10 ? c + '0' : c + 'A' - 10;
  } while(n);

  write(str);
}

Again an interesting post;

Think the compiler optimizes the 'A' - 10; to '7' :slight_smile:

but there is still an optimization to be found in
char buf[8 * sizeof(long) + 1]; // Assumes 8-bit chars.
The array-size could be base dependant, NB this size is only needed when base = 2

base = 2 => 32 + 1 char.
base = 10 => 10 + 1 char.

The compiler does optimize 'A' - 10.

I used a buf size of 33 since base 2 is defined as the symbol BIN in Print.h.

The biggest problem for the current function and this function is a call with base = 1. They both write over memory. This function is never called with base = 0.

Maybe the end condition should be while (n && str != buf). This would produce 32 zero characters for this error. A base greater than 36 doesn't make sense but just produces a strange string.

See also - http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1293042719/2

Ah, now I see why you understand.

I think the base 1 case just needs to avoid crashing and give an indication of a bug.

Printing the string "00000000000000000000000000" is a big clue.

On compiler optimization. You need to use large sketches to see if a change is worth doing. The compiler does a pretty global optimization of a class. Here are some examples for Print.

I used a large sketch with a lot of print/println calls to estimate the 100 byte savings for the new printNumber(). The results were old function 7492 bytes new function 7388 bytes. This was consistent with several other large sketches.

Things are not so clear with a very simple sketch. For this sketch:

void setup() {
 Serial.begin(9600);
 Serial.print(1234567UL);
}
void loop() {}

You get old 1846, new 1696 for a savings of 150 bytes.

If you change the "UL" to "L"

void setup() {
 Serial.begin(9600);
 Serial.print(1234567L);
}
void loop() {}

You get old 1816, new 1774 for a savings of 42 bytes.

I have not dumped the assembly listing to see why.

What I have learned is that simple examples are dangerous for design decisions.

I think the base 1 case just needs to avoid crashing and give an indication of a bug.

So a small check preventing invalid bases (as proposed in the other thread) makes sense. The error message is TBD.

if ((base <2) || (base > 36))   // base can be too big too
{
  print("Base Error: );
  print(base);
  return;
}

How often is the base a variable instead of constant? Have you run across any situations where the base is a variable?

How often is the base a variable instead of constant? Have you run across any situations where the base is a variable?

Everytime printNumber is called, the snippet I posted was intended to be included in printNumber, I should have made that more explicit. :slight_smile:

void Print::printNumber(unsigned long n, uint8_t base) {

  if ((base <2) || (base > 36))   // base to small or to big 
  {
    write("BaseBug");
    return;
  }

  char buf[8 * sizeof(long) + 1]; // Assumes 8-bit chars.
  char *str = &buf[sizeof(buf) - 1];

  *str = '\0';

  do {
    uint32_t m = n;
    n /= base;
    char c = m - base * n;
    *--str = c < 10 ? c + '0' : c + 'A' - 10;
  } while(n);

  write(str);
}

I'll phrase my question differently...

Is printNumber ususally called this way...

  printNumber( value, 10 );

...or this way...

  uint8_t base;

  base = 10;

  printNumber( value, base );

...or, is it an internal function that's called by something else (like a print method)?

Print is a core class of Arduino and every class derived from it uses this code. printNumber is a private member of this class, so it will not be called directly from usercode.

To be found at ~\arduino-0021\hardware\arduino\cores\arduino

 printNumber( value, 10 );
 uint8_t base = 10;
 printNumber( value, base );

Both calls will lead to an internal var within printnumber called base that can be tested.

Got it. Thanks.

    n /= base;
    char c = m - base * n;

Grr. Pisses me off that the division routine almost certainly already computed the remainder, and then threw it away, requiring us to compute it again!

I have not dumped the assembly listing to see why.

I ran into this recently. It's because signed divide is implemented as a sign fixup function that calls unsigned divide. If your sketch doesn't use the signed divide anywhere else, your binary will be left with only the unsigned divide function (additionally saving the size of the sign-fixup wrapper.)

Grr. Pisses me off that the division routine almost certainly already computed the remainder, and then threw it away, requiring us to compute it again!

maybe the compiler optimizes it?

#include <stdlib.h>

void setup() {
}

void loop() {
  unsigned long a, b, q, r;
  a= 12345;
  b = 1000;
 ldiv_t d = ldiv(a, b);
  q = d.quot;
  r = d.rem;
}