Execution Time of Serial.print and Serial.println

I made a test of the execution time of Serial.print and Serial.println with this code and an Arduino Nano V3.
The code uses is this:

const int TestOutPin = PD6;
int cnt = 0;
int data = 123;

void setup() {
  pinMode(TestOutPin, OUTPUT);
  digitalWrite(TestOutPin,LOW);

  Serial.begin(19200);
  delay(100);
  Serial.println("start");
  cnt = 100;
  };

void loop() {
  if (cnt < 1000) {
    digitalWrite(TestOutPin,HIGH);
    Serial.print( cnt );           //Execution time of Serial print or println with chars or integers.
                                    //It is measured by oscilloscope on TestOutPin.
    digitalWrite(TestOutPin,LOW);}
  else {
    cnt = 2000; };
  cnt ++;
  delay(10);
  };

The time is measured using an oscilloscope on the test pin. The delay in the loop should slow down the transmission so the serial ring buffer used by Serial should not become full at any time. Therefore the time measured should be for calculations and the transfer of characters to the ring buffer only.

The results can be found here:

int data = 123;
int cnt value 100 to 999

Serial.print( data ); //execution time 27 us
Serial.println( data); //execution time 45 us
Serial.print("A"); //execution time 14 us
Serial.println("A") //execution time 30 us
Serial.print("ABC"); //execution time 27 us
Serial.println("ABC"); //execution time 42 us
Serial.print("\t"); //execution time 14 us
Serial.print( cnt ); //execution time 140 us;

Sometimes, the time is seen 6 us longer and it might be an interrupt inside.

For an application, I got some time critical code, and for debug purposes I like to occasionally output an integer to the serial monitor without too much execution time.

I just like this execution time to be faster and in particular for Serial.print( cnt ).

It seems that the conversion of an integer to a decimal ascii string representation cause a significant calculation time of about 110 us.

Is there some way to do that with less execution time?

Well, I actually found one work around using print of Hex instead of digital. Then the results are like this (old results included:

int data = 123;
int cnt value 256 to 999

Serial.print( data ); //execution time 27 us
Serial.println( data); //execution time 45 us
Serial.print("A"); //execution time 14 us
Serial.println("A"); //execution time 30 us
Serial.print("ABC"); //execution time 27 us
Serial.println("ABC"); //execution time 42 us
Serial.print("\t"); //execution time 14 us
Serial.print( cnt ); //execution time 140 us;
Serial.print( cnt, HEX )//execution time 36 us;
Serial.println( cnt, HEX )//execution time 54 us;

Occationally, the time is seen 6 us longer.

Note that the variable "data" results in 3 characters like the "ABC" and the numbers are quite the same.

Don't know how You draw the conclusion abot conversion time. It looks wrong to me. It's very easily done.

Serial.printing in loop.... know the size of the output buffer. Comapre with baudrate and never print more characters per second then the baudrate manages to transfer.
If You run a time critical task then boost the baudrate to 115200.

Thanks. I tried to change from 9600 baudrate to 19200, and it did not change this execution time.

All this measures is the time it takes to put the data in the serial buffer. The digitalWrite commands probably take longer.

You're also not accounting for the Timer0 interrupt running in the background which might be interrupting your timing window.

I'm guessing that's the one that got interrupted.

I have tried for most of the tests to make a 3 character output. I have seen, that the time do depend on how many characters that is send.

When you output a fixed decimal number, the compiler seems smart and remembers the decimal representation to next output. When the number changes, a conversion is needed every time.

It seems to me, that conversion to a decimal number actually can make a difference, and it includes some divisions and multiplications by 10. Therefore the conversion to Hex makes the time shorter.

That's about right. There are some benchmarks here: avr-libc: Benchmarks
that say ltoa() takes about 1600 cycles - 100us on a 16Mhz AVR.
(Note that Serial.print() promotes all of its numeric arguments to long)

You could presumably cut the time in about half if cnt is actually only a 16bit int, with something like:

  itoa(ctn, buffer, 10);
  Serial.print(buffer);

Every once in a while, discussions of "faster decimal conversion" come up, ie divmod10() : a fast replacement for /10 and %10 (unsigned) - taking into account that you're not doing an arbitrary division, but specifically a division-by-10. But they seem to fall into obscurity pretty quickly.

Thank you very much. Yes, the conversion seems significantly faster this way - only about 20 us. The function works with negative integers as well. The result summery is then:

int data = 123;
int cnt value 256 to 999

Serial.print( data ); //execution time 27 us
Serial.println( data); //execution time 45 us
Serial.print("A"); //execution time 14 us
Serial.println("A"); //execution time 30 us
Serial.print("ABC"); //execution time 27 us
Serial.println("ABC"); //execution time 42 us
Serial.print("\t"); //execution time 14 us
Serial.print( cnt ); //execution time 140 us;
Serial.print( cnt, HEX ); //execution time 36 us;
Serial.println( cnt, HEX ); //execution time 54 us;
itoa(cnt, buffer, 10); Serial.print( buffer ); //execution time 62 us

Occationally, the time is seen 6 us longer.

Conversion is one issue, but handling of the serial ring buffer is another.

When I compare printing of "ABC" with "ABCD", then the later cause about 6.5 us more execution time (27.5 us to 34 us). It just seems to me, that handling of the pointers of the ring buffer etc. should not cause that much extra execution time for one character extra.
How well are these print functions optimized?

Many years ago I did some coding of other kinds of microprocessors in assembler language, so it is my only reference to ask this question. I got no skills yet to go deeper into the code for these Serial functions.

@backflip

the (simplified) implementation of println() has two calls internally and ditto overhead.

println(char * str)
{
   print(str);
   print("\n");
}

Compare these three

Serial.print("ABC");
Serial.println("ABC");
Serial.print("ABC\n");

How well are these print functions optimized?

Well enough for many applications however there is definitely room for improvement.

(includes e.g. 64 bit printing and Scientific and Engineering format

divmod10 and some of its "family" can be found here

I have tried that now, and yes, Serial.print("ABC\n") is about 8.5 us faster than using println. I include an update of the measured execution times below.

It seems to me, that the code, that you reference is made to create the characters to be printed in a faster way. However, the code for handling the ring buffer and hardware serial communication have not been a subject yet.

I have tried to look at some of the code in libraries, but my knowledge of C++ is still lacking. For instance, it is still very hard for me to know what code lines like this do:

for (struct MultiTarget *t = targets; t < targets+tCount; ++t) {
  // the simple case is if we match, deal with that first.
  if (c == t->str[t->index]) {
    if (++t->index == t->len)
      return t - targets;

And here ere the updated results:

int data = 123;
int cnt; //values are 256 to 999
char buffer[8];

Serial.print( data ); //execution time 27 us
Serial.println( data); //execution time 45 us
Serial.print("A"); //execution time 14 us
Serial.println("A"); //execution time 30 us
Serial.print("ABC"); //execution time 27.5 us
Serial.print("ABC\n"); //execution time 34 us
Serial.print("ABCD"); //execution time 34 us
Serial.println("ABC"); //execution time 42.5 us
Serial.print("\t"); //execution time 14 us
Serial.print( cnt ); //execution time 140 us;
Serial.print( cnt, HEX )//execution time 36 us;
Serial.println( cnt, HEX )//execution time 54 us;
itoa(cnt, buffer, 10); Serial.print( buffer ); //execution time 62 us

Occationally, the time is seen 6 us longer.

In general, you should assume that none of the arduino core functions are “very well optimized.”
The emphasis is on usability and “good enough”, perhaps with some "modularity and portability" thrown in.

Serial printing is hardly ever expected to be particularly efficient.

6.5us for a single character does seem excessive. But... "iterate through the string calling the C++ virtual function write() for each character..." (Virtual functions are one of those things that might qualify as "unexpectedly expensive" on an AVR.)
Start looking at the code (both source and disassembly) if you want the details.

Thank you for your answer.

I have tried to look for the code, but I am not sure about how to do that. I found some code for println in a file called Print.cpp in some library on my PC. It might also be in a file called HardwareSerial, because I once found a way to change the buffer size there in the H file.

It IS a "maze of twisty passages." The Print code is in print.cpp, but it ends up calling the code in HardwareSerial to do the actual manipulation (at least for actual Serial ports. It'll be different if your chip has native USB, or for SoftwareSerial, or whatever. That's the flip side of the inefficiencies inherent in virtual functions - the same print code gets used regardless of the actual hardware.

So. "Serial" is defined in HardwareSerial.h:

   class HardwareSerial : public Stream
//   :
    virtual size_t write(uint8_t);  // hardware-dependent single byte write,
                                    // implemented in hardwareSerial.cpp

And it says that "Serial" inherits from "Stream." Stream.h has:

class Stream : public Print

It's your basic bytestream-oriented IO, except that it's mostly "input", and inherits the output stuff from "Print":

class Print     // no more inheritance!
//   :
    virtual size_t write(uint8_t) = 0;  // explicitly UNDEFINED write method
//   :
    size_t print(const char[]);   // function that writes C-style strings
} 
//   :

    size_t write(const char *str) {
      if (str == NULL) return 0;
      return write((const uint8_t *)str, strlen(str));
    }
//   :
    virtual size_t write(const uint8_t *buffer, size_t size);
//   :
size_t Print::print(const char str[])
{
  return write(str);
}

/* default implementation: may be overridden */
/*  (HardwareSerial COULD override this, but it doesn't) */
size_t Print::write(const uint8_t *buffer, size_t size)
{
  size_t n = 0;
  while (size--) {
    if (write(*buffer++)) n++;
    else break;
  }
  return n;
}

Thank you for your help. Now, I have got an impression on how this code is implemented.

I do not know sufficient about C++ and Arduino to start a project to replace these buffer procedures for my use. It might be a future project, when I got the skills. Perhaps if you use google, you may find another guy, that already did make an alternative.

This topic was automatically closed 180 days after the last reply. New replies are no longer allowed.