String concatenation for Serial.print

samtal: To efficiently print out mixed data, I convert all data items into one concatenated string and send the string to the serial port.

Please define 'efficient'.

sterretje: Please define 'efficient'.

My thoughts entirely.

Put the print statements in a suitable function and call it when required so that the main program looks neat and tidy and screen space is not "wasted" with a series of print statements.

With 'efficient' I mean efficient in the serial port. Instead of sending each line with a Serial.print command that has significant overhead, concatenating all data into one string and one print command cuts much of that overhead.

As to the last comment by Brattain Member, I did not relate to efficiency in the editor text. The program lines are in one printout function. Of course, the lines in my example could be written as one line, but I like it my way, for clarity.

To the main point: My question was about concatenating and formatting a large number of mixed values. Saving into a buffer can be an option, but not a nice one, and I need to test it to make sure I can format each individual value differently.

Combining them yourself has more overhead than just printing them one after another. All print has to do is dump the data into a buffer, where it ends up concatenated just like you want. So you're basically making it do that twice. Now which is really more efficient?

Saving into a buffer can be an option, but not a nice one

If you're not keeping it in some buffer then where are you concatenating it? I think you have a general misunderstanding of what's going on here. Perhaps you should describe what you actually want to see happen and let someone tell you what's the best and most efficient method to do it.

Please define 'efficient'.

Indeed.

There is no reason to build one giant string and then print it all out at once. There is no overhead associated with Serial.print that you can avoid by building one giant string. Printing each piece by itself is actually much more efficient:

* The Arduino can be printing the first part of the message while it formats the next parts.

* No additional RAM is needed to contain any part of the printed string. Only your variables will use RAM.

* The characters to be printed are not copied into and out of this extra buffer; they are given directly to the Serial port object.

* This avoids the additional processing required by String operations and avoids the numerous issues with String usage, especially long-term stability (read this).

Using sprintf also has disadvantages:

* You must be very careful that the destination char buffer has enough room. If you don't count right, the sprintf function will write past the end of the buffer, corrupting memory. I don't know why people don't recommend snprintf to avoid this common problem.

* The sprintf function is really a mini "interpreter". It interprets the format string at run time and "executes" the various % formatting functions. This is much slower than calling

    Serial.print( v, HEX )

* Since the interpretation occurs at run-time, there is no compiler warning about trying to print an integer when you pass in a character (or any other mismatch between the % formatter and the argument you pass).

* This format interpreter code is fairly long, adding about 1000 bytes to the executable size. Because the format string can contain many different types, code for all types must be included in the executable. When you use the print functions for individual pieces, the linker eliminates all the functions for types that you don't need.

* Floating-point numbers are not supported by default.
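
To illustrate the first point in the list above, here is a minimal sketch of the snprintf alternative. The helper name formatValue is made up for this example, and it is written as plain C++ so it can be tried off-target; it only assumes that snprintf behaves as the C standard describes (it never writes more than the given size and returns the length the full message would have needed).

```cpp
#include <cstdio>
#include <cstddef>

// Hypothetical helper: formats the message into buf without ever writing
// past its end. Returns true if the whole message fit, false if it had
// to be truncated.
bool formatValue(char *buf, std::size_t size, int value)
{
    // snprintf returns the number of characters the complete message
    // needs, not counting the terminating '\0'.
    int needed = std::snprintf(buf, size,
                               "The value is currently 0x%02X units\r\n",
                               static_cast<unsigned>(value & 0xFF));
    return needed >= 0 && static_cast<std::size_t>(needed) < size;
}
```

With a 64-byte buffer the call succeeds. With a 16-byte buffer it returns false and leaves a truncated but still terminated string, instead of corrupting whatever sits after the buffer.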

For comparison, here are three short sketches, one per technique, in this order: String concatenation, sprintf into a char buffer, and piece-wise printing.

void setup()
{
  Serial.begin( 9600 );
}

String string;
volatile int value;  // a trick to make sure the optimizer doesn't cheat  :)

// this is the code that generates the string to print
void loop (void) {
    value = millis() & 0xFF;

    unsigned long start = micros();

    string = "The value is currently 0x";
    string += hexDigit( value >> 4 );
    string += hexDigit( value );
    string += " units\r\n";

    Serial.print( string );

    Serial.print( micros() - start );
    Serial.println( F("us") );

    delay (1000); // print out value once a second
}

char hexDigit( int v )
{
  v &= 0x0F; // just the lower 4 bits
  if (v < 10)
    return '0' + v;
  else
    return 'A' + (v-10);
}

// Second sketch: sprintf into a char buffer
void setup()
{
  Serial.begin( 9600 );
}

char buf [64]; // must be large enough for the whole string
volatile int value;

// this is the code that generates the string to print
void loop (void) {
    value = millis() & 0xFF;

    unsigned long start = micros();

    sprintf (buf, "The value is currently 0x%02X units\r\n", value);
    Serial.print (buf);

    Serial.print( micros() - start );
    Serial.println( F("us") );

    delay (1000); // print out value once a second
}

// Third sketch: piece-wise printing
volatile int value;

char hexDigit( int v )
{
  v &= 0x0F; // just the lower 4 bits
  if (v < 10)
    return '0' + v;
  else
    return 'A' + (v-10);
}

void printValue( int v )
{
    // Print the pieces directly; use the parameter, not the global
    Serial.print( F("The value is currently 0x") );
    Serial.write( hexDigit( v >> 4 ) );
    Serial.write( hexDigit( v ) );
    Serial.println( F(" units") );
}

void setup()
{
  Serial.begin( 9600 );
}

void loop () {
    value = millis() & 0xFF;

    unsigned long start = micros();

    printValue( value );

    Serial.print( micros() - start );
    Serial.println( F("us") );

    delay (1000); // print out value once a second
}

The String version takes 370us to execute and uses 4384 bytes of program space and ~320 bytes of RAM (270 + ~50 bytes on the heap). The sprintf version takes 560us to execute and uses 3868 bytes of program space and 300 bytes of RAM. The piece-wise version takes 320us to execute and uses 2498 bytes of program space and 202 bytes of RAM.

Your choice, of course. ;)

Cheers, /dev

As to the last comment by Brattain Member, I did not relate to efficiency in the editor text. The program lines are in one printout function. Of course, the lines in my example could be written as one line, but I like it my way, for clarity.

If you are referring to my post then I am not suggesting that you put all of the print statements in one line, but I would certainly avoid using Strings as you do as it would lead to memory fragmentation.

I am not convinced that there is any significant overhead in using Serial.print more than once. The baud rate of the Serial interface determines how fast each byte is sent, and each byte is sent individually whether the data is all in one buffer (as would be the case if sprintf() were used), all in one String (or string), or printed in several individual calls, each with a different part of the data.

Adding to all that

At the moment that the software output buffer for the serial port is full, your code will stall / block. That will be the case when using several print statements as well as in your approach.

If you use Serial.availableForWrite() and check if there is space to send some characters, you can prevent that from happening. Send a small chunk, check if there is enough space for the next chunk and send that etc. If there is not enough space, do something else.

With your approach, you can't.
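
A rough sketch of that idea follows. The names PendingMessage, pumpMessage and FakePort are invented for this example. The template only assumes that the port offers availableForWrite() and write(buffer, size), which Arduino's Serial does; FakePort is a hypothetical stand-in so the logic can be tried on a PC.

```cpp
#include <stddef.h>
#include <string>

// Tracks how much of a message has been handed to the port so far.
struct PendingMessage {
    const char *data;
    size_t length;
    size_t sent;
};

// Hands over only as many bytes as the port can take right now, so the
// caller never blocks. Returns true once the whole message is queued.
template <typename Port>
bool pumpMessage(Port &port, PendingMessage &msg)
{
    size_t remaining = msg.length - msg.sent;
    int room = port.availableForWrite();
    if (room > 0 && remaining > 0) {
        size_t space = static_cast<size_t>(room);
        size_t chunk = remaining < space ? remaining : space;
        port.write(msg.data + msg.sent, chunk);
        msg.sent += chunk;
    }
    return msg.sent == msg.length;
}

// Hypothetical stand-in for a serial port, for off-target testing only.
struct FakePort {
    size_t room;       // free space in the pretend transmit buffer
    std::string out;   // everything "sent" so far
    int availableForWrite() const { return static_cast<int>(room); }
    size_t write(const char *buf, size_t n) { out.append(buf, n); room -= n; return n; }
};
```

In a sketch you would call pumpMessage(Serial, msg) once per pass through loop() and do other work whenever it returns false.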

samtal: To the main point: My question was about concatenating and formatting a large number of mixed values. Saving into a buffer can be an option, but not a nice one, and I need to test it to make sure I can format each individual value differently.

Why is using a buffer not a "nice option"? No matter how you print something, a temporary buffer is used. Note that if you declare a buffer in a function, its RAM usage exists only as long as the function runs. The memory is freed when the function is complete.

What we all should be using is C's standard "printf", but in keeping with the Arduino policy of not supporting essential functions in order to save half a dozen bytes, we are forced to either use a bunch of "Serial.print (this)" and "Serial.print (that)" functions, ad nauseam, in order to print a simple, single line of text on the terminal, or sprintf and a buffer.

And, because of the fear of using a few more bytes of memory by using standard in, standard out and standard error, burned into everyone's mind by people who don't have a clue what they are talking about, Arduino users won't even use a simple library that automatically provides stdin/out/err access claiming every reason from "uses a few more bytes" to "it blocks" to "it will stop the sun from shining" to "the IDE doesn't support it" (when, of course, the IDE doesn't support ANYTHING... AVR-GCC does and indeed AVR-GCC does support all C/C++ functions) and instead happily go on typing ridiculous stuff like this:

int volts = 120;

Serial.print ("Voltage: ");

if (volts < 10) {
    Serial.print (" "); // align columns
}

if (volts < 100) {
    Serial.print (" "); // align 100's place
}

Serial.print (volts);

Serial.print (" VDC");

Serial.print ('\r'); // goto...
Serial.print ('\n'); // ...new line because the Print library doesn't even
    // know how to translate a Unix newline into a CR/LF.

When they COULD just do this:

int volts = 120;
fprintf (stdout, "Voltage: %3d VDC\n", volts);

Don't know why... maybe there's some perverse pleasure in repeatedly ramming one's head into the wall.... ?

And, if the user doesn't want to install a simple library to make things 1000% easier, the next best method is to use a temporary buffer and "sprintf" which is almost as good (but not AS good) as using printf directly.
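
For anyone who wants to see that the two approaches produce the same line, here is a small comparison written as plain C++ so it runs on a PC. The function names formatManual and formatPrintf are made up for the example, snprintf is used in place of sprintf to keep the buffer safe, and the match is only claimed for values from 0 to 999, which is all the hand-written padding above handles.

```cpp
#include <cstdio>
#include <string>

// Column alignment done by hand, as in the multi-print version.
std::string formatManual(int volts)
{
    std::string line = "Voltage: ";
    if (volts < 10)  line += ' ';   // align columns
    if (volts < 100) line += ' ';   // align 100's place
    line += std::to_string(volts);
    line += " VDC\r\n";
    return line;
}

// The same line from a single format string: %3d right-aligns the
// number in a field three characters wide.
std::string formatPrintf(int volts)
{
    char buf[32];   // large enough for this format with any int
    std::snprintf(buf, sizeof buf, "Voltage: %3d VDC\r\n", volts);
    return buf;
}
```

On the Arduino itself the temporary buffer would then be passed to Serial.print in one call.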

While we're at it, the Arduino developers, in their wisdom(?) not only disable floating point print support without providing the option of using it if desired, they also espouse the use of "dtostrf" which is complicated, not understood by a lot of people, doesn't fully support "printf style" formatting and requires the user to provide a temporary buffer for it (and the user is responsible for making sure the buffer is large enough).

Everyone shies away from floating point because it uses a few dozen more bytes of memory, but the fact that "dtostrf" also consumes program and ram space doesn't seem to bother anyone.

I feel sorry for any Arduino user who ends up programming for a living, because they will forever be hampered by all the convoluted or just plain wrong "facts" they learned from the "experts". Learning something new is tough enough without having to also UNLEARN the wrong stuff the "experts" taught them.

...and don't even get me started on the absurdity of worrying about a function "blocking" when the code is single tasking/ single threaded and running on a toy microcontroller with a quarter meg or less of memory (or, nuttiness in the other direction...) setting up a bunch of interrupt handlers to read the state of a switch or blink an LED on and off...

-dev: Indeed.

There is no reason to build one giant string and then print it all out at once. [1] There is no overhead associated with Serial.print that you can avoid by building one giant string. [2] Printing each piece by itself is actually much more efficient:

* [3] The Arduino can be printing the first part of the message while it formats the next parts.

I know that "boldly asserted is half proven", but I would love to see some proof (links, whatever) to back up those 3 assertions (because you are wrong on all 3 counts).

krupski: I know that "boldly asserted is half proven", but I would love to see some proof (links, whatever) to back up those 3 assertions (because you are wrong on all 3 counts).

Well, he did show three example programs using the different techniques, and the version with the giant string, and the one with sprintf both used about 50% more ram, 100% more flash, and executed more slowly. What sort of proof are you looking for?

krupski: I know that "boldly asserted is half proven", but I would love to see some proof (links, whatever) to back up those 3 assertions (because you are wrong on all 3 counts).

Asking for proof and claiming he's wrong... with no proof. -dev showed the code and results. What more proof do you want? What proof have you other than your general distaste for the Arduino community? Does your version with printf produce smaller or faster code? Prove it.

There are cases where pre-buffering print() output might be useful. AFAIK, neither Ethernet.print() nor USBSerial.print() does anything intelligent in terms of avoiding the "one message per print statement" problem, and you could potentially improve throughput quite a lot. Normal HardwareSerial.print() would gain very little, though; in addition to there being no smarts for "aggregating" small output requests, there also aren't any smarts for optimizing large requests.

Hi all, As the originator of this thread, I am overwhelmed with the magnitude of replies to a question I thought was a simple one. It looks almost like opening a Pandora's box. Nevertheless, reading through all the replies and discussions I think I have the answer to my question, namely that concatenating print values is not the best way to go with Arduino, or at least it will not yield any advantage.

I will keep trying and testing my specific app, but this issue is of secondary importance to me at this time. Thanks to all contributors who made me somewhat smarter......

AFAIK...

USBSerial (aka Serial on a Leonardo) ultimately sends the bytes one at a time, so there is no advantage.

If the OP were using a W5100 (they're not), and if TCP "fragmentation" were an issue, simply derive a class from EthernetClient. Basically, the selected answer in that link adds the same RAM buffer used by the sprintf approach (I would also override the other virtual write method). It would use extra RAM like the sprintf technique, but would not have the other sprintf disadvantages.

Normal HardwareSerial.print() would gain very little, though; in addition to there being no smarts for "aggregating" small output requests, there also aren't any smarts for optimizing large requests.

There is nothing to gain. HardwareSerial puts each byte into an output buffer, and that buffer is emptied by transmit interrupts (in the background). This "output stream pump" is primed by the first character, earlier in that method.

For skeptics, this is how the Arduino can continue doing other work (in the foreground) while those characters are gradually pumped out.
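
To make the mechanism concrete, here is a simplified model of such an output buffer. It is not the actual HardwareSerial source: the struct name TxRing and the 64-byte capacity are assumptions for illustration. push() stands for what a write call does in the foreground, and pop() stands for what the transmit interrupt does each time the UART is ready for another byte.

```cpp
#include <stddef.h>
#include <stdint.h>

// Simplified fixed-size byte queue, filled by the foreground code and
// drained by the transmit interrupt.
struct TxRing {
    static constexpr size_t SIZE = 64;   // assumed capacity
    volatile uint8_t data[SIZE];
    volatile size_t head = 0;            // next slot to fill (foreground)
    volatile size_t tail = 0;            // next slot to send (interrupt)

    bool empty() const { return head == tail; }
    bool full()  const { return (head + 1) % SIZE == tail; }

    // Foreground side. Returns false when the buffer is full; a real
    // driver waits at this point, which is where print() blocks.
    bool push(uint8_t b)
    {
        if (full()) return false;
        data[head] = b;
        head = (head + 1) % SIZE;
        return true;
    }

    // Interrupt side: fetch the next byte to transmit, if any.
    bool pop(uint8_t &b)
    {
        if (empty()) return false;
        b = data[tail];
        tail = (tail + 1) % SIZE;
        return true;
    }
};
```

One slot is kept free to tell "full" from "empty", so this model holds at most 63 bytes at a time. Whether the bytes arrive through one large print or many small ones, they pass through the same queue one at a time.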

@samtal, glad to help!

USBSerial (aka Serial on a Leonardo) ultimately sends the bytes one at a time, so there is no advantage.

It's not the Send8() calls that are a problem; they just copy data from the user arguments to the USB buffer. The problem is the ReleaseTX() call a few lines later, in conjunction with the block on USB_SendSpace() before the byte loop. ReleaseTX() is what releases the data (in the USB buffer) so that the USB hardware can send it. But there are only two USB buffers, so once you've queued them to the USB hardware to send, the code has to sit around and wait for the USB transaction(s) to complete. Because of the way USB works, this is relatively slow (~1ms).

I've attached a test program that does 5000 individual single-byte Serial.print() calls or 100 50-byte Serial.print() calls, and times how long it takes. For a Nano (with HardwareSerial, at 115200bps) the times are nearly identical: 420 vs 425 ms (about what you'd expect: 5000 bytes / 11520 bytes per second = 0.434 s). On a Leonardo (using USB Serial), you get 135 vs 15ms. The "big" prints result in much better performance. (Now, those actual numbers on the Leonardo don't quite fit my "1 message per ms" explanation, so that wasn't quite right. However...)

void setup() {
  Serial.begin(115200);
  while (!Serial)
    ;
}

void loop() {
  uint32_t startt, endt;

  delay(1000);
  startt = millis();
  for (int i = 0; i < 5000; i++) {
    Serial.print(" ");
  }
  endt = millis();
  Serial.println();
  Serial.print("Write 5000 individual bytes in ");
  Serial.print(endt - startt);
  Serial.println(" milliseconds. \n");
  Serial.flush();

  startt = millis();
  for (int i = 0; i < 100; i++) {
    Serial.print("                      25                       50\r");
  }
  Serial.flush();

  endt = millis();
  Serial.println();
  Serial.print("Write 100 50 byte chunks in ");
  Serial.print(endt - startt);
  Serial.println(" milliseconds. \n");
  Serial.flush();

  while (Serial.read() < 0)
    ;
}
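
The "about what you'd expect" figure for the Nano can be reproduced with a line of arithmetic. This sketch assumes the usual 8N1 framing (1 start bit, 8 data bits, 1 stop bit, so 10 bits on the wire per byte); the function name uartMillis is made up for the example.

```cpp
// Estimated time in milliseconds to push `bytes` through a UART running
// at `baud` bits per second, assuming 10 bits on the wire per byte (8N1).
double uartMillis(unsigned long bytes, unsigned long baud)
{
    const double bitsPerByte = 10.0;
    return bytes * bitsPerByte * 1000.0 / baud;
}
```

For 5000 bytes at 115200 baud this gives roughly 434 ms however the bytes are grouped into print calls, which is why both HardwareSerial timings land close together.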

Interesting! Some advantage, then. Thanks for tracing the rest of the call stack. :)

Makes me wonder why they didn't use a ring buffer like everything else. But if I had a nickel for every time I wondered that...

packetizing interactive byte-stream IO is a relatively complex problem; ripe for many tradeoffs...

Delta_G: Asking for proof and claiming he's wrong... with no proof. -dev showed the code and results. What more proof do you want? What proof have you other than your general distaste for the Arduino community? Does your version with printf produce smaller or faster code? Prove it.

I have no "distaste" for the Arduino community. Why on earth would I get on this forum and spend 95% of my time here answering questions or explaining how something works?

As far as my "assertions", let me explain each one:

(from -dev): There is no reason to build one giant string and then print it all out at once. [1] There is no overhead associated with Serial.print that you can avoid by building one giant string. [2] Printing each piece by itself is actually much more efficient:

  • [3] The Arduino can be printing the first part of the message while it formats the next parts.

1: There is overhead associated with every and any call of a C function. Parameters have to be placed on the stack, as well as return address and other info. Calling a function 10 times with a small chunk of data is most certainly slower than calling the function with all the data at once (because you avoid the overhead of calling the function 9 times more than necessary).

2: Wrong - same reason as #1.

3: This statement is misleading. Although the Arduino hardware serial code does use a ring buffer and interrupts, the statement made by -dev makes it sound as though serial data will be output from the ring buffer even if the CPU is busy elsewhere (for example stuck in a blocking delay(nnn) call or in the middle of parsing the format specifiers of another string).

Now, I suppose I could write a simple sketch and show actual numbers, but I felt that there was no need since everyone knows (I assume) that calling a C function involves "behind the scenes" activity which obviously consumes more time if the function is called repeatedly.

I can do this if you wish.......

Delta_G: Does your version with printf produce smaller or faster code? Prove it.

Forgot to address this part.

There is no such thing as "my version" of printf. I simply enable the use of existing functionality that the AVR-GCC compiler provides.

This saves me the headaches of using multiple "Serial.print" calls and makes for easier reading of the source code itself.

My IDE also has a checkbox which allows me at edit/compile time to enable or disable floating point support. Yes I know that it uses some resources, but when the sketch is compiled and floating point only adds 1.5K to my 24K sketch and it all fits with room to spare in an UNO or a MEGA, why should I beat my head against the wall to use "dtostrf" and all the extra work that involves?

Now, I don't know if enabling native floating point uses less or more resources than dtostrf and its required buffer, but the difference can't be all that much and since I have room to spare anyway, why not?

My big gripe is with the DEVELOPERS of the Arduino IDE system. Sure, I understand that they want to conserve resources... I get it. But why not place an OPTION in Preferences to enable or disable certain features?

Want floating point? Just tick the checkbox. Want to use printf? Just tick the checkbox. Need every last bit of memory? Un-tick the checkbox.

But, GIVE THE USERS THE CHOICE AND CONTROL!!!

We can turn line numbering on or off, we can choose to auto-rename a sketch from .PDE to .INO, we can do a lot of other [sarcasm] really important [/sarcasm] things in Preferences.

But, control the things that EVERYONE HERE asks about? Nope, can't do it. WHY?

If you and I had a dollar for every time someone asked why trying to print a number only results in a question mark or how to get dtostrf working, we could afford to have others write our code! :)

(although what fun would that be?)