String concatenation for Serial.print

Of course, floating point support for Serial.print() DID get added, in a relatively non-standard way that is not as powerful as printf(), but is just about as big.

krupski:
#1: There is overhead associated with every and any call of a C function.

I would probably agree that 10 calls with 2 arguments take longer than 1 call with 12 arguments. I'm not sure, because there is some increased overhead due to the varargs calling convention of sprintf.

Regardless, the measurements show that the piece-wise overhead is much, much less than the time used by the other techniques (String and sprintf). Most of the extra sprintf time has to do with the run-time interpretation of the format string.

#2: Wrong - same reason as #1.

The numbers seem to show that piece-wise is most efficient. I don't know how to discuss something with you if you don't look at the objective numbers. If you don't understand the measurement sketches, just ask a question.

#3: This statement is misleading... /dev makes it sound as though serial data will be output from the ring buffer even if the CPU is busy elsewhere (for example stuck in a blocking delay(nnn) call or in the middle of parsing the format specifiers of another string).

Serial data will be output from the ring buffer even if the CPU is doing a delay or formatting another piece. That's how interrupts work. I marvel that you don't know this, even after I provided links to the source. Continuing to argue the point fits the definition of Willful Ignorance.
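To make the interrupt point concrete, here is a minimal host-side sketch of a TX ring buffer in plain C++. It is not the HardwareSerial source: `TxRing`, `write()`, `isr()` and the 64-byte size are assumptions made for illustration. On real hardware the `isr()` step would be the UART "data register empty" interrupt handler.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a serial TX ring buffer (not the Arduino core's code).
struct TxRing {
    static constexpr std::size_t SIZE = 64;  // assumed buffer size
    uint8_t buf[SIZE];
    std::size_t head = 0;                    // next slot the foreground code fills
    std::size_t tail = 0;                    // next slot the "interrupt" sends
    std::string wire;                        // bytes that have left the UART

    bool full() const  { return (head + 1) % SIZE == tail; }
    bool empty() const { return head == tail; }

    // Foreground: queue one byte and return immediately (false if no room).
    bool write(uint8_t c) {
        if (full()) return false;
        buf[head] = c;
        head = (head + 1) % SIZE;
        return true;
    }

    // "Interrupt": runs whenever the UART can take another byte,
    // no matter what the foreground code is doing at that moment.
    void isr() {
        if (empty()) return;
        wire.push_back(static_cast<char>(buf[tail]));
        tail = (tail + 1) % SIZE;
    }
};

// Queue a string, then let the "interrupt" fire while the foreground is busy.
inline std::string demoRing() {
    TxRing tx;
    const char *msg = "hello";
    for (const char *p = msg; *p; ++p) tx.write(static_cast<uint8_t>(*p));
    // The foreground could now sit in delay() or format the next piece;
    // the queued bytes still drain, one per interrupt.
    for (int tick = 0; tick < 10; ++tick) tx.isr();
    return tx.wire;
}
```

The foreground only ever touches `head` and the interrupt only ever touches `tail`, which is why the print call can return long before the bytes are actually on the wire.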

Now, I suppose I could write a simple sketch and show actual numbers

I can do this if you wish.......

Yes, please. This is the essence of forum discussion. If you disagree, you have to support the argument with something that everybody can reproduce. Subjective ranting does not nullify the objective measurements. Show us your actual numbers, and maybe we'll all start using sprintf.

westfw:
Of course, floating point support for Serial.print() DID get added, in a relatively non-standard way that is not as powerful as printf(), but is just about as big.

What I really mean is floating point AND standard in/out/err streams... enabled or disabled by the user, via a checkbox in Preferences.

Another thing that you may or may not know is that there are TWO separate "modules" connected with AVR-GCC floating point support.

One handles printf, fprintf, etc... and the other handles scanf, fscanf, etc...

The second one (the scanf support) is a resource hog and (as far as I know) provides very little benefit to the programmer. The first one (printf) adds only about 1.5K to a sketch and is the one most people would probably use.

So, in my Preferences, I have two "floating point" checkboxes... one for printf (which I use all the time) and one for scanf (which I have never used so far).

-dev:
Yes, please [run the test sketches]. This is the essence of forum discussion. If you disagree, you have to support the argument with something that everybody can reproduce. Subjective ranting does not nullify the objective measurements. Show us your actual numbers, and maybe we'll all start using sprintf.

OK, not sure what I'm supposed to be seeing here, but this is the output of your first sketch (only change I made was to set the serial baud rate to 115200 'cause that's what I always use).

The value is currently 0x57 units
274us
The value is currently 0x05 units
240us
The value is currently 0xB3 units
240us
The value is currently 0x60 units
242us
The value is currently 0x0E units
242us

Second test sketch results (at 115200 baud):

The value is currently 0x57 units
324us
The value is currently 0x05 units
326us
The value is currently 0xB3 units
328us
The value is currently 0x61 units
326us
The value is currently 0x0F units
326us
The value is currently 0xBD units
326us

Third test sketch:

The value is currently 0x57 units
206us
The value is currently 0x05 units
206us
The value is currently 0xB3 units
208us
The value is currently 0x60 units
206us
The value is currently 0x0E units
210us
The value is currently 0xBC units
210us

Does this look right? I have no idea what I'm supposed to be seeing.......

Delta_G:
I didn't say "of" I said "with". The OP was asking about function overhead and efficiency. Does your version of code using printf instead of a chain of print calls result in more compact or more efficient code? Does it meet the requirements of the OP? Or does it just make for less typing which wasn't what he was asking for.

Oh I see. I guess I misunderstood your question initially.

Using printf requires setting up small (3 or 4 line) functions to read and write the device (such as Serial or LCD, etc...) then "connect" them to the standard input/output/error streams using the "fdevopen()" function.

Of course, this adds a little bit to both the sketch size (i.e. flash) and sram usage.

Is the resulting code "more compact"? Most probably not. Is the code "more efficient"? What does that mean? In order to answer that question, I would have to make a test sketch that printed the same thing using one method and the other method and compare execution speed, resource use, etc....

And, is this all there is to the concept of "efficiency"? What about the fact that I can write, debug and finish the code a lot quicker because I don't have to fight multiple print calls? IMHO, that also counts as "efficiency".

I didn't do this because I don't care one bit if my program takes 300 microseconds to run or 310 microseconds, nor do I care if the program ends up being 22K or 25K in size.

But I DO care about being able to write standard code and have it work the way that I expect it to... usually the first time... as opposed to using a whole bunch of "print" calls and then going back and editing tiny glitches (such as a missing space between a message and a variable display).

I don't understand why everyone is so concerned about microseconds and a few extra K of flash?

If the resulting sketch grew too large to fit then, yeah, fine cut some corners to make it fit. No objections from me.

But when the compiled program takes 22K and I've got almost 256K to load it into, I simply don't care about optimizing down to the last byte. In fact, I may use -O3 instead of -Os... just to drive everyone crazy! :slight_smile:

There is overhead associated with every and any call of a C function. Parameters have to be placed on the stack, as well as return address and other info. Calling a function 10 times with a small chunk of data is most certainly slower than calling the function with all the data at once

Maybe. Most avr-gcc function calls use registers for passing the arguments, up to the point where there are too many arguments to fit in the registers allocated to that purpose. So calling a function with three arguments several times may in fact be faster than calling a function with six arguments (JUST due to function-call overhead.) And then both Serial.print() and printf() (every stdio-based hack I've seen for Arduino) end up calling Serial.write() one byte at a time, anyway. But Serial.write() on AVR arduinos is "light weight" compared to a "real computer" where it might be an operating system call with hundreds of cycles of additional overhead. (but see also my previous message WRT USB and TCP.) So it gets really complicated trying to micro-optimize this sort of thing.
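A rough host-side illustration of the "one byte at a time, anyway" point. `ByteSink` is a hypothetical stand-in for Serial, and `printfLike()` formats into a RAM buffer with `vsnprintf` as a simplification (an fdevopen-based setup would instead call its put function per character). Either way, the number of single-byte writes comes out the same for both styles.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical byte sink standing in for Serial; counts single-byte writes.
struct ByteSink {
    std::string out;
    std::size_t writes = 0;

    void write(char c) { out.push_back(c); ++writes; }

    // Piece-wise style: each print() walks its argument a byte at a time.
    void print(const char *s) { while (*s) write(*s++); }

    void printHex(unsigned v) {
        static const char digits[] = "0123456789ABCDEF";
        write(digits[(v >> 4) & 0x0F]);
        write(digits[v & 0x0F]);
    }

    // printf style: interpret the format string, then emit byte by byte.
    void printfLike(const char *fmt, ...) {
        char buf[64];
        va_list args;
        va_start(args, fmt);
        std::vsnprintf(buf, sizeof buf, fmt, args);
        va_end(args);
        print(buf);
    }
};

inline ByteSink piecewise(unsigned v) {
    ByteSink s;
    s.print("The value is currently 0x");
    s.printHex(v);
    s.print(" units");
    return s;
}

inline ByteSink formatted(unsigned v) {
    ByteSink s;
    s.printfLike("The value is currently 0x%02X units", v);
    return s;
}
```

So whatever difference shows up in the measurements comes from what happens before the bytes reach `write()`, not from the number of `write()` calls.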

I don't understand why everyone is so concerned about microseconds and a few extra K of flash?

One problem is that avr-libc has highly optimized code that implements both stdio "streams" (which are not actually file-system based), and floating point, and floating point output (via __ftoa_engine; recently discussed in another thread.) So adding printf() adds maybe 1.5k, and adding the floating point version of printf maybe another 2k, and 4k on a chip with 32k of flash isn't really very painful. (but remember that the first Arduino only had 6k of flash...)
However, other processor architectures aren't as lucky; you don't really appreciate avr-libc until you're forced to use something else. newlib-nano (used on most 32bit chips) has something like 12k in the integer-only printf(), and 30k+ if you add floating point by using plain newlib instead (which also adds a lot of bloat to stdio/etc.) That may still be irrelevant on a Due with 512k of flash, but there are smaller ARMs...
So it's not just the actual behavior on AVR that causes printf() to be avoided, but all of its "reputation" earned on other platforms...

(hmm: incorrect/useless call to printf() in _exit() · Issue #47 · arduino/ArduinoCore-sam · GitHub)

Any Arduino IDE developer could add these options to the newest IDE in less than a day

Adding options is a whole can of worms. Atmel Studio has options for this sort of thing; it's a bewildering score or so of panels with strange names that it carefully documents as being equivalent to incomprehensible gcc options :frowning:

I found it a pain to type multiple Serial.print()'s for debug output so came up with 'Sprint'.

It just does a real cut-down printf():

//
// Example of 'Sprint' - a cut-down printf()
// - saves time typing in debug output for Serial.print
// TonyWilk


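#include <stdarg.h>  // va_list, va_start(), va_arg(), va_end()

//-----------------------------
// Serial.print helper function
// - a real cut-down printf()
// - handles %d %f %h (hex) %c %s; any other character after % is dropped
//
void Sprint( const char *fmt, ... )
{
  char c;
  va_list args;
  va_start( args, fmt );
  while( (c=*fmt) != 0 ){
    switch( c )
    {
    case '%':
      c= *(++fmt);
      switch( c )
      {
      case 'd': Serial.print( va_arg(args,int) ); break;
      case 'f': Serial.print( va_arg(args,double) ); break;
      // unsigned, so values like 0xFACE are not sign-extended with 16-bit int
      case 'h': Serial.print( va_arg(args,unsigned int), HEX ); break;
      case 'c': Serial.print( (char)va_arg(args,int) ); break;
      case 's': Serial.print( va_arg(args, const char *) ); break;
      default:  break;
      }
      break;
    case '\\':  // literal backslash in the string: treat "\n" as a newline
      c= *(++fmt);
      if( c == 'n' )
        Serial.println();
      else if( c != 0 )
        Serial.print( c );
      break;
    default:
      Serial.print( c );
      break;
    }
    if( c == 0 ) break;  // string ended right after '%' or '\': stop here
    ++fmt;
  }
  va_end( args );
}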

void setup() {
  // put your setup code here, to run once:
  Serial.begin(19200);
  Serial.println("boot...");
  const char *astring= "test string";

  Sprint("This is an example...\n");
  Sprint("int:%d, float:%f, hex:0x%h, char: '%c', string:\"%s\".\nThe end\n", 
          42, 123.45, 0xFACE, 'x', astring );
}

void loop() {
  // put your main code here, to run repeatedly:

}

Efficient? well, it's a lot easier to type in "int:%d, float:%f, hex:0x%h\n" than the equivalent in separate statements and it doesn't need yet another buffer (like printf() would).

Anyway, I find it handy.

Yours,
TonyWilk

P.S. dunno how portable this is across all Arduinos, I dimly remember something about the types used with va_arg causing me some bother. I only run Arduino Pro Mini (ATmega328).

Regarding the question about the example sketches:

Does this look right? I have no idea what I'm supposed to be seeing.......

Yes, each sketch prints a few things, based on your original example sketch in reply #1. To get a "random" value for printing, it takes the lower 8 bits of the current millis clock:

    value = millis() & 0xFF;

Each sketch grabs the current micros clock value before the print:

    unsigned long start = micros();

and calculates the elapsed time after the print:

    micros() - start

It then prints that elapsed time so you can see how long the print formatting took. This time is not affected by the baud rate, because all characters are simply added to the Serial output buffer. Interrupts will eventually send them over the USB, where the Serial Monitor window will eventually read and display them on the PC.

The delay statement at the end gives the interrupts time to empty the output buffer, in the background.
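For a sense of scale, here is a back-of-the-envelope calculation of how long the buffer takes to drain. It assumes 8N1 framing (10 bits on the wire per character) and a line of roughly 35 bytes (the 33-character message plus CR/LF); `drainTimeMicros` is a hypothetical helper, not something from the sketches.

```cpp
#include <cstdint>

// Microseconds needed to shift `bytes` characters out of a UART,
// assuming 8N1 framing: 1 start + 8 data + 1 stop = 10 bits per character.
constexpr uint32_t drainTimeMicros(uint32_t bytes, uint32_t baud) {
    return static_cast<uint32_t>(
        (static_cast<uint64_t>(bytes) * 10u * 1000000u) / baud);
}

// Roughly 35 bytes per line in the test sketches:
//   drainTimeMicros(35, 9600)   is about 36 ms
//   drainTimeMicros(35, 115200) is about 3 ms
```

Both figures are far larger than the 200-330us the sketches report, and far smaller than the 1000ms delay. That is consistent with the print call only queueing the characters and the interrupts sending them afterwards.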

To measure the RAM and program size, get the numbers from the IDE build window.
To measure the execution speed, upload and run each sketch. The execution time varies by a few TIMER0 ticks (±4us).

Regarding the piece-wise printing technique, I share your annoyance:

This saves me the headaches of using multiple "Serial.print" calls and makes for easier reading of the source code itself.

... and...

I found it a pain to type multiple Serial.print()'s

We have discussed this before. If your metric is lines of code, then instead of individual prints:

    Serial.print( "RAM string" );
    Serial.print( f );
    Serial.print( ',' );
    Serial.println( i, HEX );

...use the streaming operators (aka "stream insertion"):

    Serial << "RAM string " << f << ',' << _HEX(i) << endl;

(See correction note below.)

This single line of code can be used in place of printf, and does not use RAM buffers at all. In fact, it actually resolves to individual Serial.print calls through the magic of C++ templates. It has the same RAM, speed and program size performance as the piece-wise print technique. Here is the sketch for comparison:

#include <Streaming.h>

volatile int value;

char hexDigit( int v )
{
  v &= 0x0F; // just the lower 4 bits
  if (v < 10)
    return '0' + v;
  else
    return 'A' + (v-10);
}

void setup()
{
  Serial.begin( 9600 );
}

void loop () {
    value = millis() & 0xFF;

    unsigned long start = micros();

    Serial << F("The value is currently 0x") << hexDigit( value >> 4 ) << hexDigit( value ) << F(" units") << endl;

    Serial.print( micros() - start );
    Serial.println( F("us") );

    delay (1000); // let the TX interrupt drain the buffer; repeat once a second
}
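For anyone wondering how a `<<` chain can resolve to plain print calls, here is a minimal host-side sketch of the idea. `MockPrint` is a hypothetical stand-in for the Arduino `Print` class; as far as I know the Streaming library's template has the same shape, but treat this as an illustration rather than its actual source.

```cpp
#include <string>

// Hypothetical mock of an Arduino-style class with overloaded print().
struct MockPrint {
    std::string out;
    int calls = 0;  // number of print() calls made

    void print(const char *s) { out += s; ++calls; }
    void print(char c)        { out += c; ++calls; }
    void print(int n)         { out += std::to_string(n); ++calls; }
};

// The whole trick: one template that forwards to whichever print() overload
// matches the argument, then returns the stream so calls can be chained.
template <class T>
inline MockPrint &operator<<(MockPrint &p, T arg) {
    p.print(arg);
    return p;
}

inline MockPrint demoStream() {
    MockPrint p;
    p << "value=" << 42 << ',' << 7;  // four chained inserts
    return p;
}
```

Each `<<` is resolved at compile time to exactly one `print()` call, so there is no format string to interpret at run time and no intermediate buffer.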

For more information, see

* A commonly-used streaming template for the Arduino, with similar discussion about multiple print statements.

* The wikipedia page about C++ operators has a footnote about the << operator also being used for I/O streams.

* The wikipedia page about C++ I/O has a section about formatting modifiers and manipulators (e.g., _HEX and endl in the example above).

Notes:

* The Arduino streaming template does not implement all modifiers and manipulators, but it would not be difficult to fill in the blanks. You can also provide the missing bits locally, inside your sketch.

* The Arduino does not really implement the standard I/O stream (with good reason). The Print and Stream classes are poorly-partitioned versions of ostream, istream and/or iostream (further distractions here).
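As an example of filling in a missing bit locally, here is a sketch of a fixed-width field manipulator. Everything in it is hypothetical: `MiniPrint` stands in for the Arduino `Print` class so the code can be tried on a PC, and `Padded` is an invented name, not part of any library.

```cpp
#include <string>

// Hypothetical stand-in for Print, just enough to try the idea on a PC.
struct MiniPrint {
    std::string out;
    void print(const char *s) { out += s; }
    void print(int n)         { out += std::to_string(n); }
};

// Generic insertion: forward to the matching print() overload.
template <class T>
inline MiniPrint &operator<<(MiniPrint &p, T arg) {
    p.print(arg);
    return p;
}

// Hypothetical manipulator: right-align an int in a fixed-width field.
struct Padded {
    int value;
    int width;
};

// A non-template overload is preferred over the generic template,
// so Padded arguments end up here.
inline MiniPrint &operator<<(MiniPrint &p, const Padded &f) {
    std::string digits = std::to_string(f.value);
    for (int i = static_cast<int>(digits.size()); i < f.width; ++i) {
        p.print(" ");
    }
    p.print(digits.c_str());
    return p;
}

inline std::string demoPadded() {
    MiniPrint p;
    p << "[" << Padded{42, 5} << "]";
    return p.out;
}
```

The same pattern (a small struct plus one `operator<<` overload) should cover most other missing modifiers.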

Sorry, I have a correction. This does not work:

    Serial << "RAM string " << f << ',' << HEX << i << endl;

It does in Cosa, but not in the standard Arduino core. With the Streaming library, you have to do this

    Serial << "RAM string " << f << ',' << _HEX(i) << endl;

Noted above.

westfw:
And then both Serial.print() and printf() (every stdio-based hack I've seen for Arduino) end up calling Serial.write() one byte at a time, anyway.

I'm not sure that I would consider using a standard AVR-GCC function (fdevopen) a "hack". Anyway, you are right, stdout and stderr do indeed call "xxx.write()" once for each character. To that I ask "so what?" Are we worrying about data transfer speed? If so, then why do most people use Serial.begin (9600)?

Getting back to printf and standard IO streams, it's equally easy to connect different devices to different streams. For example, you can connect stdin to serial and stdout/stderr to an LCD display... or even better connect input to serial, output to an LCD and error to a DIFFERENT LCD so that program text and errors display on different screens.

Look how ridiculously simple it is (Stdinout.cpp):

#include <Stdinout.h>

// connect stdin, stdout and stderr to same device
void STDINOUT::open (Print &iostr)
{
    open (iostr, iostr, iostr);
}

// connect stdin to input device, stdout and stderr to output device
void STDINOUT::open (Print &inpstr, Print &outstr)
{
    open (inpstr, outstr, outstr);
}

// connect each stream to its own device
void STDINOUT::open (Print &inpstr, Print &outstr, Print &errstr)
{
    close();  // close any that may be open

    stdin = fdevopen (NULL, _getchar0);
    _stream_ptr0 = (Stream *) &inpstr;

    stdout = fdevopen (_putchar1, NULL);
    _stream_ptr1 = &outstr;

    stderr = fdevopen (_putchar2, NULL);
    _stream_ptr2 = &errstr;
}

// disconnect stdio from stream(s)
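void STDINOUT::close (void)
{
    // fclose() on a NULL stream is undefined, so only close what is open
    if (stdin) {
        fclose (stdin);
        stdin = NULL;
    }
    _stream_ptr0 = NULL;

    if (stdout) {
        fclose (stdout);
        stdout = NULL;
    }
    _stream_ptr1 = NULL;

    if (stderr) {
        fclose (stderr);
        stderr = NULL;
    }
    _stream_ptr2 = NULL;
}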

// Function that fgetc, fread, scanf and related
// will use to read a char from stdin
int STDINOUT::_getchar0 (FILE *fp)
{
    while (! (_stream_ptr0->available()));  // wait until a character is available...
    return (_stream_ptr0->read());  // ...then grab it and return
}

// function that printf and related will use
// to write a char to stdout
int STDINOUT::_putchar1 (char c, FILE *fp)
{
    if (c == '\n') { // \n sends crlf
        _stream_ptr1->write ((uint8_t) '\r'); // send C/R
    }

    _stream_ptr1->write ((uint8_t) c); // send one character to device
    return 0;
}

// function that printf and related will use
// to write a char to stderr
int STDINOUT::_putchar2 (char c, FILE *fp)
{
    if (c == '\n') { // \n sends crlf
        _stream_ptr2->write ((uint8_t) '\r'); // send C/R
    }

    _stream_ptr2->write ((uint8_t) c); // send one character to device
    return 0;
}

STDINOUT STDIO; // Preinstantiate STDIO object

Stdinout.h:

#ifndef STD_IN_OUT_H
#define STD_IN_OUT_H

#include <Stream.h>

static Stream *_stream_ptr0 = NULL; // stdin stream pointer
static Print *_stream_ptr1 = NULL; // stdout stream pointer
static Print *_stream_ptr2 = NULL; // stderr stream pointer

class STDINOUT
{
    public:
        void open (Print &);
        void open (Print &, Print &);
        void open (Print &, Print &, Print &);
        void close (void);
    private:
        static int _getchar0 (FILE *); // char read for stdin
        static int _putchar1 (char, FILE *); // char write for stdout
        static int _putchar2 (char, FILE *); // char write for stderr
};

extern STDINOUT STDIO; // Expose STDIO object

#endif // #ifndef STD_IN_OUT_H

This simple code, or Serial.print on top of Serial.print on top of Serial.print... ad nauseam?

-dev:
Regarding the piece-wise printing technique, I share your annoyance:

I just did a test (not sure how valid it is).

Anyway, this sketch:

void loop (void)
{
    // nuthin
}

void setup (void)
{
    Serial.begin (115200);
    Serial.println ("Now is the time for all good men to come to the aid of the party");
}

compiled uses 3578 bytes of flash and 337 bytes of sram (which seems like a lot for a "nothing" program).....

Using Stdinout:

#include <Stdinout.h>
void loop (void)
{
    // nuthin
}

void setup (void)
{
    Serial.begin (115200);
    STDIO.open (Serial);
    printf ("Now is the time for all good men to come to the aid of the party\n");
}

compiled it uses 4668 bytes of flash and 358 bytes of sram.

Using printf takes 1090 more bytes of flash and 21 more bytes of sram.

Now, real floating point vs dtostrf:

void loop (void)
{
    // nuthin
}

void setup (void)
{
    char buf [16];
    double d;
    Serial.begin (115200);

    d = 123.456789;
    dtostrf (d, 8, 2, buf);
    Serial.println (buf);

    d = 12.3456789;
    dtostrf (d, 8, 2, buf);
    Serial.println (buf);

}

prints this:

123.46
 12.35

resources used: flash 5080 bytes, sram 273 bytes.

void loop (void)
{
    // nuthin
}

void setup (void)
{
    char buf [32];
    double d;
    Serial.begin (115200);

    d = 123.456789;
    sprintf (buf, "%8.2f", d);
    Serial.println (buf);

    d = 12.3456789;
    sprintf (buf, "%8.2f", d);
    Serial.println (buf);
}

prints this:

123.46
 12.35

resources used: flash: 6608 bytes, sram: 279 bytes

Difference: 1528 bytes more flash, 6 bytes more sram.

About 1K for printf and about 1.5K for real floating point. Really, I don't think that's bad (IMHO). :slight_smile:

Delta_G:
Not bad depending on what you're going for. Convenience or code size. I'd like to remind that the OP in this case was concerned with code size and efficiency. So your answer may be right in some scenarios but is a loser at the metric requested.

Yes, although 'code size and efficiency' is somewhat dependent on requirements; simple debug output is one thing, formatting several lines for an LCD is another.

e.g. I have a project which needs no text output apart from debug, and it is at the point where sprintf() won't fit. So I (obviously) like my mini-printf function: it avoids the RAM/ROM overhead of sprintf(), agrees with my (unreasonable?) loathing of streaming operators, and is identical in timing to multiple Serial.print() calls. It does carry the overhead of the function itself (though thereafter it is more ROM 'efficient' than multiple calls of Serial.print), which I live with for the convenience.

Now, if 'efficiency' included 'neatness', 'readability' or 'convenience'... there'd be more points of view than programmers :slight_smile:

Yours,
TonyWilk

Delta_G:
Point being that there's no right answer to the question of which is best. Only which is best for this particular situation or that one. And in this particular situation the OP asked about something particular. All the people going on and on like their solution is the only solution anyone would ever need are deluding themselves.

Completely agree with this.

Yours,
TonyWilk

TonyWilk:
Yes, although 'code size and efficiency' is somewhat dependent on requirements; simple debug output is one thing, formatting several lines for an LCD is another.

When I did 68HC11 assembler programming, I had good luck formatting and printing text on an LCD display simply by making a "virtual" LCD screen in ram, then placing characters and numbers where I wanted them, then called a simple block copy subroutine to copy the "virtual" LCD screen to the real one.

The 68HC11, BTW, only has 256 bytes of SRAM, 512 bytes of EEPROM and a 64K address space (shared by eeprom, sram and registers).

Now THAT'S a processor where you need to watch each and every byte!
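The "virtual screen" idea described above translates directly to C++. This is only a sketch under assumptions: a 2x16 character display, with a plain array (`hardware`) standing in for the real LCD memory, and all names (`VirtualLcd`, `put`, `flush`) made up for the example.

```cpp
#include <cstring>
#include <string>

// Assumed display geometry: 2 rows of 16 characters.
const int ROWS = 2;
const int COLS = 16;

struct VirtualLcd {
    char shadow[ROWS][COLS];    // the "virtual" screen held in RAM
    char hardware[ROWS][COLS];  // pretend device memory

    VirtualLcd() {
        std::memset(shadow, ' ', sizeof shadow);
        std::memset(hardware, ' ', sizeof hardware);
    }

    // Place text at (row, col), clipping at the right-hand edge.
    void put(int row, int col, const char *text) {
        if (row < 0 || row >= ROWS || col < 0) return;
        for (; *text && col < COLS; ++text, ++col) shadow[row][col] = *text;
    }

    // One block copy pushes the whole virtual screen to the display.
    void flush() { std::memcpy(hardware, shadow, sizeof shadow); }

    std::string line(int row) const { return std::string(hardware[row], COLS); }
};

inline VirtualLcd demoLcd() {
    VirtualLcd lcd;
    lcd.put(0, 0, "Temp:");
    lcd.put(0, 6, "23 C");
    lcd.put(1, 12, "OVERFLOW");  // clipped to the last 4 columns
    lcd.flush();
    return lcd;
}
```

Labels and values can be placed independently and in any order; nothing reaches the display until `flush()`, so there is no need to track cursor positions across a string of print calls.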

Delta_G:
Not bad depending on what you're going for. Convenience or code size. I'd like to remind that the OP in this case was concerned with code size and efficiency.

Ah, but what is "efficiency"? Ease of writing code? Easy to read and debug source? Small, compact machine code? Tight, fast running loops?

"Efficiency" can mean many different things.

krupski:
Now THAT'S a processor where you need to watch each and every byte!

Getting off-topic now...
Ah yes, programmers these days think they're hard done by with Kilobytes of memory.

Many moons ago I worked for Texas Instruments and then National Semiconductor (now joined) on very early micros. The NatSemi COP400 series was a 4-bit processor with a huge 64 NIBBLES of RAM. You didn't even have a byte to watch.

Can't exactly say those were the good old days tho' :slight_smile:

Yours,
TonyWilk

ATtiny13, which is sort-of supported by tinycore, has 1k flash and 64 bytes RAM.
Fortunately, it doesn't have a serial port, either, so the choice between Serial.print and printf is a bit moot.

I would have thought that the ATmega48, with 4K flash, would have gotten more attention, but I guess the mega8 passed it in price.