Simple, useful additions to HardwareSerial

Hi all,

I've seen many posts from people asking how to get data from the serial port and/or get numeric data from the serial port.

The following is some simple code that you can add to your existing HardwareSerial library to provide this functionality:

(add this to the end of HardwareSerial.cpp)

char *HardwareSerial::readline (const char *prompt)
{
	int c, len;
	char *buf;

	len = 0;
	buf = NULL;

	this->print (prompt); // print user prompt if any

	while (1) { // continuous until end of line

		while (! (this->available()));

		c = this->read();

		buf = (char *) realloc (buf, ((len + 1) * sizeof (char))); // allocate a buffer byte

		switch (c) {

			case 0x0D:
			case 0x0A: { // CR or LF

				if (len) { // if there is something in the buffer...
					buf[len] = 0; // mark EOL

				} else {
					free (buf); // empty buffer - return null
					buf = NULL;
				}

				return buf; // return
			}

			case 0x08:
			case 0x7F: { // backspace

				if (len > 0) { // if we CAN backspace....
					len--; // move pointer back
					buf[len] = 0; // invalidate char
					this->write ((uint8_t) 0x08); // back up 1
					this->write ((uint8_t) 0x20); // blank out char
					this->write ((uint8_t) 0x08); // back up again
				}

				break;
			}

			default: { // printable characters
				if (isprint (c)) {
					buf[len] = c; // put char into buffer
					len++; // prepare for another char
					this->write ((uint8_t) c); // echo char
				}

				break;
			}
		}
	}
}

long int HardwareSerial::parseInt (const char *arg)
{
	size_t n;
	size_t len = strlen (arg);

	for (n = 0; n < len; n++) {
		// allow decimal point
		if ((isdigit (arg[n])) || (arg[n] == '.')) {
			break;
		}
	}

	if (strncmp ((arg + n), "0x", 2) == 0) {
		// parse as hex if it's "0x"
		return strtol ((arg + n), NULL, 16);

	} else {
		// else it must be decimal
		return strtol ((arg + n), NULL, 10);
	}
}

double HardwareSerial::parseFloat (const char *arg)
{
	size_t n;
	size_t len = strlen (arg);

	for (n = 0; n < len; n++) {
		// allow decimal point
		if ((isdigit (arg[n])) || (arg[n] == '.')) {
			break;
		}
	}

	return strtod ((arg + n), NULL);
}

(add this to HardwareSerial.h at the bottom of the class definitions [inside the public area])

		char *readline (const char *);
		long int parseInt (const char *);
		double parseFloat (const char *);

Lastly, here's a sketch that shows how it all works:

void setup (void)
{
        Serial.begin (115200); // set your baud rate
}

void loop (void)
{
        char *buf;
        long int i;
        double d;

	buf = Serial.readline ("\nType something: ");

	if (buf) { // must test for non-NULL because blank line returns NULL
		Serial.print ("\nYou typed: '");
		Serial.print (buf);
		Serial.print ("'\n");
		i = Serial.parseInt (buf);
		d = Serial.parseFloat (buf);
		Serial.print ("It's int value is: ");
		Serial.print (i);
		Serial.print ("\n");
		Serial.print ("It's float value is: ");
		Serial.print (d);
		Serial.print ("\n");
		free (buf); // must free buffer after each use (only if not NULL)
	}
}

Note that with parseInt, you can prefix the string with "0x" and it will be parsed as HEX.

Typical output of the test sketch:

[b]

Type something: this is a test
You typed: 'this is a test'
It's int value is: 0
It's float value is: 0.00

Type something: .4
You typed: '.4'
It's int value is: 0
It's float value is: 0.40

Type something: 0.45
You typed: '0.45'
It's int value is: 0
It's float value is: 0.45

Type something: 0x1000
You typed: '0x1000'
It's int value is: 4096
It's float value is: 0.00[/b]

Hope this is of some use to you.....

I don't like the parseInt() and parseFloat() functions because they block.

...R

Robin2:
I don't like the parseInt() and parseFloat() functions because they block.

...R

parseint and parsefloat do not block.... readline does (and of course, it HAS to if you think about it).

Sorry. RTFM

Parsing stops when no characters have been read for a configurable time-out value, or a non-digit is read;

...R

Robin2:
I don't like the parseInt() and parseFloat() functions because they block.

...R

And this isn't really a bad thing at all. If you are going out of your way to implement a non-blocking solution, when your application cannot do anything until the input has been received, you have simply re-invented the wheel... Basically your non-blocking code has implemented the blocking method at a higher level.

This argument is only worthy if you actually have something to do while waiting.

Embrace OOP! Rather than modifying a standard library (BAD idea!), which will break when you update the IDE, create a new class that inherits from HardwareSerial, and simply add your new functionality to that. Put the new class in your local library directory, and it will automatically work on updated IDEs.

Regards,
Ray L.

buf = (char *) realloc (buf, ((len + 1) * sizeof (char)));

If you MUST use malloc()/etc, PLEASE don't grow your buffer one byte at a time! That's particularly inefficient and prone to working poorly (causing fragmentation/etc) in limited-RAM environments.

create a new class that inherits from HardwareSerial

I've been wondering about this. It sounds like a great idea, but ... somtimes "Serial" is a HardwareSerial object, sometimes it is a UARTClass, sometimes it's a USARTClass, and sometimes it's a USB thing. Can a write a derived class (or a subroutine that takes "SerialN" as an argument) that handles all that? Is this an example of bad design in the Arduino Software? (Should the hardware be something that stream eventually inherits FROM, instead of the other way around? I tried to figure out how that would work, and I just got confused.)

Here's one way to implement a (probably) non-blocking getline() function:

/*
 * parseGetline_nb
 * Read a line of text from Serial into the internal line buffer.
 * With echoing and editing!
 * Non-blocking.  Returns 0 until end-of-line seen.
 */
void parseReset()
{
    /*
     * Reset the line buffer
     */
    memset(linebuffer, 0, sizeof(linebuffer));
    inptr = 0;
    parseptr = 0;
    termchar = 0;
}

uint8_t parseGetline_nb(void)
{
    int c;

    c = Serial.read();
    switch (c) {
    case 127:
    case CTRL('H'):
	/*
	 * Destructive backspace: remove last character
	 */
	if (inptr > 0) {
	    Serial.print("\010 \010");
	    linebuffer[--inptr] = 0;
	}
    break;
    case CTRL('R'):
	/*
	 * Ctrl-R retypes the line
	 */
	Serial.print("\r\n");
	Serial.print(linebuffer);
	break;
    case CTRL('U'):
	/*
	 * Ctrl-U deletes the entire line and starts over.
	 */
	Serial.println("XXX");
	memset(linebuffer, 0, sizeof(linebuffer));
	inptr = 0;
	break;
    case CTRL('M'):
	Serial.println();			/* Echo newline too. */
	return inptr;
    default:
	/*
	 * Otherwise, echo the character and put it into the buffer
	 */
	linebuffer[inptr++] = c;
	Serial.write(c);
    case -1:
	return 0;
	break;
    }
}

pYro_65:
And this isn't really a bad thing at all...
This argument is only worthy if you actually have something to do while waiting.

Oooo... not really. It makes the sketch fragile when modified. Many, many users start with a blocking approach and cannot figure out how to "do something else".

Rather than encourage this "linear" approach to programming, I would suggest developing a non-blocking prompt/response example, as suggested by westfw.

Another way to look at it: Of the n ways to structure a program to accomplish a series of tasks, only 1 of them is the expected/desired/forced linear path. Although it is easy for a beginner to understand, there are n-1 other ways to do the same thing without blocking. All of those n-1 sketches will be easier to modify to do "other things".

Also, using malloc/realloc/free is a no-no. There is exactly 1 way to use it safely, and there is always another way to do the same thing. Unfortunately, there are n-1 ways that it will cause trouble. As has been discussed, it is very unlikely that a non-expert will get it right. Even then, it will be susceptible to later modifications that break the rules. (The OP's suggested methods are not safe.)

westfw:
I've been wondering about [Serial class design]. Is this an example of bad design in the Arduino Software? (Should the hardware be something that stream eventually inherits FROM, instead of the other way around? I tried to figure out how that would work, and I just got confused.)

Check out Cosa's faster, smaller, cleaner OOP design. Tasty! But it's not something the "typical" Arduino user could use. The author has recently added matching examples, like Blink, if you'd like to give it a whirl.

Cheers,
/dev

westfw:
If you MUST use malloc()/etc, PLEASE don't grow your buffer one byte at a time! That's particularly inefficient and prone to working poorly (causing fragmentation/etc) in limited-RAM environments.

What's wrong with malloc()? First of all, "efficiency" is of no importance here... given that malloc() is only called once per keystroke. I've been using this type of "allocate as needed" type of coding for many different things and have never run into any problems with it. Are you saying that AVR-GCC's implementation of malloc() is flawed?

The whole point for doing this is to use as little ram as possible. The alternative is to allocate a "best guess sized" buffer and hope that (A) it's big enough and (B) that it's not too big, thereby wasting ram.

I've been wondering about this. It sounds like a great idea, but ... somtimes "Serial" is a HardwareSerial object, sometimes it is a UARTClass, sometimes it's a USARTClass, and sometimes it's a USB thing. Can a write a derived class (or a subroutine that takes "SerialN" as an argument) that handles all that? Is this an example of bad design in the Arduino Software? (Should the hardware be something that stream eventually inherits FROM, instead of the other way around? I tried to figure out how that would work, and I just got confused.)

Yes, it's easy to write code that takes "Serial" as a parameter (in fact, my [u]Stdinout[/u] library does exactly that).

Don't know, however, how that would be used to provide additional "Serial.xxxx" functions.

Maybe better would be to write the extras separately and then use
** **#include** **
to pull them in.

The reason I put the code inside "HardwareSerial" and used "this->xxxx" was so that whatever Serial was used (such as Serial1 or Serial2, etc...) it would always work.

Lastly, I had to laugh... I bet myself how many posts it would be before someone mentioned "blocking code" and I was right....... :slight_smile:

/dev:
Also, using malloc/realloc/free is a no-no.

Oh? Please do tell. Why?

Haven't you just basically re-invented the String based methods of HardwareSerial? If you use malloc/realloc, I think you just as well can use HardwareSerial::readString().

What I don't like:
1)
you never check the result of the memory allocation for NULL; when using your code, it will show undefined behaviour when you run out of memory.
2)
it blocks
3)
when line termination is , it will return an empty line on the next readLine().
// edit: added number 4
4)
You never free buf when a or is received; with the given approach that is not possible because you have to return buf. You will be out-of-memory somewhere along the line.

The below code demonstrates (I have made readline a function in the sketch)

#include "MemoryFree.h"


char txBuffer[64];

void setup()
{
  Serial.begin(115200);

  Serial.print(F("freeMemory() before = "));
  Serial.println(freeMemory());
  readline();
  Serial.print(F("\nfreeMemory() after = "));
  Serial.println(freeMemory());
  readline();
  Serial.print(F("\nfreeMemory() after = "));
  Serial.println(freeMemory());
}

void loop()
{
  // put your main code here, to run repeatedly:

}

char *readline ()
{

  int c, len;
  char *buf;

  len = 0;
  buf = NULL;

  while (1) { // continuous until end of line

    while (! (Serial.available()));

    c = Serial.read();

    buf = (char *) realloc (buf, ((len + 1) * sizeof (char))); // allocate a buffer byte

    switch (c) {

      case 0x0D:
      case 0x0A: { // CR or LF

        if (len) { // if there is something in the buffer...
          buf[len] = 0; // mark EOL

        } else {
          free (buf); // empty buffer - return null
          buf = NULL;
        }

        return buf; // return
      }

      case 0x08:
      case 0x7F: { // backspace

        if (len > 0) { // if we CAN backspace....
          len--; // move pointer back
          buf[len] = 0; // invalidate char
          Serial.write ((uint8_t) 0x08); // back up 1
          Serial.write ((uint8_t) 0x20); // blank out char
          Serial.write ((uint8_t) 0x08); // back up again
        }

        break;
      }

      default: { // printable characters
        if (isprint (c)) {
          buf[len] = c; // put char into buffer
          len++; // prepare for another char
          Serial.write ((uint8_t) c); // echo char
        }

        break;
      }
    }
  }
}

See Arduino playground - available memory for the additional functions.

I used a 100 character input string in serial monitor (twice) and this is the result for the reported free memory

freeMemory() before = 1825
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
freeMemory() after = 1722
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
freeMemory() after = 1619

I think that buf in your code should be a static variable and you should free it in the beginning if it's not NULL.
// endedit

To end with a positive note :slight_smile: I like the catering for the backspace.

Thanks for sharing; always interesting to see other people's approaches to problems.

On a side note
I'm a strong believer in a layered model. Reading the serial port is a low level functionality and should simply give you a byte/int back (if available). The intermediate level is the reading of multiple bytes till a certain condition is satisfied. And the high level is the processing of the received bytes.

I am therefore highly in favour of an approach as e.g. shown in Robin2's 'Serial Input Basics - updated' thread.

pYro_65:
This argument is only worthy if you actually have something to do while waiting.

Quite true.

But I think many newbies don't understand the difference and IMHO it is easier to start them off with the non-blocking concept as it will do in every situation.

...R

Haven't you just basically re-invented the String based methods of HardwareSerial?

No; none of them do "line-oriented" input, or echoing, or editing, all of which are desirable for interaction with a user on a serial port. The existing methods are aimed at interacting with other devices.

You never free buf when a or is received; with the given approach that is not possible because you have to return buf.

Well of course not. It's the caller's responsibility to free the string when it is done with the line. In fact, the ability to read multiple lines without copying each one to a secondary buffer is an advantage of this scheme over a fixed-buffer version.

What's wrong with malloc()? First of all, "efficiency" is of no importance here... given that malloc() is only called once per keystroke.

Well, perhaps. It'd be more important if the function was non-blocking and other tasks needed CPU time. As is, it's only bothersome. Although it's also against some commercial embedded programming guidelines. And it's relatively big; using malloc() and realloc() adds abour 1k to a sketch.

Are you saying that AVR-GCC's implementation of malloc() is flawed?

Probably. I mean, it doesn't detect collision with the stack, right? (Hmm. It seems to try...) It DOES seem to have a lot less overhead than the last malloc() that I used. Combined with the single-threaded nature of things, I guess it's not too likely to be TOO bad. Has anyone ever explained why Strings seem to cause problems on AVR Arduinos (when used a lot like this), other than handwaving and mumbling about fragmentation and how malloc() is bad on embedded systems? I guess that repeated realloc() of the same structure in a tight loop is going to be better than C++ getting in there and creating a "new" string so it can do "S = S + nextchar;"
Worst case, a realloc-based "add a character to an N byte string" operation will take 2*(N+2)+1 bytes, traverse the freelist multiple times, use a few words of stack, and memcpy N bytes. While a fixed buffer implementation would take Z bytes (you pick; but at least you know ahead of time how your RAM usage is looking), but have none of the "extra" code overhead.

westfw:
Well of course not. It's the caller's responsibility to free the string when it is done with the line.

Point taken; I missed that the buffer was freed in the main sketch.

The whole point for doing this is to use as little ram as possible.

Then I'd save 1K by not using malloc, and by using FLASH memory for double-quoted string literals, with the F macro.

The alternative is to allocate a "best guess sized" buffer and hope that (A) it's big enough and (B) that it's not too big, thereby wasting ram.

You should always decide what is the limit. Otherwise, the user can keep typing characters until RAM is exhausted. By analyzing what is reasonable, you can add this to all the other choices you make about size, and decide if everything will fit. Setting limits makes your program predictable and deterministic, two crucial characteristics for a sketch that runs forever.

As far as "wasting RAM", if the buffer is on the stack (i.e., a local variable), nothing is wasted. The buffer "pops" off the stack when the function returns. You can even use a variable-sized array in the caller:

void getInt( uint8_t digits )
{
        char buf[digits+1];

 if (mySerial.readline ( F("\nType something: "), buf, digits+1 )) {
                long int i = atol( buf );

With this technique, malloc does not have to search for a block of RAM... it was right there on the stack. This assumes that blocking is ok, and that the prototype for readline is:

    bool DerivedSerial::readline
        ( const __FlashStringHelper *prompt, char *line, uint16_t size )

Honestly, you'd be better off defining a class that implements a Finite-State Machine parser, give it a Stream (the base class for everything that can read characters), and call it every time through loop:

char answer[ 20 ];
ReadlineFSM reader( &Serial, answer, sizeof(answer) );

void loop()
{
  if (timeToAsk)
    reader.prompt( F("\nType Something:" ) );

  if (reader.available()) {
    // Answer is ready!
    long value = atol( answer );
  }
}

I think that reads pretty cleanly, and most people would understand how to use it. The FSM class could be something like:

class ReadlineFSM
{
private:
  Stream   *stream;
  char     *buf;
  uint16_t  size;
  uint16_t  len;
  enum state_type { NOT_ASKING, ASKING } state;

public:
  ReadlineFSM( Stream *s, char *b, uint16_t sz )
    { stream = s;
      buf    = b;
      size   = sz;
      state  = NOT_ASKING
    }

  void prompt( const __FlashStringHelper *p )
    {
      stream->print( p );
      len   = 0;
      state = ASKING;
    }

  bool available();
};

bool ReadlineFSM::available()
{
  if (stream->available()) {
    char c = stream->read();

    switch (state) {
      case NOT_ASKING;
        break;

      case ASKING:

        switch (c) {
          case 0x0D:
          case 0x0A:
            buf[ len ] = '\0';
            state = NOT_ASKING;
            return true;

       ...
    }
  }
  return false;
}

When it handles certain characters, it returns special results. And if you really want to write a blocking sketch, you can still do it:

void loop()
{
  reader.prompt( F("\nType Something:" ) );

  while (!reader.available())
    ;

  long value = atol( answer );

}

Also, using malloc/realloc/free is a no-no."

Oh? Please do tell. Why?

Well, I hope you're open to learning about this subject! I have often said that its ease-of-use is in stark contrast to its complex behaviour over time. See next.

Has anyone ever explained why Strings seem to cause problems on AVR Arduinos (when used a lot like this), other than handwaving and mumbling about fragmentation and how malloc() is bad on embedded systems?

There is no hand-waving or mumbling here, just the facts. Several links discuss why it's generally bad to use it, but be sure to read "The Evils of Arduino Strings" -- it is focused on our environment.

It's the caller's responsibility to free the string when it is done with the line."

Point taken; I missed that the buffer was freed in the main sketch.

Another point: this opens the door to memory leaks. :frowning:

Cheers,
/dev

sterretje:
Haven't you just basically re-invented the String based methods of HardwareSerial? If you use malloc/realloc, I think you just as well can use HardwareSerial::readString().

What I don't like:
1)
you never check the result of the memory allocation for NULL; when using your code, it will show undefined behaviour when you run out of memory.
2)
it blocks
3)
when line termination is , it will return an empty line on the next readLine().
// edit: added number 4
4)
You never free buf when a or is received; with the given approach that is not possible because you have to return buf. You will be out-of-memory somewhere along the line.

readString() is all fine and dandy... if you can type fast. Else, it times out. Readline is meant to acquire USER INPUT.

(1) True, but a user entering a line would have to type several thousand characters in order to run out of memory. Not likely (although I do agree the test probably should be in there).

(2) It "blocks". So what? What else is a program going to do while it's waiting for user input?

(3) I'll have to check that. I expected that most people would be using Windows or Linux or Mac terminal programs (all of which send a CR when "enter" is pressed). Some implementations of a "terminal" (like using Screen in Linux as a terminal) send a LF as end of line. However, since the code will consider a CR or an LF as an end of line, a CR/LF would end the line (via the CR), then the LF would be waiting in the buffer and would immediately exit the readline next time around (this time via the LF). Very unlikely to run into this, but it should be handled for the sake of cleanliness.

(4) Of course not! If you looked at the demo sketch, the buffer is freed AFTER it's used. Freeing it upon exiting readline would return nothing to the caller.

You may or may not know that the standard GNU Readline library (used by Linux, the Bash shell and countless other programs) all return a pointer to the buffer containing the read line which must be freed after use by the user program.

sterretje:
I used a 100 character input string in serial monitor (twice) and this is the result for the reported free memory

freeMemory() before = 1825

1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
freeMemory() after = 1722
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
freeMemory() after = 1619

Here's what I get when checking free memory:

[b]

Type something: a
You typed: 'a'
It's int value is: 0
It's float value is: 0.00
Memory before free (buf): 7676 bytes
Memory after free (buf): 7680 bytes

Type something: ab
You typed: 'ab'
It's int value is: 0
It's float value is: 0.00
Memory before free (buf): 7675 bytes
Memory after free (buf): 7680 bytes

Type something: abc
You typed: 'abc'
It's int value is: 0
It's float value is: 0.00
Memory before free (buf): 7674 bytes
Memory after free (buf): 7680 bytes

(this one was a 355 byte line)
Type something: I'd like to thank people on this list for help and advice, as this was my first ever
You typed: 'I'd like to thank people on this list for help and advice, as this was my first ever pat'
It's int value is: 0
It's float value is: 0.00
Memory before free (buf): 7321 bytes
Memory after free (buf): 7680 bytes

Type something:[/b]

You have to free the buffer after you use it.

/dev:
Then I'd save 1K by not using malloc, and by using FLASH memory for double-quoted string literals, with the F macro.

What on earth are you talking about? You propose to place a dynamic buffer in flash?

Surely you jest.

/dev:
Another point: this opens the door to memory leaks. :frowning:

Cheers,
/dev

Memory leaks are caused by poor programming.

Krupski:
Memory leaks are caused by poor programming.

And heap fragmentation.

The latter of which is only solved by more code (when possible) or resetting.