Getting reliable Serial.read()'s without a stop character

Hey guys, I've been playing around with a command interpreter I recently wrote. I found that if I didn't add a delay when processing new Serial data, my new Serial data might come in two separate chunks, or more. As one person mentioned before, adding the delay is a "band-aid" solution to the problem. Here's a way I found that will prevent that reliably, though it may still be seen as a crutch to the program. I'd love some feedback on this. Thanks! Oh, and I hope the comments don't get in the way...they are supposed to be there to help, but who knows if they will or not.

void setup()
{
  //let's start by starting some serial communication
  Serial.begin(57600);
}

//then initialize some needed variables here...

//character array to store the command
char command[30];

//index for character array
int i = 0; 

//lets us know if a command is available to be processed or not
boolean commandAvailable = false; 

//how many times will the program check to see if serial is
//available from the buffer before flagging the command as read?
#define TRIES 50  

//a variable that will be used again and again for checking the above    
int tries = TRIES;

void loop()
{
  //check if serial is available and that we haven't exceeded the number of tries
  //and that there is still room in the command buffer
  //(one byte is kept free for the terminating '\0')
  if (Serial.available() > 0 && tries > 0 && i < (int)sizeof(command) - 1)
  {
    command[i++] = Serial.read(); //store a byte in the array, then increment the index
    commandAvailable = true; //a command will eventually be ready...
    
    //we will keep checking for data at the buffer for TRIES
    //number of times until we are convinced there is no more data to be read at this time.
    tries = TRIES; 
  }
  else
  {
    //we either:
    //1. didn't find any data at the buffer, or 
    //2. we've exceeded the command size, or 
    //3. we have looked for data an appropriate number of times.
    //Thus, we need to decrement tries to say that we've looked
    //for data yet another time.
    tries--;
  }
  
  //if we have a command available and we have reached zero
  //indicating we have looked for data TRIES number
  //of times, we are ready to see the command.
  if (tries <= 0 && commandAvailable) 
  {
    //let's get rid of all that other data in case the buffer still has some left in it.
    Serial.flush(); 
    
    //print out the command we received
    Serial.println(command);
    
    //reset the index for the next time
    i = 0;
    
    //reset the number of tries for next time
    tries = TRIES;
    
    //confirm that we've checked the command, so mark this false
    commandAvailable = false;
    
    //write over the command space so we don't see previous characters
    //for a shorter command next time
    for (int i = 0; i < sizeof(command); i++)
    {
      command[i] = 0;
    }
  }
}
    //let's get rid of all that other data in case the buffer still has some left in it.
    Serial.flush();

Why?

    //write over the command space so we don't see previous characters
    //for a shorter command next time
    for (int i = 0; i < sizeof(command); i++)
    {
      command[i] = 0;
    }

Lovely. Insert NULLS in every position because you don't understand what NULL-terminated means.

How fast does loop() operate? Let's assume that no serial data is sent.

  1. Check for serial data (Serial.available() gets called. It computes and returns the difference between two pointers - two uint8_t accesses and a subtraction operation).
  2. Compare the result to 0. It's not greater, so
  3. Decrement tries (an int access, a subtract, and an int store)
  4. Another if test with an int access, a compare, a byte access and another compare
How long do you think this will take to execute? I'll wait while you figure it out. Remember the Arduino operates at 16 MHz, so each clock cycle takes 62.5 nanoseconds, and most instructions take only one or two cycles.
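To put rough numbers on that exercise, here is a back-of-the-envelope sketch in plain C++ (not Arduino code). It assumes the default 8N1 framing, so each character costs 10 bits on the wire, and it uses the 57600 baud rate from the original sketch. The function names are made up for illustration.

```cpp
#include <cassert>

// Time one character occupies on the wire, in microseconds.
// Assumes 8N1 framing: 1 start bit + 8 data bits + 1 stop bit = 10 bits.
double microsPerChar(long baud) {
  assert(baud > 0);
  return 10.0 * 1000000.0 / baud;
}

// Number of CPU clock cycles that elapse while one character arrives.
double cyclesPerChar(long baud, double cpuHz) {
  return microsPerChar(baud) * cpuHz / 1000000.0;
}

// Minimum number of cycles each pass through loop() must take so that
// `tries` empty passes last at least one full character time.
double minCyclesPerTry(long baud, double cpuHz, int tries) {
  assert(tries > 0);
  return cyclesPerChar(baud, cpuHz) / tries;
}
```

At 57600 baud and 16 MHz, one character takes about 174 microseconds, or roughly 2778 clock cycles. With TRIES set to 50, the countdown only bridges the gap between two back-to-back characters if each pass through loop() costs about 56 cycles or more, and it cannot bridge a pause in typing at all.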

As one person mentioned before, adding the delay is a "band-aid" solution to the problem. Here's a way I found that will prevent that reliably, though it may still be seen as a crutch to the program.

It is.

Your 'tries' method really just boils down to another form of delay, just a non-blocking (and rather convoluted) version.

What do you see wrong with utilizing a stop character?

Paul, you are coming off as rude, to be honest. I am not a professional programmer, and I'm not calling myself one. As a college student who got into Arduinos a couple of years ago with the Diecimila, I would recommend encouraging new people instead of making them feel/look stupid. Just my two cents.

Why am I flushing out the buffer? Well, I figure that for my purposes, each of my commands would be no longer than thirty characters. Any leftovers in there would be read on the next round and added to the first, but please let me know if I'm missing something here.

I had never even realized C and other languages terminate a string with \0 before you mentioned it. If that's the case, then I'd expect I really don't need to overwrite my character array. Thanks for the tip!
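A minimal sketch of that point, in plain C++ so it can be tried off the board: writing a single '\0' after the last received byte is enough, and whatever stale bytes sit behind it are never looked at by string functions. The helper name storeCommand is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies up to cap-1 bytes of a received command into buf and writes a
// single '\0' after the last byte. Returns the number of bytes stored.
std::size_t storeCommand(char *buf, std::size_t cap,
                         const char *src, std::size_t len) {
  assert(cap > 0);
  std::size_t n = (len < cap - 1) ? len : cap - 1;
  std::memcpy(buf, src, n);
  buf[n] = '\0';  // one terminator is enough; bytes after it are ignored
  return n;
}
```

Storing "relay on" and then "off" into the same 30-byte buffer leaves the old 'y' at index 4, but printing or comparing the buffer still gives just "off" because the terminator at index 3 ends the string.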

If the Arduino could grab the entire serial data from the computer at one time and store it in the buffer, you could read the buffer one time and one time only. However, it doesn't do that apparently. Since the Arduino cannot accurately guess the length of the next command it receives (arbitrary length in my case, obviously), there doesn't seem to be a "good" way of solving this problem unless you add a stop character. I don't want to add a stop character since my command interpreter is being used directly from a terminal, and that's just one more thing for the user to type, every single time. Could I write a front end for the program to add a stop character? Sure, but that is just more work and it wouldn't be as portable. I wanted this Arduino to be able to run from any modern computer with USB and a terminal program. A stop character seems more like a crutch to me, especially if your Arduino is not having to do a lot of processing. My needs required it to turn on and off a few relays. Not too hard. Thus, I was hardly worried about efficiency except for the ease-of-use for the user.

I agree, this code is not good for programs that require speed. But this seemed to work for me.

antiquekid3:
Since the Arduino cannot accurately guess the length of the next command it receives (arbitrary length in my case, obviously), there doesn't seem to be a "good" way of solving this problem unless you add a stop character.

It's a streaming protocol. It's nothing to do with the Arduino.

That is, there is no inherent way of saying "a command has arrived". Looking for pauses, or deciding "if it all arrived in 10 ms it must be a command", is just flaky.

The way things have worked for a long time is that the newline character (generally) is the "end of command" character. So you type "start motor \n" where \n is the newline you get when you press the Enter key. That's your stop character.

So all you have to do in loop() is have a reasonably sized buffer. Check if Serial.available() is not zero, and if so, get the next character with Serial.read(). If that happens to be a newline then that is your "stop character" and you process the command collected so far (and then empty out the buffer). If not, you append it to your buffer, checking that you don't overflow what you allocated.
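Roughly, that logic could look like the following. This is a sketch in plain C++ rather than a full Arduino program, so the serial port is left out: in a real sketch you would call feed(Serial.read()) inside loop() whenever Serial.available() is greater than zero. The class name LineReader, its methods, and the 30-byte capacity (borrowed from the original post) are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Accumulates characters until a newline arrives.
class LineReader {
 public:
  // Feeds one received character. Returns true when a complete command
  // is ready; the next call to feed() starts a new command.
  bool feed(char c) {
    if (ready_) {  // previous command was handed out; start fresh
      len_ = 0;
      ready_ = false;
    }
    if (c == '\r') return false;  // ignore CR so CR+LF endings also work
    if (c == '\n') {              // the stop character
      buf_[len_] = '\0';
      ready_ = true;
      return true;
    }
    if (len_ < kCapacity - 1) buf_[len_++] = c;  // drop overflow characters
    return false;
  }

  // True if the last complete command equals cmd.
  bool is(const char *cmd) const { return std::strcmp(buf_, cmd) == 0; }
  std::size_t length() const { return len_; }
  const char *text() const { return buf_; }

 private:
  static constexpr std::size_t kCapacity = 30;
  char buf_[kCapacity] = {0};
  std::size_t len_ = 0;
  bool ready_ = false;
};
```

Nothing here depends on timing: it does not matter whether "start motor" arrives in one chunk or eleven, because the command is only acted on once the newline shows up.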

Some systems have a "each key is a command" system where each time you type something, it is interpreted. Of course that tends to limit you to single-character commands.

I agree with Paul that you don't want to flush every time, that is just throwing away pending information. What that is like is opening your email program, deleting everything in your inbox, and then sitting back and waiting for mail. Too bad if mail arrived before you opened the mail program, you just deleted it.

Paul, you are coming off as rude, to be honest.

Wasn't my intention.

I am not a professional programmer, and I'm not calling myself one.

I figured that out. Right away.

Why am I flushing out the buffer? Well, I figure that for my purposes, each of my commands would be no longer than thirty characters.

Then you should be dumping, unread, no more than 29 characters, not the potentially 127 characters in the buffer.

I had never even realized C and other languages terminate a string with \0 before you mentioned it.

Perhaps you should have paid better attention in class. (Now, I am moving closer to that "rude" line, just so you know the difference). This is pretty fundamental.

If the Arduino could grab the entire serial data from the computer at one time and store it in the buffer, you could read the buffer one time and one time only. However, it doesn't do that apparently. Since the Arduino cannot accurately guess the length of the next command it receives (arbitrary length in my case, obviously), there doesn't seem to be a "good" way of solving this problem unless you add a stop character.

Here's your same "packet" without stop markers:

IftheArduinocouldgrabtheentireserialdatafromthecomputeratonetimeandstoreitinthebufferyoucouldreadthebufferonetimeandonetimeonlyHoweveritdoesn'tdothatapparentlySincetheArduinocannotaccuratelyguessthelengthofthenextcommanditreceivesarbitrarylengthinmycaseobviouslytheredoesn'tseemtobea"good"wayofsolvingthisproblemunlessyouaddastopcharacter

Nowhere near as easy to read, is it?

I don't want to add a stop character since my command interpreter is being used directly from a terminal, and that's just one more thing for the user to type, every single time.

From what terminal? The Serial Monitor with 0022 has the ability to add end-of-packet characters automatically.

A stop character seems more like a crutch to me

I guess we'll just have to agree to disagree.

I wanted this Arduino to be able to run from any modern computer with USB and a terminal program.

Don't forget that you'll need to install the necessary drivers for whatever Arduino you are using.

My needs required it to turn on and off a few relays.

You could do this with one letter per relay. Use a capital letter for on, and a lower case for off. Then, each command is exactly one letter long, and you don't need a crutch. I mean end-of-packet marker.
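A sketch of the one-letter scheme, in plain C++. The mapping of 'A'/'a' to the first relay, 'B'/'b' to the second, and so on is an assumption, as are the names RelayCommand and decodeRelayCommand. On the Arduino you would follow a valid result with a digitalWrite() on whatever pin that relay uses.

```cpp
#include <cassert>

struct RelayCommand {
  bool valid;  // false if the character does not name a relay
  int relay;   // zero-based relay index: 'a'/'A' -> 0, 'b'/'B' -> 1, ...
  bool on;     // true for an uppercase letter, false for lowercase
};

// Decodes a single received character into a relay command.
RelayCommand decodeRelayCommand(char c, int relayCount) {
  assert(relayCount >= 0 && relayCount <= 26);
  RelayCommand cmd = {false, 0, false};
  int index;
  if (c >= 'A' && c <= 'Z') {
    index = c - 'A';
    cmd.on = true;
  } else if (c >= 'a' && c <= 'z') {
    index = c - 'a';
    cmd.on = false;
  } else {
    return cmd;  // not a letter: ignore (newlines, spaces, noise)
  }
  if (index >= relayCount) return cmd;  // letter beyond the last relay
  cmd.valid = true;
  cmd.relay = index;
  return cmd;
}
```

Because every command is exactly one character, each byte read from the serial port is a complete command on its own and there is nothing to assemble.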

If your incoming data is of a variable length you have two choices

a) Decide that after a certain time with no characters received you parse the command (dodgy and not even remotely good with user input).
b) Use an end character.

As just about every human keyboard interface in history has used a CR and/or LF to determine the end of a command, I can't see what the problem is with doing the same. It's second nature to go "type type type return".


Rob

I had a similar problem years ago, and if I remember well, I had to create a very simple protocol in which each transmitted frame began with a length byte. I had to do that because a single frame could potentially contain any value, including the stop character. So basically, what I did was something like this (in pseudo code):

Emitting end point:

  • Check size of data to be sent.
  • Send this size (You have to choose whether this size is stored as byte, word, etc based on the max length of the data you would transmit).
  • Send the real data, with no termination packet.

Receiving end point:

  • Poll data until you have the size (If you're using a byte, poll just one byte, if you're using a word, poll two bytes and assemble them knowing which one is LO / HI).
  • Then loop-poll data for the size you just got. You don't have to test for a terminator.
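The two lists above can be sketched as follows, in plain C++ with a one-byte length. The names encodeFrame and LengthPrefixedReader are made up for illustration, and std::vector is used only to keep the example short; on an Arduino you would more likely use a fixed-size array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Emitting end: one length byte followed by the payload, no terminator.
std::vector<std::uint8_t> encodeFrame(const std::vector<std::uint8_t> &payload) {
  assert(payload.size() <= 255);  // a one-byte length caps frames at 255 bytes
  std::vector<std::uint8_t> frame;
  frame.push_back(static_cast<std::uint8_t>(payload.size()));
  frame.insert(frame.end(), payload.begin(), payload.end());
  return frame;
}

// Receiving end: feed bytes one at a time as they arrive.
class LengthPrefixedReader {
 public:
  // Returns true when the payload announced by the length byte is complete.
  bool feed(std::uint8_t b) {
    if (!haveLength_) {
      payload_.clear();
      remaining_ = b;  // first byte of a frame is the payload length
      haveLength_ = true;
    } else {
      payload_.push_back(b);  // any value is allowed here, even '\n' or 0
      --remaining_;
    }
    if (remaining_ == 0) {
      haveLength_ = false;  // the next byte is the next frame's length
      return true;
    }
    return false;
  }
  const std::vector<std::uint8_t> &payload() const { return payload_; }

 private:
  std::vector<std::uint8_t> payload_;
  std::size_t remaining_ = 0;
  bool haveLength_ = false;
};
```

Note that the receiver trusts the length byte completely, so a single lost or corrupted byte shifts every frame after it.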

I don't know if this would work on the Arduino, and maybe it would be a mistake, but I will for sure try it when I start playing with serial devices.

Pierre.

Poll data until you have the size

Serial data delivery is not guaranteed. Bytes can get lost. What happens if one of the bytes is part of your length data? You get a wrong length, of course. Then, you can never get back in sync.

PaulS:

Poll data until you have the size

Serial data delivery is not guaranteed. Bytes can get lost. What happens if one of the bytes is part of your length data? You get a wrong length, of course. Then, you can never get back in sync.

You're right. So I would establish a small protocol like this:
bytes 0 to 2: size of packet (1 byte), then the expected checksum, also used as end delimiter (2 bytes).
bytes 3 to n+2: packet (n bytes).
bytes n+3 to n+4: delimiter again (2 bytes).
In case of a loss of sync, the detection is easy. Jumping back in sync is a matter of considering each received byte as a starting point again, until incrementally we reach a matching size / delimiter pair on both ends (At most, one whole packet would be dropped while re-syncing, assuming that there was no loss in the next transmission). Right?
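Here is one way that layout and the re-sync scan might look, in plain C++. The post does not say how the checksum is computed, so a 16-bit sum of the payload bytes, sent high byte first, is assumed here purely for illustration; the names checksum, buildFrame, and findFrame are hypothetical too.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Assumed checksum: 16-bit sum of n bytes starting at data[start].
std::uint16_t checksum(const Bytes &data, std::size_t start, std::size_t n) {
  std::uint16_t sum = 0;
  for (std::size_t k = 0; k < n; ++k)
    sum = static_cast<std::uint16_t>(sum + data[start + k]);
  return sum;
}

// Layout: size (1) | checksum (2) | payload (n) | checksum again (2).
Bytes buildFrame(const Bytes &payload) {
  assert(payload.size() <= 255);
  std::uint16_t sum = checksum(payload, 0, payload.size());
  Bytes f;
  f.push_back(static_cast<std::uint8_t>(payload.size()));
  f.push_back(static_cast<std::uint8_t>(sum >> 8));
  f.push_back(static_cast<std::uint8_t>(sum & 0xFF));
  f.insert(f.end(), payload.begin(), payload.end());
  f.push_back(static_cast<std::uint8_t>(sum >> 8));
  f.push_back(static_cast<std::uint8_t>(sum & 0xFF));
  return f;
}

// Treats every offset from `from` onward as a possible frame start and
// returns the first one where size, both checksum copies, and the payload
// agree. On success fills `payload` and sets `next` past the frame.
bool findFrame(const Bytes &stream, std::size_t from,
               Bytes &payload, std::size_t &next) {
  for (std::size_t pos = from; pos + 5 <= stream.size(); ++pos) {
    std::size_t n = stream[pos];
    if (pos + n + 5 > stream.size()) continue;  // would run past the data
    std::uint16_t head = static_cast<std::uint16_t>(
        (stream[pos + 1] << 8) | stream[pos + 2]);
    std::uint16_t tail = static_cast<std::uint16_t>(
        (stream[pos + n + 3] << 8) | stream[pos + n + 4]);
    if (head != tail || head != checksum(stream, pos + 3, n)) continue;
    payload.assign(stream.begin() + pos + 3, stream.begin() + pos + 3 + n);
    next = pos + n + 5;
    return true;
  }
  return false;
}
```

With a checksum this weak, a random run of bytes can occasionally look like a valid frame, so the scan can lock onto the wrong spot; a stronger check such as a CRC makes that less likely but does not rule it out.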

Right?

Yes, but by the time you add all this stuff, converting the data to send to ASCII and sending strings is much simpler.

PaulS:
Yes, but by the time you add all this stuff, converting the data to send to ASCII and sending strings is much simpler.

Right PaulS.

Out of curiosity, how would you send ASCII data (Null terminated) for a range of required values going from 0 (included) to 255 (Included)? Would i have to arbitrarily remove a value from this range (So only 255 different values can be sent) and use the arbitrarily chosen value as a delimiter?

Or you could use a "break" character.

Out of curiosity, how would you send ASCII data (Null terminated) for a range of required values going from 0 (included) to 255 (Included)?

I'd send "<0>", "<1>", "<2>", ..., "<253>", "<254>", "<255>". That's ASCII data, including start and end markers, for all values from 0 to 255.
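As a rough illustration of why the markers help, here is a plain C++ sketch of both ends. The names encodeValue and MarkerParser are made up; on the Arduino the sending side could simply be Serial.print('<'), Serial.print(value), Serial.print('>'), since printing an integer already produces decimal ASCII.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats one value as text between start and end markers, e.g. 42 -> "<42>".
std::string encodeValue(unsigned char value) {
  char buf[8];
  std::snprintf(buf, sizeof(buf), "<%u>", static_cast<unsigned>(value));
  return std::string(buf);
}

class MarkerParser {
 public:
  // Feeds one received character. Returns true when a complete "<digits>"
  // packet with a value in 0..255 has arrived; value() then holds it.
  bool feed(char c) {
    if (c == '<') {  // a start marker always begins a new packet
      inPacket_ = true;
      digits_ = 0;
      acc_ = 0;
      return false;
    }
    if (!inPacket_) return false;  // ignore anything outside the markers
    if (c >= '0' && c <= '9') {
      acc_ = acc_ * 10 + (c - '0');
      ++digits_;
      if (digits_ > 3 || acc_ > 255) inPacket_ = false;  // malformed packet
      return false;
    }
    inPacket_ = false;  // '>' or an unexpected character ends the packet
    if (c == '>' && digits_ > 0) {
      value_ = acc_;
      return true;
    }
    return false;
  }
  int value() const { return value_; }

 private:
  bool inPacket_ = false;
  int digits_ = 0;
  int acc_ = 0;
  int value_ = 0;
};
```

If a byte goes missing, the damage is limited to the packet it belonged to: the parser discards the fragment and picks up again at the next '<'.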

PaulS:

Out of curiosity, how would you send ASCII data (Null terminated) for a range of required values going from 0 (included) to 255 (Included)?

I'd send "<0>", "<1>", "<2>", ..., "<253>", "<254>", "<255>". That's ASCII data, including start and end markers, for all values from 0 to 255.

Hm. I didn't picture it like that. Isn't decoding all that a waste of Arduino CPU power when there is a lot of data?
To get a single byte of data, you have to send a minimum of 3 bytes and a maximum of 5 bytes? So for n values, your frame length is somewhere between 3n and 5n bytes?
Why not directly sending something like that according to your example values:
0x00, 0x01, 0x02, ..., 0xfd, 0xfe, 0xff, 0x00, 0x00 ?
So your frame length = nBytes+2, and your delimiter is a 0x00 pair? Is it again related to the data loss fact? What ratio of loss is observable in real-life circuits over a reasonable length of wire?

Why not encoding something like that according to your example values:
0x00, 0x01, 0x02, ..., 0xfd, 0xfe, 0xff, 0x00, 0x00 ?
So your frame length = nBytes+2, and your delimiter is a 0x00 pair?

Because that wasn't what you asked.

Ummm, did I miss something? Aren't we talking about a human typing into a terminal program? If so, the only "protocol" needed is

PC
Send every character to the Arduino.

Arduino
Read characters until you get \n
Parse command.

that's just one more thing for the user to type, every single time.

What is? The return? It's the most natural thing in the world to hit ENTER after typing a command.


Rob

A stop character seems more like a crutch to me

If that were the case, then the vast majority of communication protocols (if not all of them) out there rely on some form of this crutch to transmit data reliably. Ethernet TCP and UDP. HTML. XML. RSS. RS-232. RS-485. I2C. CAN bus. DeviceNet. Email. Text messaging. etc. etc. The list just goes on and on and on. Basically, virtually every form of electronic communications involves some form of start of data and stop of data, whether it's a uniquely identifiable bit or signal pattern, or just a known arbitrary signal state. Some have fairly complicated protocols, others extremely simple. Many stack multiple layers of communications protocols on top of each other, all utilizing their own particular protocol to determine the start and end of a discrete packet of data. So, you may want to rethink your view of the ubiquitous stop character. In fact, the very RS-232 serial protocol you're using to communicate with your Arduino utilizes a start and stop bit in its own protocol.

That being said, your stop character could quite simply be a carriage return. Since it's fairly standard to have to hit the enter key to execute the command you just typed in, it's hardly an imposition on the user.

Paul, you seem to have jumped to the conclusion that I have already studied C in school. In fact, I'm taking my second semester of Java, and not once have we talked about how strings are terminated. Only EEs take C; ECPEs like myself study Java. Unfortunately for me, the Java classes offered here use Java as a high-level, OO language, as it was intended, and not to teach basic low-level characteristics of strings, such as how they are terminated.

Rob, I didn't realize the new line character was also sent. I just assumed pressing enter in the serial monitor was equivalent to hitting "send." Thank you for the help!

By no means do I think the stop character is a crutch in the protocols you listed. I only think it would be a crutch if I had to send, say, a semicolon after every command. Now that I know the terminal program also sends the carriage return, I'd say the problem is solved.

I just assumed pressing enter in the serial monitor was equivalent to hitting "send."

The latest Arduino IDE has an option to append CR/LF (or not) to the characters sent.