Losing data when reading from the Serial port

Hello everybody,

Although getting data from the Serial port seems to be quite straightforward, I’m still having issues with that. This post is a split of another post (http://forum.arduino.cc/index.php?topic=176890.0)

I want to receive commands on the serial port, coming from a Windows application. Before parsing the command, I must be able to receive many commands. In order to do that, all commands are ended with “^” → When I find the terminator, I process the incoming command and afterwards I get the new command from the serial port.

The sample below is the part of code retrieving data from the serial port.


Using the Serial Monitor, my test case is:
a=1^b=2^c=3^d=4^e=5^f=6^g=7^h=8^i=9^j=10^k=11^l=12^m=13^n=14^o=15^p=16^q=17^r=18^s=19^t=20^u=21^v=22^w=23^x=24^y=25^z=26^


Here is the code:
#define COMMAND_SUFFIX ‘^’

const int SERIAL_BUFFER_SIZE = 512; // make it big enough to hold your longest command
char serialBuffer[SERIAL_BUFFER_SIZE + 1]; // +1 allows space for the null terminator
int serialBufferOffset = 0; // number of characters currently in the buffer

void setup()
{
Serial.begin(9600);
}

void loop()
{
if(serialRead())
{
Serial.print("- Incoming command: ");
Serial.println(serialBuffer);
}
}

boolean serialRead()
{
boolean rc = false;
char c;

while(Serial.available())
{
c = Serial.read();
switch(c)
{
case ‘\r’:
case ‘\n’:
// Discard special characters
break;
case COMMAND_SUFFIX:
// End-of-command received
serialBufferOffset = 0;
rc = true;
break;
default:
if(serialBufferOffset < SERIAL_BUFFER_SIZE)
serialBuffer[serialBufferOffset++] = c;
serialBuffer[serialBufferOffset] = ‘\0’;
break;
}
}

return rc;
}


And here is the result:

  • Incoming command: a=1
  • Incoming command: b=2
  • Incoming command: c=3
  • Incoming command: d=4
  • Incoming command: j=1
  • Incoming command: o=1
  • Incoming command: t=2
  • Incoming command: y=2
  • Incoming command: z=26

What am I doing wrong??? I’m completely lost…

Thanks for your help
Serge

I don't know if is the problem, but I see that your function serialRead() never return false.
Try adding a return false at the end of your function. it can not hurt. :grin:

Not that it would cause your problem but the serial buffer only has to be 5 bytes in size.

I would print less, for every 4-5 bytes you receive you are printing over 20 at the same speed, the code will block before long and as the Serial input buffer is only 64 bytes you may lose characters.

Try removing the Serial.print("- Incoming command: "); line.

The fact that the first 4 prints are correct and then everything turns to crap supports this theory.


Rob

How fast is your input arriving? If your Windows application is sending it as fast as the serial connection will go that is probably the wrong answer, especially given Graynomad's observation that the Arduino is sending more than it receives.

The problem is you are not returning early enough when the data is being processed.

while(Serial.available())
  {
    c = Serial.read();
    switch(c)
    {
      case '\r':
      case '\n':
        // Discard special characters
        break;
      case COMMAND_SUFFIX:
      // End-of-command received
        serialBufferOffset = 0;
        rc = true;
        break;
      default:
        if(serialBufferOffset < SERIAL_BUFFER_SIZE)
          serialBuffer[serialBufferOffset++] = c;
        serialBuffer[serialBufferOffset] = '\0';
        break;
    }
  }

Consider the scenario where the Arduino buffer contains:

e=5^f=6^g=7^h=8^i=9^j=10^

Your loop, using (Serial.available()) is parsing the commands out of the serial stream correctly, but not returning from the processing of the serial data until all the characters have been processed.

So you get serialBuffer looking like the following each time it hits the next ^

e=5
f=6
g=7
h=8
i=9
j=10

now the serial data has all been processed, and you return true, but as you processed all the serial data before returning, you do not give your sketch the chance to display the intermediate values.

To fix this, track the current serialBuffer state (commandStarted, commandComplete, both predicated on finding a ^), and include that in your loop condition. Something like this should work:

boolean serialRead() {
...
bool stillLooking = true;
...
while (Serial.available() && stillLooking) {
<snip>
      case COMMAND_SUFFIX:
      // End-of-command received
        if (serialBufferOffset > 0) stillLooking = false;
        serialBufferOffset = 0;
        rc = true;
        break;
<snip>
}
}

First off, you need code tags around your code. Click the button above the smileys with the # on it to get a set of code tags. Paste your code between them and it will be fully readable here.

  while(Serial.available())
  {
    c = Serial.read();
    switch(c)
    {
      case '\r':
      case '\n':
        // Discard special characters
        break;
      case COMMAND_SUFFIX:
      // End-of-command received
        serialBufferOffset = 0;
        rc = true;
        break;
      default:
        if(serialBufferOffset < SERIAL_BUFFER_SIZE)
          serialBuffer[serialBufferOffset++] = c;
        serialBuffer[serialBufferOffset] = '\0';
        break;
    }
  }

There is nothing that guarantees that each new serial character will have arrived before that while loop gets to serial.available. As soon as it doesn’t, the loop exits whether all the characters in the line are received or not.
Please don’t rely on delays to ‘fix’ it. Delay is blind to the real issue, a bug waiting to happen.

nid69ita:
I don't know if is the problem, but I see that your function serialRead() never return false.
Try adding a return false at the end of your function. it can not hurt. :grin:

It does if there is no serial input. However even so it is not correct. Read my link above.

boolean serialRead() is the function of @sbolla not the Serial.read()
The first time I read the first code, I see no "return rc;" at end of the function

Good afternoon,

first of all, I’d like to thank you all for your answers. This is a summary of what I have learned so far:

@nid69ita:
Indeed, the 1st version of my program was not really good on that point :wink:
(by the way, regarding your second post: I realized my error and changed the code :D)

@Graynomad:
1°) In the sample, the buffer could be reduced to only 5 bytes, but it was an example (in real cases, the commands might be up to 255 bytes)
2°) However, removing the Serial.print("- Incoming command: "); line resolved the problem
Going further in my investigation, I noticed that:

  • changing that line into delay(xxx) did result to exactly the same problem → Does the Serial.print, Serial.println and Serial.write include delay()? This could explain
  • I added some debug information (uncomment #define DEBUG) in order to check if there was a problem with the buffer (i.e.: resetting to quickly to buffer offset to 0) or by reading data and I found out that the code seemed to be OK: the loss of data was caused by the system (the counter bytesRead is consistent)

PeterH
In the final project, I plan to use to fastest speed possible (115200), but for now, I’m only working at 9600

aarondc
You’re entirely right! I used to design my functions in regular programming as having only 1 return statement at the end of my function, but in this case the normal behaviour is to exit as soon as a command is completely received. Instead of continuing to read bytes of the next command as you suggest (with the stillLooking flag), I put a “return true” in the switch statement.

Nick Gammon
I tried you’re code: indeed, it works fine “as-is”. In some case, you will be obliged to let some time to electronics to react to your I/O commands. When simulating that by adding a “delay(100)” statement in the process_data function, things go wrong.
I really have the feeling that the cause is really low-level and the result could be the same if you are using interrupts, I2C, … in your code.


Briefly said:

  • don’t use delays: the system take the necessary time to manage serial inputs
  • as soon as the command is received: process it
  • In debug mode, consider that all “serial.print[ln]” might result in side effects

*** Any delay() explicitly put in the code (whatever the reason) might lead to a weird behavior


This is the sample code reviewed with all your remarks:

// a=1^b=2^c=3^d=4^e=5^f=6^g=7^h=8^i=9^j=10^k=11^l=12^m=13^n=14^o=15^p=16^q=17^r=18^s=19^t=20^u=21^v=22^w=23^x=24^y=25^z=26^
// ~a=1^~b=2^~c=3^~d=4^~e=5^~f=6^~g=7^~h=8^~i=9^~j=10^~k=11^~l=12^~m=13^~n=14^~o=15^~p=16^~q=17^~r=18^~s=19^~t=20^~u=21^~v=22^~w=23^~x=24^~y=25^~z=26^
//#define DEBUG
#define COMMAND_SUFFIX '^'

const int SERIAL_BUFFER_SIZE = 512; // make it big enough to hold your longest command
char serialBuffer[SERIAL_BUFFER_SIZE + 1]; // +1 allows space for the null terminator
int serialBufferOffset = 0; // number of characters currently in the buffer
int bytesRead = 0;

void setup()
{
  Serial.begin(9600);
}

void loop()
{
  if(serialRead()) 
  {
    delay(10);
#ifdef DEBUG
    String rc = "> Incoming command: " + (String)serialBuffer;
    Serial.print(rc);
    Serial.print(" (Buffer offset: ");
    Serial.print(serialBufferOffset);
    Serial.print(", buffer size: ");
    Serial.print(strlen(serialBuffer));
    Serial.print(", bytes read so far: ");
    Serial.print(bytesRead);
    Serial.println(")");
#else
    Serial.println(serialBuffer);
#endif
  }
}

boolean serialRead()
{
  char c;
  
  while(Serial.available())
  {
    c = Serial.read();
    bytesRead++;
    switch(c)
    {
      case '\r':
      case '\n':
        // Discard special characters
        break;
      case COMMAND_SUFFIX:
      // End-of-command received
        serialBufferOffset = 0;
        return true;
        break;
      default:
        if(serialBufferOffset < SERIAL_BUFFER_SIZE)
          serialBuffer[serialBufferOffset++] = c;
        serialBuffer[serialBufferOffset] = '\0';
        break;
    }
  }
  
  return false;
}

I agree, don't use delays. You can simulate time delays without them.

Does the Serial.print, Serial.println and Serial.write include delay()?

It doesn't have a delay() as such, but when the output buffer is full it will block and wait until it can put the character into the buffer, so the end result is the same, no processing is happening and therefore possible missed input characters.


Rob

sbollaerts:
PeterH
In the final project, I plan to use to fastest speed possible (115200), but for now, I'm only working at 9600

I wasn't asking what the serial line speed was, I was asking how fast the input would be arriving. Is it being sent at the maximum speed of the serial interface, and if so why?

sbollaerts:

boolean serialRead()

{
  char c;
 
  while(Serial.available())
  {
    c = Serial.read();
    bytesRead++;
    switch(c)
    {
      case ‘\r’:
      case ‘\n’:
        // Discard special characters
        break;
      case COMMAND_SUFFIX:
      // End-of-command received
        serialBufferOffset = 0;
        return true;
        break;
      default:
        if(serialBufferOffset < SERIAL_BUFFER_SIZE)
          serialBuffer[serialBufferOffset++] = c;
        serialBuffer[serialBufferOffset] = ‘\0’;
        break;
    }
  }
 
  return false;
}

This still has the same problem I tried to point out last time. Arduino running at 16 MHz can easily outrun incoming serial bytes. Just because none are available doesn’t mean you have reached end of line.

If I might suggest?

Use 2 states in loop() so that:

if ( state == 0 ) then if serial.available you read the character and either add it to the buffer or at line end you change the state to 1. This would be a good place to filter out bad characters as well, but that’s as far as I’ll go with that.

if ( state == 1 ) then you do your work on the line of data and when that’s done, change state to 0.

Later on you might try a better trick. Using states you can keep track of what each incoming byte -means- and interpret your data on the fly. I have a sketch that reads a 1 letter plus digit format as command code + data that way, the ‘buffer’ is 1 character and an int to accumulate digits into a value.
With the buffer and interpret method you start making sense of the data after gathering it. With the other way, interpretation is done when end of line (or timeout if you want to be real good) is found, it is done when the buffer method is getting started.

Hello PeterH,

(sorry I'm not native English speaking)
The target project is to build an sms gateway. It will be interfaced wih an appointment management system: in other words, the gateway will receive dozens of sms in a batch at midnight.

In a more realistic way, I would like to support receiving up-to 48 SMS commands in a batch. Since I limit SMS to 140 characters, each command should have a max size of 165 bytes (SMS!phone=99999999999~msg=[140 characters]^ -> 165 bytes), thus 7920 bytes received with a speed between 9600 and 115200 bauds.

Of course I will implement a request-response protocol so that 1 sms is processed at a time. Therefore I think this topic has successfully answered my question; unless my board continues to reboot (because of memory leaks I guess). I assume that the time needed to send "AT" commands to the GSM shield will be less than the time needed to get the new command (therefore empty the UART buffer)

Anyway, You make a great job for "newbies".

Thanks a lot,
Serge