Debug memory issue (serial won't help)

Hi,
After a long time away I had to change our Arduino program and came across a problem I experienced earlier but never solved: mysterious program behaviour. Simplifying the code pretty a lot, this is part of the sketch:

  char b[22]; char d[5]; int s; unsigned long t; int mt;
  while (C.available() > 20) {
    lastAlive=millis();
    // read event (dddd tttttttttt.ttt s\n) and process it
    for (int i=0; i<22; i++) { b[i]=C.read(); }
    b[21] = 0; sscanf(b, "%4s %*5d%5ld.%3d %d", d, &t, &mt, &s); t *=1000; t+=mt;
    if (strcmp(d, "0000") != 0 && s < 2)
      if (processLoop(det_id, det_st, sec)) break;      
  }

In this code ‘C’ is a client socket and what happens is that the while loop is entered. I know this because in another part of the code ‘lastAlive’ is checked and as a result the LCD is changed (or not). ‘processLoop’ however is never called. Now you might say… well that means that ‘d’ gets the value ‘0000’ but that certainly is not the case.

Now I am looking for a way to debug this. The standard response then would be to add debug statements to either give serial output or use the LCD for this. And I’d agree… However… when I add such a statement the problem disappears… As soon as I add something to the code I see output AND processLoop is called. Sometimes unfortunately other parts of the sketch then stop working or change behaviour.

This mystery rings a bell from the old days… if program code no longer does what it should do based on the code, chances are that the program itself is poking in program memory in stead of data memory. I had a sift through the code, but I cannot really find such an error. Under normal circumstances I would turn to something like efence, but I have not found such a thing for the Arduino platform.
I did a Google for this issue and found posts about Valgrind (no avr support), emulino (seems no longer alive), emulare (I couldn’t get it to work), VirtualBoard (seemed to support Java only) and winAVR (which does not like the Arduino code that much and certainly has no knowledge of the libraries).

So I turn to you. Has anyone found a solution for debugging Arduino software for memory leaks and boundary errors? It would really save my day.

F

sscanf(b, "%4s %*5d%5ld.%3d %d", d, &t, &mt, &s);

This statement does not look right to me. There are supposedly 4 values in your input string, and 5 format specifiers, but only 4 variables to write to.

The %*5d%5ld part, in particular, looks strange.

If the data is as well defined as your example shows, sscanf is capable of figuring out the lengths.

It may be that strtok() and atoi() and atol() would be better for parsing the data.

I prefer strtok() in this kind of code, because sscanf, as you are calling it, permits no loss of serial data, which is a bad assumption to be making.

FrankM:

  char b[22]; char d[5]; int s; unsigned long t; int mt;

b[21] = 0; sscanf(b, "%4s %*5d%5ld.%3d %d", d, &t, &mt, &s); t *=1000; t+=mt;

It could be that t and mt seem to be swapped. To be honest I don't know what" %*5d" means but it looks like &t and &mt are the wrong way around.

ofransen:
To be honest I don't know what" %*5d" means but it looks like &t and &mt are the wrong way around.

http://www.nongnu.org/avr-libc/user-manual/group__avr__stdio.html#ga5507d0e1bbfd387fbb2ffcfd8f5dca6f

"Conversions are introduced with the character %. Possible options can follow the %:
a * indicating that the conversion should be performed but the conversion result is to be discarded; no parameters will be processed from ap,"

But my assessment is that this use of scanf() is likely pretty fragile. One thing that's missing is to check the value returned from sscanf(). This should be the number of matched items -- I think 5 in the normal case here. The way the logic is written now could result in bogus values passed to processLoop(), if the string coming into sscanf() is not exactly right.

Personally, I would not simply read 22 characters into b[], but instead use a DFA to look for the '\n' and match the pattern as you read it. The parser here could get out of sync with the input data pretty easily and would not recover. If the sender wrote "\r\n" or threw in the odd OOB character on the stream, the receiver wouldn't have a hope.

PaulS:

sscanf(b, "%4s %*5d%5ld.%3d %d", d, &t, &mt, &s);

This statement does not look right to me. There are supposedly 4 values in your input string, and 5 format specifiers, but only 4 variables to write to.

The %*5d%5ld part, in particular, looks strange.

Thanks for your input. The sscanf part isn't my favourite either, but as far as I know it is correct (I'll get to gardner's response later). Please bear in mind that I only gave this part as an example for my real problem: it seems somthing is corrupting the program memory. If I changes other parts of the code (ie. add debug statements) my builtin webserver stops working - just to give another example.

Just for record a short explanation to the use of the placeholders in the sscanf function:

%4s = scan 4 characters into a string (char*) => d
%*5d = scan 5 decimals and ignore those (this is the meaning of the *)
%5ld = scan 5 decimals into a long => t
%3d = scan 3 decimals into an int => mt
%d => scan a decimal into an int => s

gardner:
One thing that's missing is to check the value returned from sscanf(). This should be the number of matched items -- I think 5 in the normal case here. The way the logic is written now could result in bogus values passed to processLoop(), if the string coming into sscanf() is not exactly right.

You're right, this is a bit dirty. The reason for this is that I wanted to hit two flies in one go (which is a Dutch proverb): process any activity on the line, even if it is wronly formatted. processLoop can handle bogus values. As far as I can see this would not cause a memory poking problem.

Personally, I would not simply read 22 characters into b[], but instead use a DFA to look for the '\n' and match the pattern as you read it. The parser here could get out of sync with the input data pretty easily and would not recover. If the sender wrote "\r\n" or threw in the odd OOB character on the stream, the receiver wouldn't have a hope.

I'd be happy to change this, but for now I already stop at 'DFA' :frowning: which does not ring a bell. And given that the available memory is limited I'd be reluctant to add a lot of code. Any pointes would be appreciated, because you're absolutely right on the 'out-of-sync' issue.

F

Formatting your code a bit:

char b[22]; 
char d[5]; 
int s; 
unsigned long t; 
int mt;

while (C.available() > 20) {
  lastAlive=millis();
  // read event (dddd tttttttttt.ttt s\n) and process it
  for (int i=0; i<22; i++) { 
    b[i]=C.read(); 
  }

  b[21] = 0; 
  sscanf(b, "%4s %*5d%5ld.%3d %d", d, &t, &mt, &s); 
  t *=1000; 
  t+=mt;
  if (strcmp(d, "0000") != 0 && s < 2)
    if (processLoop(det_id, det_st, sec)) 
      break;      
}

You check you have at least 21 characters but then read 22. Potential problem there.

You multiply t by 1000 but t is not initialized. (Edit: oops)

Personally I wouldn’t use sscanf. My eyes water a bit trying to read that.

And given that the available memory is limited I’d be reluctant to add a lot of code.

If you comment out the sscanf line you save 2088 bytes. So that alone is using quite a bit of code.

I dare say you could extract a few numbers from a line “manually” using less code than that.

FrankM:
I'd be happy to change this, but for now I already stop at 'DFA' :frowning: which does not ring a bell.

DFA = deterministic finite automaton. Lots here: Finite-state machine - Wikipedia

It's (amongst other things) a simple approach to handling inputs like reading bytes from a serial like or network stream, where you want to be resilient against odd data, and use a bare minimum of memory.

gardner:
"Conversions are introduced with the character %. Possible options can follow the %:
a * indicating that the conversion should be performed but the conversion result is to be discarded; no parameters will be processed from ap,"

Ah, I learn something every day!

OK, thank you all for your help. I still have a problem with the sketch poking in the wrong places, but this wasn't it... I was a bit blinded by the fact that in this setup it looked practically impossible that the unit receives corrupted data. However... that impossiblity turned out to be a possibility. Pretty simple even: just start the sketch on the wrong moment, while data is received.

I'll be looking for a slim DFA example to implement (@gardner: after I googled the abbreviation I knew what they were, but am stumbling a bit implementing this on an arduino (and I am reluctant to use the library for this)).

In any case: if someone has pointers for the original question.... I am still interested.

F

So, something like this should work:

/* Read event (dddd tttttttttt.ttt s\n) and process it
 *     dddd tttttttttt.ttt s\n
 *     0123456789012345678901
 */
input_chars(int c)
{
static char buf[24];
static int pos = 0;

char d[5];
int s;
unsigned long t;
int mt;


  /* if the incoming char is a \n, then we may have a complete record.
   * validate and parse it.
   */
  if (c == '\n') {
     if (pos == 21) {
         buf[pos] = '\0';

         if (sscanf(buf, "%4s %*5d%5ld.%3d %d", d, &t, &mt, &s) != 4)
             goto reset;
         t *=1000;
         t+=mt;

         if (strcmp(d, "0000") != 0 && s < 2) {
           processLoop(det_id, det_st, sec);
         }
     }

     goto reset;
  }

  if (pos > 20)
     goto reset;

  if (pos == 4 && c == ' ')
     goto keep;

  if (pos == 15 && c == '.')
     goto keep;

  if (pos == 19 && c == ' ')
     goto keep;

  else if (c>'9' || c < '0')
     goto reset;

keep:
  buf[pos++] = c;
  return;

reset:
  pos = 0;
  return;

}

From somewhere in your main code, feed it with your input thusly:

while (input.available())
  input_chars(input.getchar());

And it will do all the work of figuring out the record boundaries and validating the data.

Those who do not like how I used a goto to handle the state transition are free to provide a “better” example.