Program freezes after a while

I am using a standalone atmega8 on a simple custom made pcb, and programming it with the arduino software.

The device I am making is a controller for relays that can also read sensors such as thermistors and LDRs. It is connected to another Arduino (the master controller) via half-duplex RS-485 (using the MAX485 transceiver IC) and listens for commands from it (to turn an output on/off, to read a sensor, etc.). It also has buttons to manually toggle an output.

Everything works perfectly (it accepts commands, executes the requested action and replies back), until the microcontroller freezes. It can't possibly be an SRAM problem, since the code uses no strings, large arrays or other RAM-hungry stuff. This is driving me nuts. I mean, after 100-200 commands it freezes completely. There are also times when it stops responding to serial input but keeps responding to the buttons. Most of the time, though, it freezes completely (I can tell because there's a heartbeat LED which normally blinks every 2 seconds, and even that stops blinking).

I'd really appreciate any help. The program is attached below.

control_node.ino (10.5 KB)

Do you have any way to get debug info out of the system? If not, I'd be inclined to use softwareserial to talk to the other arduino and leave the hardware serial port free for debugging. I'd be interested in the value of switch_count. Depending on the setup in EEPROM, it looks like there is potential for it to be greater than max_switches.
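
A minimal sketch of the kind of guard that would rule this out, assuming the switch table is filled while scanning the EEPROM setup. The names `SwitchTable`, `addSwitch` and `MAX_SWITCHES` are hypothetical and are not taken from the attached sketch; `count` plays the role of switch_count.

```cpp
#include <stdint.h>

// Assumed to mirror max_switches in the original sketch.
const uint8_t MAX_SWITCHES = 4;

// Hypothetical container for the relay/button pins found in EEPROM.
struct SwitchTable {
  uint8_t pins[MAX_SWITCHES];
  uint8_t count;               // plays the role of switch_count
};

// Adds a pin to the table. Returns false, without writing anything,
// when the table is already full, so the array can never be overrun.
bool addSwitch(SwitchTable &table, uint8_t pin)
{
  if (table.count >= MAX_SWITCHES) return false;
  table.pins[table.count] = pin;
  table.count++;
  return true;
}
```

If `addSwitch` ever returns false while the setup is being read, the EEPROM contents describe more switches than the array can hold, which is the situation described above.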

I looked quickly at your code and had a hard time understanding it, so I cannot give the answer to your question.

your original code has

 if (EEPROM.read((command_in[2] + 30)) == 'i') { // convert pin number to the EEPROM address its mode is saved at & ensure that the pin is an output

do you see the mismatch between code and comment?

You need to spend some time refactoring the code so the names of the variables and functions become more meaningful.
That way you do not need so many comments to explain what everything does.

Also, code sometimes looks better when using a switch statement instead of multiple ifs.

To show what I mean, I have rewritten one of the functions:

byte write_output() 
{
  int pin = command_in[2];
  if (pinNotValid(pin)) return 'e';

  int address = pin + 30;
  if (EEPROM.read(address) != 'o')  return 'e';   

  switch(command_in[3])
  {
  case 'h' : digitalWrite(pin, HIGH);
    break;
  case 'l' : digitalWrite(pin, LOW);
    break;
  case 't': digitalWrite(pin, !digitalRead(pin));
    break;
  default:
    return 'e';
  }
  return 's';
}

// can be reused 
boolean pinNotValid(int pin)
{
  return (pin < 5 || pin > 13);
}

The next step in readability is to use symbolic error names (so you can differentiate between types of errors)

// somewhere in the top of your sketch
#define SUCCESS            0
#define INVALID_PIN_ERROR  (-1)
#define INVALID_COMMAND    (-2)
#define INVALID_PINMODE    (-3)
//etc

...

int write_output() 
{
  int pin = command_in[2];
  if (pinNotValid(pin)) return INVALID_PIN_ERROR;

  int address = pin + 30;
  if (EEPROM.read(address) != 'o') return INVALID_PINMODE;

  switch(command_in[3])
  {
  case 'h': digitalWrite(pin, HIGH);
    break;
  case 'l': digitalWrite(pin, LOW);
    break;
  case 't': digitalWrite(pin, !digitalRead(pin));
    break;
  default:
    return INVALID_COMMAND;
  }
  return SUCCESS;
}

This way the code needs almost no comment as it becomes quite readable in itself

Try to rewrite your code in this way and see

OK, one extra

int read_digital()
{
  int pin = command_in[2];
  if (pinNotValid(pin)) return INVALID_PIN_ERROR;

  int address = pin + 30;
  if (EEPROM.read(address) != 'i')  return INVALID_PINMODE;   

  if (digitalRead(pin) == HIGH) return 'h';
  return 'l';
}

Hope this helps

wildbill:
Do you have any way to get debug info out of the system? If not, I'd be inclined to use softwareserial to talk to the other arduino and leave the hardware serial port free for debugging. I'd be interested in the value of switch_count. Depending on the setup in EEPROM, it looks like there is potential for it to be greater than max_switches.

I am testing the code with a configuration of 3 relays (and 3 buttons to manually control them), 1 temp sensor and 1 ldr. I don't think the problem is caused because of an overflow of that array, since there are 3 of them, and also the problem would probably appear immediately as the array was populated at the beginning of the sketch. There are 9 free pins on the arduino so there can be up to 4 relay/button pairs, so that's why I have put 4 as the maximum number of switches.

Unfortunately I can't use the hardware port for debugging because I can't change the hardware design that easily (it's a homemade etched pcb).

robtillaart:
I looked quickly at your code and had a hard time understanding it, so I cannot give the answer to your question.

your original code has

 if (EEPROM.read((command_in[2] + 30)) == 'i') { // convert pin number to the EEPROM address its mode is saved at & ensure that the pin is an output

do you see the mismatch between code and comment?

...
Hope this helps

I copied and pasted some parts and probably forgot to change the comments so that's why. I am indeed getting a bit confused while reading through this code and I have already decided to re-write it to make it more readable. Thank you very much for your tips.

Update: robtillaart, I'm applying the changes you suggested and the code already seems much better :)

I am still hoping that someone can spot the cause of the freezing.

When the code is rewritten and more readable (and you have posted it), I will have a second look.

I rewrote it, and now it never freezes! At least not after 15,000 commands, which is as far as my tests got XD

But not everything's right, as the master arduino I used to send the commands froze every 1000 or so commands it sent. I am currently rewriting its code too, and if there still is a problem, I'll post it so you can have a look.

OK tomorrow, as it is late enough for me...

Find a way to get some debugging output though - can you use other pins to softserial or SPI to another arduino to get something on the serial monitor or does your PCB design preclude this?

Example of debugging using SPI or I2C:
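
A minimal sketch of the I2C variant, assuming the two I2C pins are free on the PCB and a second Arduino listens on the bus and prints whatever it receives to its own serial monitor. The address `DEBUG_I2C_ADDRESS`, the record layout and the function names `packDebugRecord` and `sendDebug` are my own assumptions; only the `Wire` calls are from the standard Arduino library.

```cpp
#include <stdint.h>
#if defined(ARDUINO)
#include <Wire.h>   // standard Arduino I2C library
#endif

// Assumed 7-bit I2C address of the second Arduino acting as debug listener.
const uint8_t DEBUG_I2C_ADDRESS = 42;

// Packs one debug record: a tag byte, then a 16-bit value, high byte first.
// Returns the number of bytes written to buf (always 3).
uint8_t packDebugRecord(uint8_t tag, uint16_t value, uint8_t *buf)
{
  buf[0] = tag;
  buf[1] = (uint8_t)(value >> 8);
  buf[2] = (uint8_t)(value & 0xFF);
  return 3;
}

#if defined(ARDUINO)
// Sends one record to the listener. Call Wire.begin() once in setup() first.
void sendDebug(uint8_t tag, uint16_t value)
{
  uint8_t buf[3];
  uint8_t length = packDebugRecord(tag, value, buf);
  Wire.beginTransmission(DEBUG_I2C_ADDRESS);
  Wire.write(buf, length);
  Wire.endTransmission();
}
#endif
```

A call such as `sendDebug('s', switch_count);` at the top of loop() would then show the last value seen before a freeze, without touching the hardware serial port used for RS-485.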

Ok, there's something really odd about this problem. It seems it's not related to software at all.

When I'm powering the board from my PC's USB port (through a USBasp programmer), it works normally. However, if I use a 5V 2A AC adapter or a 5V USB charger I've got, the microcontroller eventually freezes after some commands. So it's something related to the power supply, but I've never had this sort of problem before.

I also updated the attached sketch on the first post to the latest one.

giorgisp:
if I use a 5V 2A AC adapter or a 5V usb charger I've got, the microcontroller eventually freezes after some commands. So it's something related to power supply

Maybe if your power supply is not producing very clean, well regulated DC then the Arduino could be suffering from spikes or out-of-spec voltages which are provoking logic faults. Presumably you're bypassing the onboard regulator, so whatever your wallwart produces goes direct to the processor, noise and all. The current ratings you mention make me suspect these are intended as battery chargers rather than to directly power digital devices, so they may not be well regulated and smoothed.

from your code

  for (byte i = 0; i<=message_length; i++) // send the buffer contents (until we reach message_length)
  {
    Serial.write(reply_out[i]);
    reply_out[i] = 0;
  }

Possible deadlock (causes a freeze) found: message_length is a byte and so is i, so if message_length == 255 the condition i <= message_length is always true and this loop will never stop.

fix:

  for (byte i = 0; i < message_length; i++)   // use < instead of <=
  {
    Serial.write(reply_out[i]);
    reply_out[i] = 0;
  }
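
To illustrate the wrap-around, here is a small sketch in plain C++ (not part of the original program) that counts how often the loop body runs with a one-byte counter. The function name `countIterations` and the `cap` parameter are assumptions of mine; the cap exists only so the demonstration always returns.

```cpp
#include <stdint.h>

// Hypothetical helper: counts how many times the loop body runs for a given
// message_length, using a one-byte counter like the Arduino `byte` type.
// `inclusive` selects the `<=` form (true) or the `<` form (false).
// `cap` is a safety limit so the demonstration cannot hang.
unsigned long countIterations(uint8_t message_length, bool inclusive, unsigned long cap)
{
  unsigned long iterations = 0;
  for (uint8_t i = 0; inclusive ? (i <= message_length) : (i < message_length); i++)
  {
    iterations++;
    if (iterations >= cap) break;   // only reached when the loop would never end
  }
  return iterations;
}
```

With message_length == 255 the `<=` form only stops because of the cap: i goes from 255 back to 0 and the condition stays true. Note also that for a 5-byte message the `<=` form runs 6 times, so if message_length is a byte count, one extra byte is sent.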

robtillaart:
Possible deadlock (causes a freeze) found: message_length is a byte and if message_length == 255 this loop will never stop.

Thanks for spotting that. However it can't be the cause of the freezes because I am never sending any message longer than 5 bytes.

PeterH:

giorgisp:
if I use a 5V 2A AC adapter or a 5V usb charger I've got, the microcontroller eventually freezes after some commands. So it's something related to power supply

Maybe if your power supply is not producing very clean, well regulated DC then the Arduino could be suffering from spikes or out-of-spec voltages which are provoking logic faults. Presumably you're bypassing the onboard regulator, so whatever your wallwart produces goes direct to the processor, noise and all. The current ratings you mention make me suspect these are intended as battery chargers rather than to directly power digital devices, so they may not be well regulated and smoothed.

I also tried powering it from my laptop's USB port through the programmer (while the laptop was running on battery) and it still froze after some commands. It doesn't freeze if I don't send any commands. I can leave it powered from the AC adapter overnight and it is still running the next morning, until I send some commands and it freezes again.

My hardware design indeed feeds the input power directly to the microcontroller. What could I do to filter the power before it reaches the circuit?

I really don't know what to do. It seems very unlikely that this is a software error. I forgot to mention that I'm using the atmega's internal oscillator to save space. However, the master board which sends the commands uses an external crystal oscillator and it freezes too sometimes after sending a lot of commands.

Update: I found some power regulator boards I had purchased from ebay (these ones: http://www.ebay.com/itm/271143310513) with a 12v input and 5/3.3v 3A output. It seems they use good regulators and have capacitors and coils and other filtering components, so I am going to wire up one of them, power the controller board with it and see what happens.

I thought you said you had solved the problem in the receiver. Is it the sender or receiver which is freezing now? Have you posted the current code for that? What hardware connections do you have to it?

PeterH:
I thought you said you had solved the problem in the receiver. Is it the sender or receiver which is freezing now? Have you posted the current code for that? What hardware connections do you have to it?

Both are freezing when I am using some power supplies (although the receiver freezes more often). The code is probably irrelevant to the problem, but you can find the current code for the receiver at the first post.

The good news is that yesterday I used one of these regulator boards to power both the sender and the receiver board, and left RealTerm constantly sending a command to toggle an output. Right now it has sent the command 250,000 times and both the sender and the receiver board are still running.

So the problem probably was caused by the power supply, but I don't know exactly why. Even the power from the USB port of a macbook caused problems.