Yun Randomly hanging

Hi all.

I hope you can help me. I have developed a system to manage and monitor a drying oven. I have 4 humidity and temperature probes (http://www.dfrobot.com/index.php?route=product/product&keyword=humidity&product_id=912#.VroRauap5l0) and an LCD screen.

All works very well except for one big problem: the Yun will hang randomly from time to time.

I am not sure what is causing it to hang. At first I thought it was power, so I put it on a UPS to clean up the power. But it is still hanging.

Here is the live feed of the Yun

http://emoncms.org/JacoFourie/Temp

You will see there are spots where I get flat lines, as it will just hang and we have to reboot it.

The Arduino side does very little. I have written the control programs in Python.

Here is the Arduino code. I am not using the PID library yet; I will still add that function.
The Linux side and the Arduino side interface via the Bridge library. The Python code works out what TTEMP must be, and the Arduino side then just makes sure the temp stays above that value.

There is another Python script that updates the EmonCMS feed every 20 seconds.

#include <Wire.h>
#include <SHT1x.h>
#include <LCD.h>
#include <LiquidCrystal_I2C.h>
#include <Servo.h>
#include <Bridge.h>
#include <dht11.h>
#include <PID_v1.h>


// Specify data and clock connections and instantiate SHT1x objects
SHT1x sht1x_1(5, 10);
SHT1x sht1x_2(6, 10);
SHT1x sht1x_3(7, 10);
SHT1x sht1x_4(8, 10);

dht11 DHT;
#define DHT11_PIN 11

LiquidCrystal_I2C lcd(0x27, 2, 1, 0, 4, 5, 6, 7, 3, POSITIVE);

Servo servo1;


float t1,t2,t3,t4,tavg,lt; 
float h1,h2,h3,h4,havg,lh;
float start_temp =0,end_temp = 0,current_temp = 0,target_temp = 0 ,target_hum = 0;
int servo_cp = 0, servo_np = 0 , screen_time = 5000;
unsigned long start_time = 0 , current_time = 0; // millis() returns unsigned long; a 16-bit int overflows after about 32 s
int delay_time = 500;

char ttemp[8],thum[8];
char tservo[4];
char r1[2],r2[2],r3[2],r4[2];
boolean screen1 = true , screen2 = false;

int chk;


//PID vars
//Define Variables we'll be connecting to
double Setpoint, Input, Output;

//Specify the links and initial tuning parameters
double Kp=60, Ki=0.025, Kd=20;
PID myPID(&Input, &Output, &Setpoint, Kp, Ki, Kd, DIRECT);

int WindowSize = 500;
unsigned long windowStartTime;



void setup()
{
    // Zero out the memory we're using for the Bridge.
    memset(ttemp, 0, 8);
    memset(thum, 0, 8);
    memset(tservo, 0, 4);
    memset(r1, 0, 2);
    memset(r2, 0, 2);
    memset(r3, 0, 2);
    memset(r4, 0, 2);
    
   start_time = millis();
   
   lcd.begin (20,4);  
   lcd.setBacklightPin(3,POSITIVE);
   lcd.setBacklight(HIGH); 
   lcd.home (); 

   //The 4 relays
   pinMode(A0, OUTPUT);
   pinMode(A1, OUTPUT);
   pinMode(A2, OUTPUT);
   pinMode(A3, OUTPUT);
   
   servo1.attach(13);  
   servo1.write(0);
   
   Bridge.begin();
   //server.begin();

   //Init the t values
   Bridge.put("TTEMP", String(0)); 
   Bridge.put("THUM", String(0));
   Bridge.put("R1", String(0));
   Bridge.put("R2", String(0));
   Bridge.put("R3", String(0));
   Bridge.put("R4", String(0));
   
}

void loop()
{ 
   
 
  // Read values from the sensors
  t1 = sht1x_1.readTemperatureC();    
  h1 = sht1x_1.readHumidity();    
  t2 = sht1x_2.readTemperatureC();  
  h2 = sht1x_2.readHumidity();
  t3 = sht1x_3.readTemperatureC();    
  h3 = sht1x_3.readHumidity();    
  t4 = sht1x_4.readTemperatureC();    
  h4 = sht1x_4.readHumidity();

  //Read the box temp and humidity  
  chk = DHT.read(DHT11_PIN);
  lt = DHT.temperature;
  lh = DHT.humidity;

  tavg = (t1 + t2 + t3 + t4) / 4;
  havg = (h1 + h2 + h3 + h4) / 4;

  if (t1 < 0){
    t1 = 0;
  }
  if (t2 < 0){
    t2 = 0;
  }
  if (t3 < 0){
    t3 = 0;
  }
  if (t4 < 0){
    t4 = 0;
  }

  if (h1 < 0){
    h1 = 0;
  }

  if (h2 < 0){
    h2 = 0;
  }

  if (h3 < 0){
    h3 = 0;
  }
  
  if (h4 < 0){
    h4 = 0;
  }

  current_time = millis();
  
  if ( (current_time - start_time) > screen_time) {
    //Toggle the screens
    if(screen1){
      screen1 = false;
      screen2 = true;
    }else{
      screen1 = true;
      screen2 = false;
    }
    start_time = millis();      
    lcd.clear();    
  }
  
  
  //Show the values on the screen
  //Screen 1
  if(screen1) {     
     show_screen(0,0, "T1: " ,t1);
     show_screen(10,0, "H1: " ,h1);
     show_screen(0,1, "T2: " ,t2);    
     show_screen(10,1, "H2: " ,h2);
     show_screen(0,2, "T3: " ,t3);
     show_screen(10,2, "H3: " ,h3); 
     show_screen(0,3, "T4: " ,t4);
     show_screen(10,3, "H4: " ,h4);     
        }
  //Screen 2
  if(screen2){  
     show_screen(0,0,"TT: " ,target_temp);  
     show_screen(10,0,"TH: " ,target_hum);
     show_screen(0,1,"TA: " ,tavg);    
     show_screen(10,1,"HA: " ,havg);
     show_screen(0,2,"LT: " ,lt);
     show_screen(10,2,"LH: " ,lh);        
       }
  
  
     //Send all the values to the Linux side
    Bridge.put("T1", String(t1)); 
    Bridge.put("T2", String(t2)); 
    Bridge.put("T3", String(t3)); 
    Bridge.put("T4", String(t4)); 
    Bridge.put("H1", String(h1)); 
    Bridge.put("H2", String(h2)); 
    Bridge.put("H3", String(h3));
    Bridge.put("H4", String(h4));    
    Bridge.get("TTEMP", ttemp , 8);
    Bridge.get("THUM", thum , 8);
    Bridge.get("TSERVO", tservo , sizeof(tservo)); // tservo holds 4 bytes; a length of 5 could write past its end
    
    Bridge.get("R1", r1 , 2);
    Bridge.get("R2", r2 , 2);
    Bridge.get("R3", r3 , 2);
    Bridge.get("R4", r4 , 2);    

    //Internal Values
    Bridge.put("LT", String(lt));
    Bridge.put("LH", String(lh));
    
    target_temp = atof(ttemp);    
    target_hum = atof(thum);
    servo_np = atoi(tservo);
    int r1int = atoi(r1);
    int r2int = atoi(r2);
    int r3int = atoi(r3);
    int r4int = atoi(r4);

    //Switch the relays on and off  
    digitalWrite(A0, r1int);         
    digitalWrite(A3, r4int);

    //Hold the temp at the target temp
    if(target_temp != 0){
     if(t1 < target_temp && t1 > 0) {            
        digitalWrite(A1, HIGH);
        digitalWrite(A2, HIGH);        
        Bridge.put("R2", String(1));
        Bridge.put("R3", String(1));        
        }
        else{        
        digitalWrite(A1, LOW);
        digitalWrite(A2, LOW);
        Bridge.put("R2", String(0));
        Bridge.put("R3", String(0));
        }        
    }else {
       digitalWrite(A1, r2int);
       digitalWrite(A2, r3int);       
    }

    if (servo_np != servo_cp){
       move_servo(servo_cp , servo_np); 
    }
      
  //delay(200);
}



//show the text on the screen
void show_screen(int c, int l, String t , float v ){
    lcd.setCursor (c,l); 
    lcd.print(t);
    lcd.print(v , 2);  
}


void move_servo(int from , int to){
  
  int pos;
  if(from < to){
  for (pos = from; pos <= to; pos += 2) { // sweep upward from 'from' to 'to'
    // in steps of 2 degrees
    servo1.write(pos);              // tell servo to go to position in variable 'pos'
    delay(100);                       // wait 100 ms for the servo to reach the position
    }
  } else
  {
    for (pos = from; pos >= to; pos -= 2) { // sweep downward from 'from' to 'to'
    // in steps of 2 degrees
    servo1.write(pos);              // tell servo to go to position in variable 'pos'
    delay(100);                       // wait 100 ms for the servo to reach the position
    }
  }

   servo_cp = to;
}

How can I check what is causing the problem, and how can I stop it from hanging?
I am not sure if it is the Linux or Arduino side that is hanging. I did check that I am not using up memory on the Linux side. I have another Yun here with me running the same code but without the sensors, and it keeps on working all the time. I have read that the I2C bus can hang the Arduino. Is that still the case?

I will not be able to use the controller if it keeps on doing this.

Kind Regards.

jfourie:
I am not sure if it is the Linux or Arduino side that is hanging.

Are you getting updates to your database during a hang? If you are getting updates, but they are all the same, then I would suspect that the Python code is working, and constantly sending the last known values, but the sketch is hung and not updating the values.

Is the LCD screen being updated during a hang? If so, I would suspect that the sketch is running, and the problem is on the Linux side.

You are putting the last known values to the bridge. Using a web browser, load http://arduino.local/data/get, and you should see all of the last known data values. (Substitute your own Yun’s name or IP address for “arduino.local”.) If you see the values changing during a hang, you can be pretty sure the sketch is still running; if they no longer change during a hang, it’s probably the sketch.

You could try adding some Serial.print() statements around the sketch to see if it keeps running, or if it hangs, you will know where it’s hung by looking at the last output.

I have read that the I2C bus can hang the Arduino. Is that still the case?

It’s not specific to Arduino - I2C is indeed susceptible to hanging up. I2C is also written I²C (the I is squared) or IIC - the full name of the protocol is Inter-Integrated Circuit communications. It was originally designed as a way for integrated circuits to communicate with each other on the same board. In that environment, it works rather well. But now you see it being used for all sorts of other devices external to the main circuit board, and it’s there that I think it falls flat. When used with wires that are longer than a foot or two, I’ve found it to be very sensitive to noise, bus capacitance, and the value of the pullup resistors. As the wires get longer, the issues become greater. They make all sorts of buffers and active line drivers, but while some of them can help, I’ve found that they don’t fully solve the issues.

The issue is that a slave device can pick up noise and enter a bad state. Sometimes it just locks up, and you can recover by setting the SCL and SDA lines to GPIO and bit-banging a few dozen clock cycles, followed by a stop condition. Sometimes it’s necessary to repeat that a few times. But sometimes, when the slave device hangs up, it is holding the SCL line low (clock stretching) and if that happens you are sunk: the only solutions I’ve found are to hit the reset line for the slave device (if it has one) or cycle power to the slave device.

However, these recovery ideas only work if you can actually detect that there is a problem. I do that on the systems I develop by using my own low-level device code, which includes the ability to time out an operation and return an error if the bus locks up. Unfortunately, the Arduino libraries do not provide a feature like that: if the I2C bus locks up, the Wire library locks up as well and there is no way you can detect that in software.
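To make the timeout idea above concrete, here is a minimal sketch of a bounded wait. It is not the code referred to in the post: `waitWithTimeout`, `WaitResult`, and the two callbacks are hypothetical names chosen for illustration. On a real board, `now` would typically wrap `millis()` and `ready` would poll whatever status flag the low-level bus code exposes.

```cpp
#include <stdint.h>

// Outcome of a bounded wait: the condition became true, or we gave up.
enum class WaitResult { Ready, TimedOut };

// Poll `ready` until it returns true or `timeoutMs` has elapsed.
// `now` must return a free-running millisecond counter (assumption:
// something like millis()). Unsigned subtraction keeps the elapsed-time
// comparison correct even when the counter wraps around.
template <typename ReadyFn, typename NowFn>
WaitResult waitWithTimeout(ReadyFn ready, NowFn now, uint32_t timeoutMs) {
    const uint32_t start = now();
    while (!ready()) {
        const uint32_t elapsed = static_cast<uint32_t>(now() - start);
        if (elapsed >= timeoutMs) {
            return WaitResult::TimedOut;  // caller can now attempt recovery
        }
    }
    return WaitResult::Ready;
}
```

The point of returning an error instead of spinning forever is that the caller gets a chance to run a recovery step or at least switch outputs to a safe state.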

Personally, I try to stay away from I2C whenever I can. I will begrudgingly use it between chips on a board, but if it’s a remote sensor on the end of the wire I stay away from it unless there is absolutely no alternative. I just don’t understand why I2C is so popular - OK, it’s nice that it doesn’t use a lot of wires, but in my mind it’s just not reliable enough for real-world applications. It’s one thing to use it in hobbyist systems where a lockup has no serious consequences and can be fixed with cycling power. But in real commercial/industrial/medical/military applications (my day job) it just isn’t stable enough without going to a lot of effort for error detection and recovery, something that the Arduino Wire library simply doesn’t do. EVERY project I’ve had that uses I2C has had lockup problems of one sort or another, and EVERY one of them has required significant amounts of code to detect such lockups and try to recover from them.

OK, I’m off of my soapbox now… I didn’t mean to rant on that long, but I feel better now that it’s off my chest. :wink:

Hi ShapeShifter. Thanks for the reply.

Some feedback. If it hangs, the screen is dead, showing no text, and I cannot reach the Linux side in any way.

What we did today was to remove the I2C wires from the LCD screen. I did not update the sketch, as the unit is in the field and is 200 km from me. What happened then is that I almost burned down the oven. It seems the Arduino side got stuck while the Linux side was still up. If you check the graphs now you will see.

What I noticed was that the values on the graph stayed the same at 72.99, but the Yun was not down as I was still getting values. Then I SSH'd into the Linux side and ran reset-mcu. Bam, the temp went up to 120 degrees. Then I shut everything down, as I knew the Arduino was stuck with the relay high, making the temp go up.

I built the controller for monitoring mode in this first prototype, but it will be controlling as well. So we tested it and it worked fine for about 2 weeks, and now today this happened. So I know I will have to add some safety features so this does not happen again. But I would love to sort out the hang.

Is hanging a common Arduino problem?

OK, it is not the I2C bus on the LCD screen. I disconnected it and flashed a new sketch via Linux, and it has just hung. So the next thing I see somebody on here said is that the wifi is not stable. It seems this board is very unstable. Is that so?

If I type the command "reboot", it seems the Yun also hangs and does not come back.

From what you describe, it seems like the sketch is hanging. But if that were the case, you should still be able to access the Linux side. I've had situations where the sketch crashes, or the Linux side becomes unresponsive, but I don't typically see them both go South at the same time.

jfourie: Is hanging a common Arduino problem?

No. It's a common general computer problem. Most of the time, it comes down to a programming error that causes the hang. Sometimes it's hardware, like an I2C slave locking up due to noise on the line. None of this is unique to Arduino.

I've used Yuns (and some smaller Arduinos) in several projects, and most of them have been very reliable. I've had a few that have hung up or crashed for one reason or another, but it's inevitably my fault due to something I was doing in the code - and after fixing that, they have been running for weeks or months with no issues.

If you are using any system to run a process that has consequences if it goes out of control (like controlling a heater) you MUST put some safety features in the circuit to prevent problems if the code should hang. For example, for the heater, you should have an independent thermostat acting as a high limit switch that will cut power to the heater if the temperature gets too high. It's important that such safety features be independent of the processor, and not a sensor that the software reads to control the output. Again, the need for something like this is not unique to Arduino.

To get better reliability from any microprocessor based system, I will usually turn on the COP watchdog (Computer Operating Properly, but the term varies depending on the particular processor.) This is a piece of hardware in the processor that acts as a timer: the software periodically resets the watchdog timer (for example, at the top of the loop() function) and if too much time goes by between resets (often on the order of seconds) the watchdog hardware will assume the processor is hung and will reset the processor, starting the software up again. I've not used it on the Arduino processors, but it does have that ability.

In the case of the Yun, the '32U4 processor's watchdog will only reset the sketch; it won't reset the Linux side. I don't know if there are any hardware or software watchdog features in the Linux processor.

There are also hardware watchdog chips that can be added to a circuit. These have a pin that needs to be pulsed periodically, and if it is not, it can reset the processor. Some of these watchdog chips can also control an output: in the case of your heater circuit, you could run the relay control through such a chip, and the relay output would only be active as long as the control pin is pulsed periodically - if the pin isn't pulsed for a few seconds, the output is turned off (and the chip could also reset the processor.)
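The gating behaviour of such a chip can be modelled in a few lines. This is only a software model of the idea, useful for reasoning about timeouts; it is not a substitute for the external chip, and `PulseGatedOutput`, `kick`, and `outputAllowed` are hypothetical names. Times are assumed to come from a free-running millisecond counter.

```cpp
#include <stdint.h>

// Models an output that stays enabled only while "kick" pulses keep arriving.
class PulseGatedOutput {
public:
    explicit PulseGatedOutput(uint32_t timeoutMs)
        : timeoutMs_(timeoutMs), lastKickMs_(0), kicked_(false) {}

    // Called whenever the supervised program pulses the control pin.
    void kick(uint32_t nowMs) {
        lastKickMs_ = nowMs;
        kicked_ = true;
    }

    // The relay may be driven only if it is requested AND a pulse was
    // seen within the timeout. Before the first pulse the output is off.
    bool outputAllowed(bool requested, uint32_t nowMs) const {
        if (!kicked_) {
            return false;
        }
        const uint32_t age = nowMs - lastKickMs_;  // wrap-safe unsigned math
        return requested && age < timeoutMs_;
    }

private:
    uint32_t timeoutMs_;
    uint32_t lastKickMs_;
    bool kicked_;
};
```

With a 3000 ms timeout, for example, a heater request is honoured 1 s after the last pulse but refused once 3 s have passed without one, which is the fail-off behaviour described above.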

Designing a robust and fault tolerant system is an art, and can be a necessity in a commercial/industrial control system. You won't find many things in the Arduino libraries to support such a fault tolerant architecture as that is not the main focus of the Arduino boards: they are aimed more at hobbyists. You either need to build in such fault tolerance yourself, or move to an industrial control platform and skip the hobbyist boards (like Arduino, Raspberry Pi, BeagleBone, etc.)

jfourie: If I type the command "reboot" it seems the yun also hangs and does not come back.

OK, that could help explain why both sides hang up. Take a look at this topic: How to improve reboot/reset stability

One of the limitations of the Yun is that the serial port used for communications between the Linux and sketch processors is also the boot console for the Linux processor. When Linux boots, there are a couple places where the boot process can be stopped by pressing any key (sending anything to the serial port.) If the sketch is running, and it is making frequent Bridge calls, it will be transmitting often on that port, and that increases the chances of it causing the boot process to be interrupted. When that happens, the Linux side will be unreachable.

That article has several suggestions on how to make the boot process more stable. I've found them to be very helpful. This could help any reset/startup issues, but won't necessarily help crashes while the system is running, and definitely does not eliminate the need for backup safety systems.

Hi ShapeShifter. Thank you for your input. I have now added a watchdog reset to my sketch code. I have also added a delay in setup() so that when the Linux side reboots, the sketch will not hang the boot process. I think that this is what was going on. The wifi is not 100% where the controller is; I get 40% to 60%. So maybe the Linux side reboots from time to time while the Arduino side is updating the Bridge values. This then causes a lockup on the Linux side, and because the Linux side is locked up, the Arduino side locks up as well, since it cannot update the Bridge values. I have now made some changes and will monitor and give feedback. I see there is a watchdog running on the Linux side. Does that monitor the wifi, or what is its job?

OK so I found this amazing script on the forum.

https://github.com/NicoLugil/wifiMonitor

It now checks my wifi and restarts it should I have a disconnect. I will leave the system to run now; so far all has been running fine.

Great, you could share your results.

Sorry to say, but the Yun is still hanging. All went well till about 1am and then it went offline.

This is so frustrating, as there is nothing funny in the code. It just reads the temp & humidity and updates the Bridge values. And then the Python script saves the values to a local db and calls a URL to update EmonCMS.

Can the sensors cause the controller to hang?

I think the ATmega is crashing because it runs out of SRAM: too many libraries and variables. Could you release some unused variables at the end of the sketch? Maybe you are not clearing out the garbage and thus overloading the memory.

Hi. Thanks for the reply. The controller worked fine for about 2 weeks, and then it started to hang. Also, if it is a memory problem, should the watchdog not reset and restart the controller? It goes totally off and stays off. Only a power cycle will bring it back. Did you have a look at the code? There are not a lot of unused variables in the code.

We removed one of the probes that showed a zero value. It has been up for 8 hours now. Let's see if it stays up. Why would the watchdog on the Arduino and Linux side not reboot / reset the system if it hangs?

jfourie: Why would the Watchdog on the Arduino and Linux side not reboot / reset the system if it hangs?

How are you setting up the Watchdog on the '32U4 processor? What watchdog is there on the Linux side?

Hi ShapeShifter.

Here is the code on the Arduino side. I have now commented out all code not in use at this stage. I only use 30% of the SRAM now.

I enable the watchdog like this.

wdt_enable(WDTO_8S);

I then reset it at a couple of spots in the sketch

#include <Wire.h>
#include <SHT1x.h>
//#include <LCD.h>
//#include <LiquidCrystal_I2C.h>
#include <Servo.h>
#include <Bridge.h>
#include <dht11.h>
// #include <PID_v1.h>
#include <avr/wdt.h>


// Specify data and clock connections and instantiate SHT1x objects
SHT1x sht1x_1(5, 10);
SHT1x sht1x_2(6, 10);
//SHT1x sht1x_3(7, 10);
SHT1x sht1x_4(8, 10);

dht11 DHT;
#define DHT11_PIN 11

//LiquidCrystal_I2C lcd(0x27, 2, 1, 0, 4, 5, 6, 7, 3, POSITIVE);

Servo servo1;


float t1,t2,t3,t4,tavg,lt,j1; 
float h1,h2,h3,h4,havg,lh;
float target_temp = 0 ,target_hum = 0;
int servo_cp = 0, servo_np = 0; // , screen_time = 5000;
unsigned long start_time = 0 , current_time = 0; // millis() returns unsigned long; a 16-bit int overflows after about 32 s
int delay_time = 500;

char ttemp[8],thum[8];
char tservo[4];
char r1[2],r2[2],r3[2],r4[2];
//boolean screen1 = true , screen2 = false;

int chk;


//PID vars
//Define Variables we'll be connecting to
//double Setpoint, Input, Output;

//Specify the links and initial tuning parameters
//double Kp=60, Ki=0.025, Kd=20;
//PID myPID(&Input, &Output, &Setpoint, Kp, Ki, Kd, DIRECT);

//int WindowSize = 500;
//unsigned long windowStartTime;



void setup()
{

    //Delay for letting the linux side get a clean boot.
    delay(5000);
    
  
    // Zero out the memory we're using for the Bridge.
    memset(ttemp, 0, 8);
    memset(thum, 0, 8);
    memset(tservo, 0, 4);
    memset(r1, 0, 2);
    memset(r2, 0, 2);
    memset(r3, 0, 2);
    memset(r4, 0, 2);
    
   //start_time = millis();
   
   //lcd.begin (20,4);  
   //lcd.setBacklightPin(3,POSITIVE);
   //lcd.setBacklight(HIGH); 
   //lcd.home (); 

   //The 4 relays
   pinMode(A0, OUTPUT);
   pinMode(A1, OUTPUT);
   pinMode(A2, OUTPUT);
   pinMode(A3, OUTPUT);
   
   servo1.attach(13);  
   servo1.write(0);
   
   Bridge.begin();
   //server.begin();

   //Init the t values
   Bridge.put("TTEMP", String(0)); 
   Bridge.put("THUM", String(0));
   Bridge.put("R1", String(0));
   Bridge.put("R2", String(0));
   Bridge.put("R3", String(0));
   Bridge.put("R4", String(0));

   //Enable the watchdog
   wdt_enable(WDTO_8S);
   
}

void loop()
{ 

  start_time = millis();  
 
  // Read values from the sensors
      
  h1 = sht1x_1.readHumidity();    
  t1 = sht1x_1.readTemperatureC();
  wdt_reset(); 
  h2 = sht1x_2.readHumidity();
  t2 = sht1x_2.readTemperatureC();
  wdt_reset();
  //t3 = sht1x_3.readTemperatureC();    
  //h3 = sht1x_3.readHumidity();    
  //wdt_reset();    
  h4 = sht1x_4.readHumidity();
  t4 = sht1x_4.readTemperatureC();
  wdt_reset();

  //Read the box temp and humidity  
  chk = DHT.read(DHT11_PIN);
  lt = DHT.temperature;
  lh = DHT.humidity;

  wdt_reset();

  //tavg = (t1 + t2 + t3 + t4) / 4;
  //havg = (h1 + h2 + h3 + h4) / 4;

   

  if (t1 < 0){
    t1 = 0;
  }
  if (t2 < 0){
    t2 = 0;
  }
  //if (t3 < 0){
  //  t3 = 0;
 // }
  if (t4 < 0){
    t4 = 0;
  }

  if (h1 < 0){
    h1 = 0;
  }

  if (h2 < 0){
    h2 = 0;
  }

  //if (h3 < 0){
  //  h3 = 0;
  //}
  
  if (h4 < 0){
    h4 = 0;
  }

  // current_time = millis();
  
  //if ( (current_time - start_time) > screen_time) {
    //Toggle the screens
    //if(screen1){
    //  screen1 = false;
    //  screen2 = true;
    //}else{
    //  screen1 = true;
    //  screen2 = false;
    //}
    //start_time = millis();      
    //lcd.clear();    
  //}
  
  
  //Show the values on the screen
  //Screen 1
  //if(screen1) {     
  //   show_screen(0,0, "T1: " ,t1);
  //   show_screen(10,0, "H1: " ,h1);
  //   show_screen(0,1, "T2: " ,t2);    
  //   show_screen(10,1, "H2: " ,h2);
  //   show_screen(0,2, "T3: " ,t3);
  //   show_screen(10,2, "H3: " ,h3); 
  //   show_screen(0,3, "T4: " ,t4);
  //   show_screen(10,3, "H4: " ,h4);     
  //      }
  //Screen 2
  //if(screen2){  
  //   show_screen(0,0,"TT: " ,target_temp);  
  //   show_screen(10,0,"TH: " ,target_hum);
  //   show_screen(0,1,"TA: " ,tavg);    
  //   show_screen(10,1,"HA: " ,havg);
  //   show_screen(0,2,"LT: " ,lt);
  //   show_screen(10,2,"LH: " ,lh);        
  //     }
  
  
     //Send all the values to the Linux side
    Bridge.put("T1", String(t1)); 
    Bridge.put("T2", String(t2)); 
    Bridge.put("T3", String(t3)); 
    Bridge.put("T4", String(t4)); 
    Bridge.put("H1", String(h1)); 
    Bridge.put("H2", String(h2)); 
    Bridge.put("H3", String(h3));
    Bridge.put("H4", String(h4)); 
       
    Bridge.get("TTEMP", ttemp , 8);
    Bridge.get("THUM", thum , 8);
    Bridge.get("TSERVO", tservo , sizeof(tservo)); // tservo holds 4 bytes; a length of 5 could write past its end
    
    Bridge.get("R1", r1 , 2);
    Bridge.get("R2", r2 , 2);
    Bridge.get("R3", r3 , 2);
    Bridge.get("R4", r4 , 2);    

    //Internal Values
    Bridge.put("LT", String(lt));
    Bridge.put("LH", String(lh));
    
    target_temp = atof(ttemp);    
    target_hum = atof(thum);
    servo_np = atoi(tservo);
    int r1int = atoi(r1);
    int r2int = atoi(r2);
    int r3int = atoi(r3);
    int r4int = atoi(r4);

    //Switch the relays on and off  
    digitalWrite(A0, r1int);         
    digitalWrite(A3, r4int);

    //Hold the temp at the target temp
    if(target_temp != 0){
     if(t1 < target_temp && t1 > 0) {            
        digitalWrite(A1, HIGH);
        digitalWrite(A2, HIGH);        
        Bridge.put("R2", String(1));
        Bridge.put("R3", String(1));        
        }
        else{        
        digitalWrite(A1, LOW);
        digitalWrite(A2, LOW);
        Bridge.put("R2", String(0));
        Bridge.put("R3", String(0));
        }        
    }else {
       digitalWrite(A1, r2int);
       digitalWrite(A2, r3int);       
    }

    //if (servo_np != servo_cp){
    //   move_servo(servo_cp , servo_np); 
    //}


   current_time = millis();
   j1 =  current_time - start_time;
   Bridge.put("J1", String(j1));
   
   wdt_reset();   
   delay(200);
}



//show the test on the screen
//void show_screen(int c, int l, String t , float v ){
//    lcd.setCursor (c,l); 
//    lcd.print(t);
//    lcd.print(v , 2);  
//}


//void move_servo(int from , int to){
  
//  int pos;
//  if(from < to){
//  for (pos = from; pos <= to; pos += 2) { // goes from 0 degrees to 180 degrees
    // in steps of 1 degree
//    servo1.write(pos);              // tell servo to go to position in variable 'pos'
//    delay(100);                       // waits 15ms for the servo to reach the position
//    }
//  } else
//  {
//    for (pos = from; pos >= to; pos -= 2) { // goes from 0 degrees to 180 degrees
    // in steps of 1 degree
//    servo1.write(pos);              // tell servo to go to position in variable 'pos'
//    delay(100);                       // waits 15ms for the servo to reach the position
//    }
//  }

   //servo_cp = to;
//}

As for the Linux side, there is a watchdog (/dev/watchdog) running. If you kill it, Linux reboots.

I wasn't aware of the Linux watchdog, I'll have to look into that. Thanks, that could be useful for my own projects. 8)

If you're concerned that the watchdog may not be resetting you as expected, the easy way to test it is to set it up, but disable the code that periodically clears it. It should go ahead and reset you after the timeout period. If not, you may have an issue with the way it's set up?

Hi. When I test the watchdogs on the development unit, everything works fine. If I kill the Linux one, the Linux OS reboots. If I do not reset the Arduino one, the Arduino side resets. So my question is this: is it logical to assume that, because the watchdog did not reset the production unit and only a power cycle got it out of the hung state, the cause is more likely a physical hardware short or something with the one sensor that got the unit stuck? The dev unit here runs the same code. The only difference is that it does not have the sensors on it, so it returns values less than 0, but the sketch then sets the returned values to 0. It has never hung. Also, since we removed the sensor giving 0 values, the production unit has not hung yet. So again my question is: can a sensor hang a controller?

jfourie: Is it logical to assume that, because the watchdog did not reset the production unit and only a power cycle got it out of the hung state, the cause is more likely a physical hardware short or something with the one sensor that got the unit stuck?

Yes. Especially with I2C sensors!

My thoughts on off-board I2C sensors were clearly stated earlier in this thread. They can easily get in a state that locks up the I2C bus and prevents any further communications. Simply resetting the processor will not cure the problem, as the slave device will still be locked up until it is reset using a hardware reset line (if it has one) or power cycling the slave device.

Depending on the slave, I've had some success bringing them back by bit-banging the I2C bus (this is a last resort when a hardware reset pin is not available.) Set the SCL and SDA pins to GPIO outputs. Toggle the SCL pin to send out a few dozen clock pulses (more pulses than there are bits in the longest slave response) and then perform an I2C stop condition: bring SDA low, then SCL high, then bring SDA high (SDA going from low to high while SCL is high is a stop signal.) Repeat this a couple of times. In your case, I would try doing it always in setup() before you initialize the Wire library. This way, if the bus locks up, and the sketch resets, it tries to clear the bus before doing anything else. But this won't work if the slave is stuck in a clock-stretching mode (holding SCL low) because nothing on the bus will be able to see you toggling the clock.
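The sequence above can be sketched roughly as follows. This is an illustration under assumptions, not tested driver code: `recoverI2cBus` and the `Bus` interface (`setScl`, `setSda`, `readScl`, `halfBitDelay`) are hypothetical names, and on a real board they would wrap the GPIO calls for the two pins. "High" here means releasing the open-drain line so the pull-up raises it. The sketch pulls SCL low before lowering SDA so that the SDA edge is not read as a start condition. `FakeBus` is an in-memory stand-in so the sequence can be checked off-target.

```cpp
#include <stdint.h>

// Clock out `clockPulses` pulses, then issue a stop condition; repeat.
// Returns false if SCL never rises, i.e. a slave is holding the clock
// low, in which case bit-banging cannot help.
template <typename Bus>
bool recoverI2cBus(Bus& bus, uint8_t clockPulses = 36, uint8_t repeats = 3) {
    for (uint8_t r = 0; r < repeats; ++r) {
        bus.setSda(true);                  // release SDA while clocking
        for (uint8_t i = 0; i < clockPulses; ++i) {
            bus.setScl(false);
            bus.halfBitDelay();
            bus.setScl(true);
            bus.halfBitDelay();
            if (!bus.readScl()) {
                return false;              // clock is stuck low
            }
        }
        // Stop condition: SDA low, SCL high, then SDA rises while SCL is high.
        bus.setScl(false);
        bus.setSda(false);
        bus.halfBitDelay();
        bus.setScl(true);
        bus.halfBitDelay();
        bus.setSda(true);
        bus.halfBitDelay();
    }
    return true;
}

// Hypothetical in-memory bus for off-target checks of the sequence.
struct FakeBus {
    bool scl = true;
    bool sda = true;
    bool sclStuckLow = false;   // simulates a slave stretching the clock forever
    int risingClockEdges = 0;
    int stopConditions = 0;

    void setScl(bool high) {
        const bool next = high && !sclStuckLow;
        if (next && !scl) ++risingClockEdges;
        scl = next;
    }
    void setSda(bool high) {
        if (high && !sda && scl) ++stopConditions;  // SDA rising while SCL high
        sda = high;
    }
    bool readScl() const { return scl; }
    void halfBitDelay() {}      // no timing needed in the fake
};
```

Returning `false` for the stuck-clock case matters because it tells the caller to fall back to a hardware reset or a power cycle of the sensor instead of retrying forever.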

Your best bet is a hardware reset line to the I2C sensors, if that's possible. This bit-banging is more of a Hail Mary last resort. Another alternative is to have a way of programmatically powering down the remote sensors in an attempt to reset them. Or, get rid of the I2C sensors and use something more robust.

Thanks for the feedback. When you say use a more robust humidity and temperature sensor, can you point me to one? This is the most robust I could find that works with Arduino. Also, do you think I could use an Uno as a watchdog? Have a pin on the Uno that, if not triggered within the configured amount of time, will open a normally closed relay for 10 seconds to cut the power to the Yun and then give it time to come back. I can also have 2 temp probes on the Uno and have it cut the main power feed if things get too hot.

jfourie: When you say use a more robust humidity and temperature sensor, can you point me to one?

I've not worked with any humidity sensors, so I can't give any specific recommendations. The only temperature sensors I've used so far are OneWire DS18B20, and those have only been on the workbench so far - not in a practical installation. The systems I work on are typically measuring voltages, currents, accelerations, and other physical properties, and not so much environmental conditions.

Also, do you think I could use an Uno as a watchdog? Have a pin on the Uno that, if not triggered within the configured amount of time, will open a normally closed relay for 10 seconds to cut the power to the Yun and then give it time to come back.

An interesting idea! It's kind of like the external hardware watchdog chip I mentioned before, but this gives it the possibility of actually cycling the power. Interesting.
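As a rough sketch of how the supervising board's logic might look, here is a small state machine: monitor heartbeats, cut power for a fixed time when they stop, then allow a boot grace period before monitoring again. All names (`PowerCycleSupervisor`, `SupervisorState`) and the idea of a separate grace period are assumptions for illustration; the timing values are up to the installation, and times are assumed to come from a free-running millisecond counter.

```cpp
#include <stdint.h>

enum class SupervisorState { Monitoring, PowerOff, BootGrace };

class PowerCycleSupervisor {
public:
    PowerCycleSupervisor(uint32_t heartbeatTimeoutMs, uint32_t powerOffMs,
                         uint32_t bootGraceMs, uint32_t nowMs)
        : heartbeatTimeoutMs_(heartbeatTimeoutMs), powerOffMs_(powerOffMs),
          bootGraceMs_(bootGraceMs), state_(SupervisorState::BootGrace),
          stateSinceMs_(nowMs), lastHeartbeatMs_(nowMs) {}

    // Called when the supervised board toggles its heartbeat pin.
    void heartbeat(uint32_t nowMs) { lastHeartbeatMs_ = nowMs; }

    // Advance the state machine. Returns true while the relay should be
    // opened (power to the supervised board cut).
    bool update(uint32_t nowMs) {
        const uint32_t inState = nowMs - stateSinceMs_;  // wrap-safe
        switch (state_) {
        case SupervisorState::Monitoring:
            if (nowMs - lastHeartbeatMs_ >= heartbeatTimeoutMs_) {
                enter(SupervisorState::PowerOff, nowMs);
            }
            break;
        case SupervisorState::PowerOff:
            if (inState >= powerOffMs_) {
                enter(SupervisorState::BootGrace, nowMs);
            }
            break;
        case SupervisorState::BootGrace:
            if (inState >= bootGraceMs_) {
                lastHeartbeatMs_ = nowMs;  // restart the timeout after boot
                enter(SupervisorState::Monitoring, nowMs);
            }
            break;
        }
        return state_ == SupervisorState::PowerOff;
    }

    SupervisorState state() const { return state_; }

private:
    void enter(SupervisorState s, uint32_t nowMs) {
        state_ = s;
        stateSinceMs_ = nowMs;
    }

    uint32_t heartbeatTimeoutMs_;
    uint32_t powerOffMs_;
    uint32_t bootGraceMs_;
    SupervisorState state_;
    uint32_t stateSinceMs_;
    uint32_t lastHeartbeatMs_;
};
```

The grace period is there so that a slow boot is not mistaken for another hang, which would otherwise power-cycle the board in a loop.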

I can also have 2 temp probes on the Uno and have it cut the main power feed if things get too hot.

Not a bad idea, but I still wouldn't count on that as a high temperature cut-off. I would use a snap-disc thermostat with an appropriate temperature and current rating - mount it near the heat source, and run the heater power (or relay control signal) through the thermostat. If the temperature gets too high, the thermostat opens, and cuts power to the heater. You don't really want ANY software in the safety loop, a purely mechanical limit switch is safest. There are also manual reset versions which might have some appeal in this application?