Program stops running in mid-execution

I built a system to open and close a ventilation flap on my greenhouse according to temperature. Very simple: a robust linear actuator is powered for a little longer than needed to fully extend or fully retract (using the built-in limit switches in the actuator to actually turn off at the extremes). The power to the actuator is controlled by 2 relays, which reverse the polarity appropriately to advance or retract the actuator.
The system has worked flawlessly and continuously for the past 6 months except for a failure several months ago, which was repeated again today. On both occasions, the flap was closing, (running the ActuatorRetract routine in the code below) when it abruptly ceased running, with the actuator roughly half retracted. Pressing the reset button on the Arduino did not trigger a resumption of the motor in the actuator. But... briefly cutting the power to the Arduino and reconnecting it, did. And it happily went on to fully close, following which the limit switch in the actuator itself cut the power to the actuator, followed a short time later by the Arduino cutting the power to the relays.
(And just to clarify, the relays are isolated from the Arduino by opto-couplers, and are powered from a separate power supply, so no back currents etc. And the actuator itself is equally separately powered.) The "fix" here involved stopping and restarting the power specifically to the Arduino; the power to the relays and to the actuator was not affected. And, to add an additional factor, when the system had frozen in mid-retraction, the LED on the Arduino remained lit, indicating that the Arduino itself was still powered.
All of which makes me think the problem is not hardware related, but either a subtle error in my code, or (is this crazy?) the timer in the Arduino having a finite limit before it runs out of room. (That is, does an Arduino need to be completely rebooted at intervals of every few months?)
My apologies for the lengthy explanation. Sketch follows:

//connect red wire of AM2302 to 5V (on left side of board)
// connect white or yellow wire,(data out), to DHTPIN
//connect black wire (ground) to GND pin (on left side)
// connect 10K pull-up resistor between VCC and data

// To run display portion, connect Arduino, then choose Port,(3), under Tools
// Then, also under Tools, choose Serial Monitor, (or Ctrl-Shift-M)

#include <DHT.h>
#include <DHT_U.h>

#define DHTPIN 6
#define DHTTYPE DHT22
#define extendPin 2
#define retractPin 3

DHT dht(DHTPIN, DHTTYPE);

// Variables
boolean varExtended = false;   // to control repetition of action
boolean varRetracted = false;
float temp_Value = 0;  // to store temperature reading
const int extendTemp = 28;
const int retractTemp = 25;


void setup() {
  //set control pins to OUTPUT, and initialise to HIGH. (They will default to LOW otherwise)
  digitalWrite(extendPin, HIGH);
  digitalWrite(retractPin, HIGH);

  pinMode(extendPin, OUTPUT);
  pinMode(retractPin, OUTPUT);

  // Initialize sensor
  dht.begin();
  // Initialize display of readings
  Serial.begin(9600);
}


void loop()
{
  getTemp();

  if (temp_Value > extendTemp && varExtended == false)  // Open flap
  {
    actuatorExtend();
  }

  if (temp_Value < retractTemp && varRetracted == false)  // Close flap
  {
    actuatorRetract();
  }
}   // End of loop


void getTemp()
{
  // Wait 5 seconds between  measurements to permit sensor to read
  delay (5000);

  // Get temperature and print its value.
  temp_Value = dht.readTemperature();

  //Check if read failed and exit, (to try again)
  if (isnan(temp_Value))
  {
    Serial.println("Error reading temperature!");
    return;
  }
  // Serial.print (" %\t");
  Serial.print("Temperature: ");
  Serial.print(temp_Value);
  Serial.println(" °C");
}


void actuatorExtend()
{
  digitalWrite(extendPin, LOW);
  digitalWrite(retractPin, HIGH);   // Closes contacts on forward relay
  delay (40000);                     // Runs actuator out fully
  varExtended = true;              // Prevents routine running repeatedly
  varRetracted = false;            // Should be defaulted to false, but just to be sure
  digitalWrite(extendPin, HIGH);    // Sets both relays off
}


void actuatorRetract()
{
  digitalWrite(extendPin, HIGH);
  digitalWrite(retractPin, LOW);   // Closes contacts on reverse relay
  delay (40000);                     // Retracts actuator fully
  varRetracted = true;            // Prevents routine running repeatedly
  varExtended = false;            // Should be defaulted to false, but just to be sure
  digitalWrite(retractPin, HIGH);    // Sets both relays off
}

A code error should be fixed by a reset, so I don't think that's it. If you have to physically interrupt power, that seems more like a hardware issue. Maybe the regulator overheated, or the USB chip glitched out and held the board in RESET.

What board are you using?

maxwelldm:
The system has worked flawlessly and continuously for the past 6 months except for a failure several months ago, which was repeated again today. On both occasions, the flap was closing, (running the ActuatorRetract routine in the code below) when it abruptly ceased running, with the actuator roughly half retracted. Pressing the reset button on the Arduino did not trigger a resumption of the motor in the actuator. But... briefly cutting the power to the Arduino and reconnecting it, did.

I'd start with the power, connectors, solder connections.

Could an image of the setup be posted?

I'd also go over power and connections and verify that the actuator is not injecting undue noise into the Arduino.

You might consider adding some debug functionality; you likely don't have a laptop connected all the time so you might think about adding some functionality to the built-in LED that gives an indication (a) that the board is running and (b) where in the code it is.

To make this happen you'll have to consider re-doing the logic to use millis() timing and a state machine for the vent control. You could then have the on-board LED blink at various speeds to help show where the code actually is. For example:

  • in the main loop, have it toggle once a second
  • in the vent-open code, have it toggle once every half-second
  • in the vent-closed code, have it toggle once every quarter-second
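A minimal sketch of the blink-rate idea above, under some assumptions: the `Phase` names, `toggleIntervalMs()`, `intervalElapsed()` and `heartbeat()` are hypothetical helpers invented for illustration, and the vent-control code that would set `currentPhase` is left out. The Arduino-specific part is guarded by `#ifdef ARDUINO` so the two helper functions can also be checked on a PC.

```cpp
#include <stdint.h>

// Hypothetical phase names for illustration; not taken from the original sketch.
enum Phase : uint8_t { PHASE_IDLE, PHASE_OPENING, PHASE_CLOSING };

// LED toggle interval for each phase, in milliseconds,
// following the rates suggested in the list above.
uint32_t toggleIntervalMs(Phase p)
{
  switch (p) {
    case PHASE_OPENING: return 500;   // vent-open code: toggle every half-second
    case PHASE_CLOSING: return 250;   // vent-close code: toggle every quarter-second
    default:            return 1000;  // main loop: toggle once a second
  }
}

// True once at least `interval` ms have passed since `last`.
bool intervalElapsed(uint32_t now, uint32_t last, uint32_t interval)
{
  return (uint32_t)(now - last) >= interval;
}

#ifdef ARDUINO
Phase currentPhase = PHASE_IDLE;  // assumed to be updated by the vent-control code

// Call on every pass through loop(); never blocks.
void heartbeat()
{
  static uint32_t lastToggle = 0;
  static bool ledOn = false;

  uint32_t now = millis();
  if (intervalElapsed(now, lastToggle, toggleIntervalMs(currentPhase))) {
    lastToggle = now;
    ledOn = !ledOn;
    digitalWrite(LED_BUILTIN, ledOn ? HIGH : LOW);
  }
}

void setup() { pinMode(LED_BUILTIN, OUTPUT); }
void loop()  { heartbeat(); }
#endif
```

If the LED stops toggling altogether the next time the flap stalls, that would suggest the processor is not running code at all; if it keeps blinking at the closing rate, the sketch is alive and the fault is more likely downstream.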

If you could add some more LEDs it could be even clearer...

Don't have a loop where the LED flash code cannot be reached or if you do, prepare to ID that as a possible place where the code could get stuck.

I agree with JiggyN: a reset really should clear any software condition including, say, a divide-by-zero trap. The LED blinking would give you an indication as to whether the thing is actually running any code at all.

One thing I notice about your code is that if the temperature sensor returns NaN, you bail out of getTemp() but you still check/use the value in temp_Value in loop(). Think about returning a boolean, true if the value is good and false if it's NaN; then use that return value as a gate to actually using the value in temp_Value.
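Something along these lines would do the gating. This is only a sketch: `readingIsValid()`, `decide()` and the `VentAction` names are made up for illustration, and the pin number and thresholds are simply copied from the posted sketch. Comparisons against NaN should evaluate false anyway, so the original probably skips both branches on a bad read; the boolean just makes that explicit instead of accidental.

```cpp
#include <math.h>

// Hypothetical names for illustration only.
enum VentAction { VENT_NONE, VENT_OPEN, VENT_CLOSE };

// A DHT read failure shows up as NaN.
bool readingIsValid(float t)
{
  return !isnan(t);
}

// Decide what to do with a reading; an invalid reading never triggers a move.
VentAction decide(bool valid, float tempC, float openAbove, float closeBelow)
{
  if (!valid)             return VENT_NONE;
  if (tempC > openAbove)  return VENT_OPEN;
  if (tempC < closeBelow) return VENT_CLOSE;
  return VENT_NONE;
}

#ifdef ARDUINO
#include <DHT.h>

DHT dht(6, DHT22);   // same pin and sensor type as the posted sketch
float tempC = 0;

// Returns true only when tempC holds a usable reading.
bool getTemp()
{
  tempC = dht.readTemperature();
  return readingIsValid(tempC);
}

void setup() { dht.begin(); }

void loop()
{
  delay(5000);                 // sensor needs time between reads
  bool ok = getTemp();         // read first, then use the value
  VentAction a = decide(ok, tempC, 28.0f, 25.0f);
  (void)a;                     // VENT_OPEN / VENT_CLOSE would start the actuator here
}
#endif
```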

Seems really odd that the lockup would occur in mid-travel of the actuator. Since you are delaying for 40 seconds to allow ample time for full travel of the actuator, the only thing the Arduino would be doing is handling interrupts for things like the millis() counter and the serial output.

I was going to suggest a problem with the DHT22, since resetting the Arduino would not affect that the same as a power cycle, but I can't really see that locking up the Arduino in the middle of a delay, unless a power spike is locking up both the Arduino and DHT22 simultaneously. Although a possibility is that you are getting power spikes that are causing the Arduino to reset, but are unnoticeable except in the rare case when the DHT22 also locks up.

While I believe the code could be simpler, one possibility is that a power glitch simply screwed up the processing. I think a simple verification would be to flash an LED periodically to indicate that the program is running, and check that it is flashing the next time this occurs.

You could also flash it differently, some # of quick flashes each second depending on where the code is (e.g. extend, retract, loop()). (Yes, you'd need to use millis() instead of delay().)
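One way the "some number of quick flashes each second" pattern could be done without delay() is sketched below. The function name `ledShouldBeOn()`, the 100 ms on / 100 ms off timing, and the `flashCount` variable are all assumptions chosen for illustration; with this timing, up to about three flashes per second leaves a visible pause before the pattern repeats.

```cpp
#include <stdint.h>

// Hypothetical helper: within each 1000 ms cycle, emit `flashes` pulses of
// 100 ms on / 100 ms off, then stay dark for the rest of the cycle.
bool ledShouldBeOn(uint32_t nowMs, uint8_t flashes)
{
  const uint32_t cycleMs = 1000;  // pattern repeats every second
  const uint32_t slotMs  = 200;   // one flash occupies 200 ms
  const uint32_t onMs    = 100;   // lit for the first half of its slot

  uint32_t t = nowMs % cycleMs;             // position inside the current cycle
  if (t >= (uint32_t)flashes * slotMs) {
    return false;                           // past the last flash: pause
  }
  return (t % slotMs) < onMs;
}

#ifdef ARDUINO
// Assumed to be set by the control code, e.g. 1 = loop(), 2 = extend, 3 = retract.
uint8_t flashCount = 1;

void setup() { pinMode(LED_BUILTIN, OUTPUT); }

void loop()
{
  digitalWrite(LED_BUILTIN, ledShouldBeOn(millis(), flashCount) ? HIGH : LOW);
}
#endif
```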

I am just using an Arduino Uno. The suggestions are excellent, and I am most grateful to everybody.
I don't think Blackfin's suggestion of noise from the actuator is likely to be the problem, because I went overboard in isolating the Arduino from everything possible - it has a wholly separate power supply, is isolated from the relays by an opto-coupler, and equally has no connection with the power supply to the actuator.
I like the idea of flashing an LED as a signal of where the program might have stalled. In particular the guidance to use millis() for delay timing is extremely valuable. (I am very much a newbie here, and not familiar with best practices.) I did wonder whether there might be hidden interrupts in the timer which might not be apparent...
I am less convinced that the problem lies with the physical electrical connections because when the system freezes, the LED on the Arduino which indicates that it is powered remains lit - it seems to me that the issue lies in the timer, which is going awry occasionally. (And the concept of voltage spikes etc. throwing it off sounds entirely plausible.) The Arduino is powered by a plug-in switching power supply cube, and hence random spikes in the 110V line might, I suppose, propagate through into the 9V output from the cube. (I don't know how closely regulated the output from these little supplies is.)
I didn't post the circuit diagram because I wasn't sure how to create such. (Does one draw it out by hand and then photograph it? Photograph the physical layout? Use some drawing program to generate a proper circuit diagram?)
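On the worry above that the timer has "a finite limit before it runs out of room": millis() returns a 32-bit unsigned count, so it wraps back to zero after 2^32 ms, roughly 49.7 days. That wrap is harmless as long as elapsed time is computed by unsigned subtraction, and no periodic reboot is needed for that reason. A small sketch of the arithmetic (the helper name `elapsedMs()` is made up for illustration):

```cpp
#include <stdint.h>

// Milliseconds from `start` to `now`. Because the subtraction is done in
// 32-bit unsigned arithmetic, the result is still correct when the counter
// has wrapped past zero between the two readings.
uint32_t elapsedMs(uint32_t now, uint32_t start)
{
  return (uint32_t)(now - start);
}

#ifdef ARDUINO
void setup() { Serial.begin(9600); }

void loop()
{
  static uint32_t last = 0;
  uint32_t now = millis();

  // Fires every 5 seconds, including across the ~49.7 day wrap.
  if (elapsedMs(now, last) >= 5000UL) {
    last = now;
    Serial.println(now);
  }
}
#endif
```

For example, a start time of 0xFFFFFFF0 (16 ms before the wrap) and a current time of 5 gives an elapsed value of 21 ms, which is what actually passed.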

I don't think Blackfin's suggestion of noise from the actuator is likely to be the problem, because I went overboard in isolating the Arduino from everything possible - it has a wholly separate power supply, is isolated from the relays by an opto-coupler, and equally has no connection with the power supply to the actuator.

So has the actuator got any components to suppress the interference it generates? Interference can be transmitted over the air like a radio transmitter, it does not have to be conducted by direct connection of wires.

That is, in fact, radiated interference. However, with such a long time between malfunctions it could be something like an unsuppressed motorbike going past or a taxi transmitter going off at a critical time.

I would use the watch dog timer to ensure a reset if the code hangs.
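A rough sketch of how the watchdog might be fitted to the existing blocking code, assuming an AVR board such as the Uno and the avr-libc `<avr/wdt.h>` header. The longest watchdog timeout there is 8 seconds, so a 40 second delay has to be broken into slices with a `wdt_reset()` between them. The helpers `nextSliceMs()` and `waitWithWatchdog()` are hypothetical names. Whether a watchdog reset would clear this particular fault is an open question, given that the reset button reportedly did not.

```cpp
#include <stdint.h>

// Length of the next wait slice: never longer than sliceMs or than what remains.
// A slice length of 0 is treated as 1 ms so the caller's loop always makes progress.
uint32_t nextSliceMs(uint32_t remainingMs, uint32_t sliceMs)
{
  if (sliceMs == 0) {
    sliceMs = 1;
  }
  return (remainingMs < sliceMs) ? remainingMs : sliceMs;
}

#ifdef ARDUINO
#include <avr/wdt.h>

// Blocking wait that keeps the watchdog fed once per slice.
void waitWithWatchdog(uint32_t totalMs)
{
  while (totalMs > 0) {
    uint32_t step = nextSliceMs(totalMs, 1000UL);
    delay(step);
    wdt_reset();          // tell the watchdog the sketch is still alive
    totalMs -= step;
  }
}

void setup()
{
  wdt_enable(WDTO_8S);    // reset the board if not fed for about 8 seconds
}

void loop()
{
  waitWithWatchdog(5000UL);   // stands in for the 5 s pause before each reading
  // ... read the sensor and switch the relays as in the posted sketch,
  // using waitWithWatchdog(40000UL) in place of delay(40000) ...
}
#endif
```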

maxwelldm:
I didn't post the circuit diagram because I wasn't sure how to create such. (Does one draw it out by hand and then photograph it? Photograph the physical layout? Use some drawing program to generate a proper circuit diagram?)

Drawing by hand is fine, as well as any real schematic diagram generated by your choice of software. Avoid the more graphical type diagram that uses pictures of components instead of component symbols, such as the commonly seen Fritzing diagrams. Pictures of the actual hardware are always good when there is an elusive problem such as this.

I am less convinced that the problem lies with the physical electrical connections because when the system freezes, the LED on the Arduino which indicates that it is powered remains lit - it seems to me that the issue lies in the timer, which is going awry occasionally.

The power LED does not really tell you anything, except that there is power in some form. A very noisy power supply, or even a supply that has brief intervals where the voltage drops to 0 volts, can still light the LED with an unnoticeably slight drop in brightness.

Can you plug an incandescent light into the same outlet as the Arduino power supply, and see if the light blinks when the actuator starts up or is running? That is an easy way to detect when you are getting a good size spike from a motor starting.

How are you making connections to the UNO, jumper wires or a shield plugged into the UNO? My preference for a long-term project would be a Nano without headers, so that the wiring can be soldered in place.

Based on my past bad experience with the little blue Chinese relays, I bet that the OP is using the same type of relays, rather than the industrial type he should be using.
Paul

Here's something you can try that does the LED blinking I mentioned. It also doesn't use blocking so if the Arduino is actually running code, you should be able to tell the next time the problem occurs.

You should, of course, verify that the program still performs as you intended.

//connect red wire of AM2302 to 5V (on left side of board)
// connect white or yellow wire,(data out), to DHTPIN
//connect black wire (ground) to GND pin (on left side)
// connect 10K pull-up resistor between VCC and data

// To run display portion, connect Arduino, then choose Port,(3), under Tools
// Then, also under Tools, choose Serial Monitor, (or Ctrl-Shift-M)

#include <DHT.h>
#include <DHT_U.h>

#define TIME_READ_TEMP      5000ul
#define TIME_EXTEND         40000ul
#define TIME_RETRACT        40000ul

#define DHTPIN 6
#define DHTTYPE DHT22

DHT dht(DHTPIN, DHTTYPE);

const uint8_t
    pinLED = LED_BUILTIN,
    extendPin = 2,
    retractPin = 3;
const float
    extendTemp = 28.0,
    retractTemp = 25.0;
const uint32_t
    grLEDDelays[] = { 0ul, 250ul, 500ul, 1000ul, 2000ul };

// Variables
float 
    temp_Value = 0;  // to store temperature reading

uint32_t
    timeNow,
    timeTemp,
    timeAct;
bool
    bValid = false;

enum
{
    ST_INIT=0,
    ST_CLOSING,
    ST_CLOSED,
    ST_OPENING,
    ST_OPEN  
};

void setup( void ) 
{
    //set control pins to OUTPUT, and initialise to HIGH. (They will default to LOW otherwise)
    digitalWrite(extendPin, HIGH);
    digitalWrite(retractPin, HIGH);
    
    pinMode(extendPin, OUTPUT);
    pinMode(retractPin, OUTPUT);
    pinMode( pinLED, OUTPUT );
    digitalWrite( pinLED, LOW );
    
    // Initialize sensor
    dht.begin();
    
    // Initialize display of readings
    Serial.begin(9600);
    
}//setup


void loop( void )
{
    static uint8_t
        state = ST_INIT;
        
    timeNow = millis();
    if( (timeNow - timeTemp) >= TIME_READ_TEMP )
    {
        timeTemp = timeNow;
        bValid = getTemp();
        
    }//if
                    
    switch( state )
    {
        case    ST_INIT:
            //initialize to "home" the flap closed
            digitalWrite(extendPin, HIGH); //extend relay off
            digitalWrite(retractPin, LOW); //retract relay on
            timeAct = timeNow;
            state = ST_CLOSING;                
            
        break;
        
        case    ST_CLOSING:
            if( (timeNow - timeAct) >= TIME_RETRACT )
            {
                digitalWrite(retractPin, HIGH); //retract relay off
                state = ST_CLOSED;                
                
            }//if
            
        break;
        
        case    ST_CLOSED:
            if( bValid )
            {
                bValid = false;
                if( temp_Value > extendTemp )
                {
                    timeAct = timeNow;
                    digitalWrite(extendPin, LOW); //extend relay on
                    state = ST_OPENING;
                    
                }//if
                    
            }//if
            
        break;

        case    ST_OPENING:
            if( (timeNow - timeAct) >= TIME_EXTEND )
            {
                digitalWrite(extendPin, HIGH); //extend relay off
                state = ST_OPEN;                
                
            }//if
            
        break;

        case    ST_OPEN:
            if( bValid )
            {
                bValid = false;
                if( temp_Value < retractTemp )
                {
                    timeAct = timeNow;
                    digitalWrite(retractPin, LOW); //retract relay on
                    state = ST_CLOSING;
                    
                }//if
                    
            }//if
        
        break;              
        
    }//switch

    //blink the LED according to the state
    LEDIndicator( state );
    
}//loop

void LEDIndicator( uint8_t state )
{
    static bool
        bLEDState = false;
    static uint32_t
        timeLED,
        timeLED_Delay;

    timeLED_Delay = grLEDDelays[state];
    
    uint32_t timeNow = millis();
    if( (timeNow - timeLED) >= timeLED_Delay )
    {
        timeLED = timeNow;
        bLEDState ^= true;
        digitalWrite( pinLED, bLEDState );        
        
    }//if
    
}//LEDIndicator

bool getTemp()
{
    // Get temperature and print its value.
    temp_Value = dht.readTemperature();
    
    //Check if read failed and exit, (to try again)
    if( isnan(temp_Value) )
    {
        Serial.println("Error reading temperature!");
        
        return false;
        
    }//if
    else
    {
        // Serial.print (" %\t");
        Serial.print("Temperature: ");
        Serial.print(temp_Value);
        Serial.println(" °C");

        return true;
        
    }//else
    
}//getTemp

It's odd that reset doesn't fix it. The fact that it failed on retract both times suggests electrical issues but twice in six months isn't exactly conclusive.

I've seen projects where there was something electrically noisy (fridges especially) that caused trouble for the Arduino when they started up. A decoupling cap helped in that situation. Maybe the long duration between failures is because several things need to happen at the same time, including retraction to cause a problem.

It will be difficult to debug because it'll be so hard to be sure you've fixed it. I'd just go with Grumpy_Mike's solution.

I am blown away by Blackfin's willingness to go far beyond providing advice (which is all I hoped for), to actually writing code for me. I am profoundly grateful. But I got a lot more out of this intervention than possible solutions to my particular problem. Blackfin's coding style is exceedingly elegant, and I have learned a lot from it. (I feel it should be posted somewhere as a tutorial for newbies like me, i.e. not specific to this issue, but as a model of style. But I am not sure how this could be done...) In particular it has become clearer to me that I need to pay a lot more attention to the choice of data types to represent variables. (I was also taken with the commenting on the close of each block to keep track of ends. I have always relied on my editor, which automatically inserts a closing bracket, and on progressive indentation, to keep track; that was in programming a database program, which is my sole prior experience. But this is not entirely reliable. And marking each one makes it much easier to catch missing closes.) I thank you.
