
Topic: Arduino Randomly Freezing During Long Jobs

aibonewt

Hi,

I've built a 6-DOF micro pick 'n' place robot which I will use to assist me in the creation of various digital artworks. My problem is that the machine runs beautifully for many hours, but then freezes for some unknown reason. This stalling always happens after the previous command has been carried out cleanly, and can occur anywhere (according to my last 20+ logfiles) between 29 seconds and 15 hours into the job.

I originally blamed the VB serial port for this (I left a huge thread on the interfacing section of this forum), but after successfully running the code for over three days with just my Arduino UNO parsing instructions via USB from the laptop, I decided it must be an electronics problem. I designed my stepper board using LochMaster, which is great for quick prototyping, but doesn't produce a pretty schematic view so forgive this ugly graphic of my setup.



This is my first time building such a contraption, so I'm hoping that the cause of the freezing will be obvious to someone out there. My main X/Y supply is a 20V 4A laptop supply, and the 7V for my tool module comes from a 1.3A switching wall wart. And before you tell me those supplies are squarely to blame, I can still get the machine to stall with both of them unplugged and just logic going into the board via the left-hand connector. Also, I took the precaution of disabling BOD on the ATmega328P, but it made no difference.

Thanks for your time,

Thomas

Leon Heller

#1
Dec 01, 2012, 09:27 pm Last Edit: Dec 01, 2012, 09:29 pm by Leon Heller Reason: 1
Spikes from the motors are probably affecting the AVR. Your layout looks very poor - you need to improve the power and ground distribution.
Leon Heller
G1HSM

dc42

Two obvious possibilities:

1. Transients (e.g. from the motors) upsetting the Arduino. See previous reply.

2. A problem in the code. You haven't posted your code so I can only make general suggestions. Are you using the String class, or doing anything else that involves dynamic memory allocation? If so, don't. Is your code robust with respect to unexpected inputs? Is the RAM usage comfortably within the available RAM?
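
To make the "unexpected inputs" question concrete, here is a minimal sketch, assuming the commands are parsed with strtok() and atoi() (as in the sketch posted later in this thread). strtok() returns NULL when no further field exists, and passing NULL to atoi() is undefined behaviour, so a truncated command can crash the parser. The helper name nextInt and its parameters are hypothetical, not from the original code.

```cpp
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns the next delimited integer field, or
// `fallback` (and clears *ok) when the field is missing, instead of
// calling atoi(NULL).
int nextInt(char* first, const char* delims, int fallback, bool* ok) {
  char* tok = strtok(first, delims);  // first == NULL continues the previous scan
  if (tok == NULL) {
    if (ok != NULL) *ok = false;      // record that the command was incomplete
    return fallback;
  }
  return atoi(tok);
}
```

A caller would pass the buffer on the first call and NULL afterwards, then reject the whole command if `ok` ended up false rather than moving the machine on partial data.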
Formal verification of safety-critical software, software development, and electronic design and prototyping. See http://www.eschertech.com. Please do not ask for unpaid help via PM, use the forum.

retrolefty

I don't see enough bypass capacitors mounted on that board layout drawing. If that is true, your board is truly acting like an innocent young girl walking unknowingly into the red-light district of real-world EMI and digital switching noise. Good engineering practice would have 0.1 µF caps mounted very close to the Vcc and ground pins of each and every chip, and one or two larger (20-100 µF) caps across the Vcc and ground entry points to the board. Don't let Grumpy Mike see that picture.  ;)

Lefty

aibonewt

Thanks for the suggestions!

I'd suspected that there wasn't enough smoothing going on, so last night I started a long test with a 470µF cap across the 5V/GND on the Arduino, which seemed to work better (i.e. it stalled after nearly 15.5hrs).

@dc42 - I have no String class in my code, it works fine without my driver board attached and it's well within my available RAM. Thanks for asking.

I understand the comments about motor transients, but I'm still a bit worried that a test I did earlier with only the driver board's 5V/GND connected to the Arduino still managed to fail after a few hours, despite the complete absence of switching loads and motor supplies. I just can't understand how that would happen, especially as there was no other power going to the board.

I will add caps to all my ICs today, and see if that helps...



aibonewt

All chips have decoupling caps, and I've put a 22µF across the 5V. Test failed after 3hrs 24m...

So, here's my code, though I'm sure it's okay (I'll put the xyzservos handler in the next post)

Code:
//
// MMM_matrix_Plotter3a (pairs with MP3 New VB)
// tjn 20/11/2012
//
// Status: Interleaves X, Y1, Y2 and Z(B)
// Note: 78 Steps/mm using B servo in Full-wave mode, slack = 36

#include <Wire.h>
#include <iox.h>

// Setup pins (0 and 1 reserved for serial I/O)
byte xctrl1 = 2;                // X-Axis geartrain control 1
byte xctrl2 = 3;                // X-Axis geartrain control 2
byte yctrl1 = 4;                // South-Y geartrain control 1
byte yctrl2 = 5;                // South-Y geartrain control 2
byte y2ctrl1 = 7;               // North-Y geartrain control 1
byte y2ctrl2 = 8;               // North-Y geartrain control 2
byte tpower = 9;                // Trolley stepper supply
byte tctrl1 = 10;               // Trolley control 1
byte tctrl2 = 11;               // Trolley control 2
byte xypower = 12;              // XY stepper supply
byte lpower = 13;               // Laser logic supply
byte xsensors = 15;             // X-buffer switches
byte ysensors = 14;             // Y-buffer switches
byte scaleFactor = 1;
byte offset1, offset2, space, cut, gap;
byte xStepIdx, y1StepIdx, y2StepIdx, zStepIdx;
int xSteps, y1Steps, y2Steps, zSteps, totalSteps, subSteps, bitIndex, laserTime, tMult;
int xStore, yStore, zStore, pixCount, iterations, dotCount, max1, max2, iTimeIdx = 0;
char inStr[40];                 // Hold incoming data
byte index = 0;
boolean stringComplete = false; // Data complete flag
boolean xStop = false;
boolean yStop = false;
boolean xDir = false;           // TRUE = Clockwise
boolean yDir = false;
boolean zDir = false;
boolean oxDir = false;
boolean oyDir = false;
boolean ozDir = false;
boolean laserOn = false;
long MsDelay;
unsigned char twoWire[] = {
  B01,B11,B10,B00};             // 2-Wire sequence for X/Y steppers
word fullWaveB[] = {            // Full-wave Slave stepper motor sequence
  0x6000,0x2010,0x18,0x4008};
word lampState = 0x0000;
double m, mx, my1, my2, mz, x, y1, y2, z;

void setup() {
  Serial.begin(38400);         // was 38400
  pinMode(xctrl1, OUTPUT);    // X stepper pin1
  pinMode(xctrl2, OUTPUT);    // X stepper pin2
  pinMode(yctrl1, OUTPUT);    // Y1 stepper pin1
  pinMode(yctrl2, OUTPUT);    // Y1 stepper pin2
  pinMode(y2ctrl1, OUTPUT);   // Y2 stepper pin1
  pinMode(y2ctrl2, OUTPUT);   // Y2 stepper pin2
  pinMode(tctrl1, OUTPUT);    // Trolley pin1
  pinMode(tctrl2, OUTPUT);    // Trolley pin2
  pinMode(lpower, OUTPUT);    // Laser power
  pinMode(xypower, OUTPUT);   // XY stepper power
  pinMode(tpower, OUTPUT);    // Trolley power
  pinMode(xsensors, INPUT);   // X-buffer switches
  pinMode(ysensors, INPUT);   // Y-buffer switches
  digitalWrite(xypower, LOW); // Power-down XY motors
  digitalWrite(lpower,LOW);   // Power-down laser
  digitalWrite(tpower,LOW);   // Power-down trolley
  Wire.begin();               // Start 2-wire communications (Arduino as master device)
  IOX.device(0x74, 16);       // 0x74 is address for Servo A (Pitch)
  IOX.write(0x0080, CFGPORT); // P07=INPUT Set ports LOW to make them OUTPUTS
  IOX.write(0x0000, INVPORT); // Set slave device invert ports to all NON-INVERT
  IOX.write(0x000, OUTPORT);  // Power-down Lamp/Fan
  Serial.println("OK?");
  delay(100);
}

void loop() {
  if (stringComplete) {
    xSteps = atoi(strtok(inStr, "xy"));   // X Transit
    y1Steps = atoi(strtok(NULL, "z"));    // Y Transit
    zSteps = atoi(strtok(NULL, "l"));     // Z Transit
    laserTime = atoi(strtok(NULL, ",o")); // Laser On/Off
    offset1 = atoi(strtok(NULL, "s"));    // Offset1
    space = atoi(strtok(NULL, "c"));      // Space
    cut = atoi(strtok(NULL, "g"));        // Cut
    gap = atoi(strtok(NULL, "i"));        // Gap
    iterations = atoi(strtok(NULL, "o")); // Iterations
    offset2 = atoi(strtok(NULL, ""));     // Offset2

    if (xSteps < 0) xDir = true;
    if (xSteps > 0) xDir = false;
    if (xSteps == 0) xDir = oxDir;
    if (y1Steps < 0) yDir = true;
    if (y1Steps > 0) yDir = false;
    if (y1Steps == 0) yDir = oyDir;
    if (zSteps < 0) zDir = false;
    if (zSteps > 0) zDir = true;
    if (zSteps == 0) zDir = ozDir;
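    if (xStop == true && xDir != oxDir) xStop = false; // clear limit flag only when direction reverses
    if (yStop == true && yDir != oyDir) yStop = false; // (was '=', which assigned instead of comparing)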
    xSteps = abs(xSteps);
    y1Steps = abs(y1Steps);
    zSteps = abs(zSteps);
    y2Steps = y1Steps;


    if (zDir != ozDir) { // Vertical Slack Handler
      zSteps += 37;      // Z-Slack value (from laser deflection test)
    }

    totalSteps = max(xSteps, y1Steps);
    subSteps = min(xSteps, y1Steps);
    m = (double)subSteps/(double)totalSteps;

    if (m > 0.7) { // vector splitter/dog-legger to dodge bad harmonics
      digitalWrite(lpower,LOW);
      if (xSteps > y1Steps) {
        xStore = xSteps;
        yStore = y1Steps;
        xSteps = xStore - y1Steps;
        y1Steps = 0;
        y2Steps = 0;
        digitalWrite(xypower, HIGH);
        xyzServos();
        xSteps = yStore;
        y1Steps = yStore;
        y2Steps = y1Steps;
      }
      if (xSteps < y1Steps) {
        xStore = xSteps;
        yStore = y1Steps;
        y1Steps = yStore - xSteps;
        xSteps = 0;
        digitalWrite(xypower, HIGH);
        xyzServos();
        xSteps = xStore;
        y1Steps = xStore;
        y2Steps = y1Steps;
      }
      // else not used (no adjustment needed when X & Y are equal!)
    }

    if (xSteps != 0 || y1Steps != 0 || zSteps != 0) {
      digitalWrite(xypower, HIGH);
      xyzServos(); // rem-out while testing
    }
    else {
      if (laserTime == 1) {
        digitalWrite(lpower, HIGH); // Laser ON
        IOX.write(0x0200, OUTPORT); // Lamp & Fan ON
      }
      else{
        digitalWrite(lpower,LOW);   // Laser OFF
        IOX.write(0x000, OUTPORT);  // Lamp & Fan OFF
      }
    }

    if (xSteps  == 0 && y1Steps == 0) { // switch OFF motors on end vector
      //digitalWrite(xypower, LOW);     // Machine loses registration on power-down!
      if (laserTime == 0) IOX.write(0x0000, OUTPORT); // Turn lamp & fan OFF
    }

    if (xStop == true) Serial.println("X-buffer Hit");
    if (yStop == true) Serial.println("Y-buffer Hit");
    Serial.println("OK"); // Tell VB Arduino's ready to receive next command from vb   
    stringComplete = false;
    oxDir = xDir;
    oyDir = yDir;
    ozDir = zDir;
  }
}

void serialEvent()
{
  while (Serial.available())
  {
    char inChar = Serial.read();
    inStr[index++] = inChar;    // add to the inStr
    inStr[index] = '\0';        // NULL terminate the array
    if (inChar == '\n')
    {                           // Flag if char is vbcrlf
      stringComplete = true;
      index = 0;
    }
  }
}

aibonewt

Code:
void xyzServos() {
  if (zSteps > 0 || (xSteps + y1Steps) < 500) tMult = 2000; // Change acceleration profile to suit Z(B)
  else tMult = 1000;            // stepper motor which stalls below 3ms
  max1 = max(xSteps, y1Steps);
  max2 = max(y1Steps, zSteps);
  totalSteps = max(max1, max2);
  mx = (double)xSteps/(double)totalSteps;
  my1 = (double)y1Steps/(double)totalSteps;
  my2 = (double)y2Steps/(double)totalSteps;
  mz = (double)zSteps/(double)totalSteps;
  x = mx;
  y1 = my1;
  y2 = my2;
  z = mz;
  laserOn = false;
  if (cut > 0 || laserTime > 0) {
    IOX.write(0x0200, OUTPORT); // Turn lamp & fan ON
    lampState = 0x0200;
  }
  if (cut == 0 && laserTime > 0) digitalWrite(lpower, HIGH);
  for (int i = 0; i < totalSteps; i++) {
    x += mx;
    if (x >= 1 && mx != 0 && xStop == false){ // X-stepper control
      x -= 1.0;
      if (xDir == true) {
        if (xStepIdx == 0) xStepIdx = 4;
        xStepIdx--;
      }
      else {
        xStepIdx++;
        if (xStepIdx > 3) xStepIdx = 0;
      }
      if(twoWire[xStepIdx] & 1<<1){
        digitalWrite(xctrl1,HIGH);
      }
      else {
        digitalWrite(xctrl1,LOW);
      }
      if(twoWire[xStepIdx] & 1<<0){
        digitalWrite(xctrl2,HIGH);
      }
      else {
        digitalWrite(xctrl2,LOW);
      }
    }
    y1 += my1;
    if (y1 >= 1 && my1 != 0 && yStop == false){ // Y1(South)-stepper control
      y1 -= 1.0;
      if (yDir == true) {
        if (y1StepIdx == 0) y1StepIdx = 4;
        y1StepIdx--;
      }
      else {
        y1StepIdx++;
        if (y1StepIdx > 3) y1StepIdx = 0;
      }
      if(twoWire[y1StepIdx] & 1<<1){
        digitalWrite(yctrl1,HIGH);
      }
      else {
        digitalWrite(yctrl1,LOW);
      }
      if(twoWire[y1StepIdx] & 1<<0){
        digitalWrite(yctrl2,HIGH);
      }
      else {
        digitalWrite(yctrl2,LOW);
      }
    }   
    y2 += my2;
    if (y2 >= 1 && my2 != 0 && yStop == false){ // Y2(North)-stepper control
      y2 -= 1.0;
      if (yDir == true) {
        if (y2StepIdx == 0) y2StepIdx = 4;
        y2StepIdx--;
      }
      else {
        y2StepIdx++;
        if (y2StepIdx > 3) y2StepIdx = 0;
      }
      if(twoWire[y2StepIdx] & 1<<1){
        digitalWrite(y2ctrl1,HIGH);
      }
      else {
        digitalWrite(y2ctrl1,LOW);
      }
      if(twoWire[y2StepIdx] & 1<<0){
        digitalWrite(y2ctrl2,HIGH);
      }
      else {
        digitalWrite(y2ctrl2,LOW);
      }
    }   
    z += mz;
    if (z >= 1 && mz != 0){ // Z(B)-stepper control
      z -= 1.0;
      if (zDir == true) {
        if (zStepIdx == 0) zStepIdx = 4;
        zStepIdx--;
      }
      else {
        zStepIdx++;
        if (zStepIdx > 3) zStepIdx = 0;
      }
      IOX.write(fullWaveB[zStepIdx] + lampState, OUTPORT);
    }
    if (cut > 0) { // Put lasing code here..
      bitIndex = (i - (offset1 + space)) % (space + cut + space + gap - 1); // - 1 makes interval agree with vb HorizSteps value!
      if (bitIndex == 0 && i < (totalSteps - offset2)) laserOn = true;
      if (bitIndex == cut) laserOn = false;
      if (laserOn == true) {
        digitalWrite(lpower, HIGH); //rem-out while testing
        delay(laserTime);
      }
      else {
        digitalWrite(lpower, LOW);
        delay(3); // fastest stable transit to next cutting point
      }
    }
    else {
      if (i < totalSteps / 2)
        iTimeIdx = i;
      else
        iTimeIdx = totalSteps - i;
      if (iTimeIdx > 180) iTimeIdx = 180;
      MsDelay = (tMult*(3-(sin((270+iTimeIdx)*PI/180)))) - 1000;
      if (laserTime == 0) delayMicroseconds(MsDelay);
      else delay(laserTime);
    }
    if (digitalRead(xsensors) == HIGH && i > 25) xStop = true;
    if (digitalRead(ysensors) == HIGH && i > 25) yStop = true;
    if (xStop == true || yStop == true) digitalWrite(lpower, LOW); // cut laser power if any limit reached
    if (xStop == true && yStop == true) i = totalSteps;            // bomb-out of loop if both limits reached
  }
  digitalWrite(lpower, LOW);
  //IOX.write(0x0000, OUTPORT); // Turn lamp & fan OFF
}
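
The acceleration profile in xyzServos() can be checked on a PC. The sketch below is a host-side copy of the MsDelay formula, not part of the original code; the names stepDelayMicros and kPi are illustrative. At the start of a move (iTimeIdx = 0), sin(270°) = -1, so the delay is 4*tMult - 1000; once iTimeIdx saturates at 180, sin(450°) = 1 and the delay bottoms out at 2*tMult - 1000.

```cpp
#include <math.h>

const double kPi = 3.14159265358979323846;

// Per-step delay in microseconds for step i of a move of totalSteps steps,
// following the same ramp-up/ramp-down indexing as xyzServos().
long stepDelayMicros(int tMult, int i, int totalSteps) {
  int idx = (i < totalSteps / 2) ? i : totalSteps - i;  // distance from nearer end
  if (idx > 180) idx = 180;                             // profile saturates here
  return (long)(tMult * (3.0 - sin((270 + idx) * kPi / 180.0)) - 1000.0);
}
```

With tMult = 1000 this gives roughly 3000 µs at either end of a move, falling to about 1000 µs in the middle. With tMult = 2000 it is roughly 7000 µs falling to about 3000 µs, which agrees with the comment that the Z(B) stepper stalls below 3 ms.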

dc42

The most obvious problem is that in serialEvent, you have a classic buffer overflow if the input received is not as expected. You need to prevent 'index' from incrementing past the last element of the buffer.

aibonewt

Okay, despite this code having run for three and a half days without issue (when only connected to the USB, anyway =()...

What would be the best approach? Add a conditional to check each byte is good before adding it to the array?

dc42


Add a check that 'index' has not reached the end of the array. If it has, you'll need to decide what to do, e.g. wait until you receive the terminating character, then ignore that data and start again by resetting index to zero.

MarkT


Firstly, those decoupling capacitors are always needed with every logic chip. Here the ULNs aren't logic chips, they are just amplifiers in effect, so lack of decoupling can't glitch them into the wrong state, but it's important to have them to reduce the switching noise on the supplies, as they are switching large currents.

Secondly, we know nothing about the cabling between boards - logic signals should not be routed over long wires without taking appropriate steps to prevent crosstalk, reflections, etc. - so how is everything connected?

Another thing that might occasionally be needed in very noisy environments is adding an extra pull-up resistor on the reset pin (1k or so).

And also, have you checked the supply voltages are correct when it's operating? Always worth checking, just in case there's an unexpected issue there.
[ I won't respond to messages, use the forum please ]

aibonewt

#11
Dec 07, 2012, 02:03 am Last Edit: Dec 07, 2012, 02:28 am by aibonewt Reason: 1
Okay,

I've tweaked the code so index won't overrun, and put a 10k pull-up on the reset. Still not working for more than a few hours.

Code:

const unsigned int maxIn = 50;
char inStr[maxIn];              // More than enough for longest command

void serialEvent(){
  while (Serial.available())
  {
    char inByte = Serial.read ();
    switch (inByte)
    {
    case '\n':            // end of command
      inStr [index] = 0;  // terminating null
      stringComplete = true;
      index = 0; 
      break;
    case '\r':            // discard CR
      break;
    default:
      if (index < (maxIn - 1))
        inStr [index++] = inByte;
      break;
    }
  }
}


The thing that still bugs me is that the machine only ever stops AFTER carrying out a command. It gets it from VB, parses it, sends it back for logging, carries it out completely and only THEN freezes. I would've thought that if there were a problem with the wiring it'd just stall at any time, rather than neatly between commands.

I've wasted weeks blaming the serial port, the code, the OS (XP SP3), even the USB drivers themselves. Perhaps I need to put a loop of commands into the Arduino code, run it without using the serial, and see if that manages to stay running?


dc42


Does the PC receive the "OK" after the last command that it carries out completely?

aibonewt

Ah, now THIS is why I went investigating the Serial port in the first place...

No it doesn't!

The Arduino receives the instruction, parses it, bounces it back to the VB's logfile, carries out the command and then just sits there. What happens internally is still a mystery, as the port is locked at this point and the USB has to be physically unplugged and replugged to restore it. On the other thread I even went to the extent of filming the Tx/Rx lights, but this proved fruitless as the Tx will not flash anyway if the port is locked by the laptop. I quit this line of investigation when I found that the code would run continuously without any attachment to the Arduino other than the USB. Here's my other thread, just for a different angle on the problem.

http://arduino.cc/forum/index.php/topic,129286.0.html


billroy

This is an interesting bug.  It smells like a memory leak or corruption crash to me.  From the other thread, it appears you were using Strings - no longer, right?

To detect a possible leak, it might be worth instrumenting how much RAM is free and printing that out every half hour.  There is a magic function for calculating free ram if you search the forums.
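
For reference, a sketch of the free-RAM estimate that is widely circulated for AVR boards; whether this is the exact function meant above is an assumption. It measures the gap between the top of the heap and the current stack pointer. The `#else` branch is only there so the snippet also compiles off-target, where the value is meaningless.

```cpp
// Commonly circulated AVR free-RAM estimate (not from this thread's code).
#if defined(__AVR__)
int freeRam() {
  extern int __heap_start, *__brkval;  // linker/allocator symbols on AVR
  int v;                               // a local: its address approximates the stack pointer
  return (int)&v - (__brkval == 0 ? (int)&__heap_start : (int)__brkval);
}
#else
// Non-AVR builds (e.g. compiling on a PC): no meaningful value, return -1.
int freeRam() { return -1; }
#endif
```

Printing `freeRam()` next to the final `Serial.println("OK")` in loop() would log it once per command; a steadily falling figure would point towards a leak, while a stable one makes corruption the more likely suspect.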

Memory corruption is more likely than memory exhaustion if you have stopped using Strings.  I would Serial.print() the living devil out of the code path between finishing the command successfully and printing OK.  You know it falls off the rails in there somewhere.  One Serial.print per line if that's what it takes.  (Or binary search, if you have a lot of 8-hour test windows…)  If you can figure out which line it fails on, it might help.


-br

