Arduino Nano crashes after some time

Hello,

I am working on a project that uses an Arduino Nano to control Peltier modules, fans and pumps in order to cool down a polymer solution and change its viscosity. For the interface I use an encoder and an SPI OLED screen (SH1106) that I control using the U8g2 library. My temperature sensor is a MAX31865 with the Adafruit breakout board (and thus I use the Adafruit library). Apart from these, I don't use any libraries, and no Strings. Here is the schematic:

Everything works fine in both hardware and software, but only for a few minutes, after which the Arduino completely freezes/crashes and needs to be rebooted manually.
I would like to try and debug my code, but I don't have any debugging programmers.
I did look for over/underflows and memory leaks in my code, but I can't find or think of any. I am thus wondering if it could be caused by one of the libraries I use.
Here is my code:

#include <Arduino.h>
#include <U8x8lib.h>
#include <Adafruit_MAX31865.h>
#include <SPI.h>

// Comment out to use On-Off Control
//#define USE_PID

#define CONTROL_SKIPS 128   // Wait for x loop() iterations to update control

#define SET_BIT(p,n) ((p) |= (0x1 << (n)))
#define CLR_BIT(p, n) ((p) &= ~(0x1 << (n)))
#define GET_BIT(p, n) (((p) & (0x1 << (n))) >> (n)) 

#define OLED_CS 2
#define OLED_DC 9
#define OLED_RST 7
#define RTD_CS 10
#define ENCODER_A 6
#define ENCODER_B 5
#define ENCODER_P 4
#define FP_MOSFET 8     // Fans & pumps mosfet
#define PELT_MOSFET 3   // Peltier modules mosfet (PWM pin)

#define R_REF 430.0    // PT100 R reference
#define R_NOMINAL 100.0    // PT100 R at 0°C

#ifdef USE_PID

// Comment out if necessary, for example KI and KD for solely proportional control
// Proportional constant
#define KP 2
// Integral constant
//#define KI 5
// Derivative constant
//#define KD 7

#else

// Regulation will start when temp is outside of [desTemp - RANGE; desTemp + RANGE]
#define RANGE 1
// Hysteresis for the control. Regulation will stop when temp is inside of [desTemp - RANGE + HIST_RANGE; desTemp + RANGE - HIST_RANGE]
#define HIST_RANGE 0.5

#endif

U8X8_SH1106_128X64_NONAME_4W_HW_SPI u8x8(OLED_CS, OLED_DC, OLED_RST);
Adafruit_MAX31865 rtd = Adafruit_MAX31865(RTD_CS);

// Bitmaps
const byte logo[] PROGMEM = {...};
const byte okay[] PROGMEM = {...};
const byte regu[] PROGMEM = {...};
const byte moon[] PROGMEM = {...};

const int timer = 49911;            // For preloading timer interruption, 65536 - 16MHz/256/4Hz

byte desTemp = 15;
float temp = 20.00;
byte reg = 8;           // 0 for no regulation, 1 for regulation, 2 for sleep, 8 for not yet initialized

/*
 Encoder data byte, uses bitwise operation (from right to left) so as to minimize the number of variables.
 Bits:
 - 0: Last state wheel
 - 1: Current state wheel
 - 2: Direction (0 for CW and 1 for CCW)
 - 3: Last state button
 - 4: Current state button
 - 5 to 7: No use
*/
byte encoder = 0;

ISR(TIMER1_OVF_vect) {
  TCNT1 = timer;
  updateScreen();
}

void setup(void)
{
  // Encoder pins
  pinMode(ENCODER_A ,INPUT);
  pinMode(ENCODER_B, INPUT);
  pinMode(ENCODER_P, INPUT);

  // MOSFETs pins
  pinMode(FP_MOSFET, OUTPUT);
  pinMode(PELT_MOSFET, OUTPUT);

  // Encoder init
  digitalRead(ENCODER_A) ? SET_BIT(encoder, 0) : CLR_BIT(encoder, 0);
  digitalRead(ENCODER_P) ? SET_BIT(encoder, 3) : CLR_BIT(encoder, 3);

  // Timer interrupt
  noInterrupts();
  TCCR1A = 0;
  TCCR1B = 0;
  TCNT1 = timer;   
  TCCR1B |= (1 << CS12);  
  TIMSK1 |= (1 << TOIE1);
  interrupts();

  // RTD init
  rtd.begin(MAX31865_3WIRE);

  // OLED init
  u8x8.begin();
  u8x8.setPowerSave(0);

  // OLED interface
  drawImg(logo, 3, 5, 0, 2);
  u8x8.setFont(u8x8_font_profont29_2x3_r);
  u8x8.drawString(8, 3, "C");
  u8x8.setFont(u8x8_font_8x13B_1x2_r);
  u8x8.drawString(5, 0, "RoboMix");
  u8x8.setFont(u8x8_font_7x14B_1x2_r);
  u8x8.drawString(7, 2, "o");
  u8x8.setFont(u8x8_font_5x8_r);
  u8x8.drawString(0, u8x8.getRows() - 1, "Mesure:");
  u8x8.drawGlyph(u8x8.getCols() - 1, 2, 'E');
  u8x8.drawGlyph(u8x8.getCols() - 1, 3, 'T');
  u8x8.drawGlyph(u8x8.getCols() - 1, 4, 'A');
  u8x8.drawGlyph(u8x8.getCols() - 1, 5, 'T');
  updateScreen();

  // For security reasons, initially set to sleep mode
  reg = 2;
}

void loop(void) {
  static byte cnt = 0;

  cnt++;
  
  updateEncoder();

  if (cnt > CONTROL_SKIPS) {
    temp = rtd.temperature(R_NOMINAL, R_REF);
    updateControl();

    cnt = 0;
  }

  delay(1);
}

#ifdef USE_PID
void updateControl() {
  static unsigned long previousTime = 0;
  static unsigned long lastDelta = 0;
  
  if (reg == 2) {
    digitalWrite(PELT_MOSFET, LOW);
    digitalWrite(FP_MOSFET, LOW);
  } else {
    if (! previousTime) previousTime = millis();
    
    float delta = desTemp - temp;
    
    float res = KP * delta;   // PID output

    #if defined(KI) || defined(KD)   // #ifdef only tests a single macro name
    unsigned int elapsedTime = millis() - previousTime;
    #endif
    
    #ifdef KI
    res += KI * delta * elapsedTime;
    #endif
    
    #ifdef KD
    res += KD * (delta - lastDelta) / elapsedTime;
    #endif

    byte pwm = (int) res;
  
    lastDelta = delta;
    previousTime = millis();

    if (pwm < 16) {
      digitalWrite(FP_MOSFET, LOW);
      
      reg = 0;
    } else {
      analogWrite(FP_MOSFET, pwm);
      
      reg = 1;
    }
  }
}
#else
void updateControl() {
  if (reg == 2) {
    digitalWrite(PELT_MOSFET, LOW);
    digitalWrite(FP_MOSFET, LOW);
  } else {
    float delta = temp - desTemp;

    if (delta < 0) delta = -delta;

    if (delta > RANGE) {
      digitalWrite(PELT_MOSFET, HIGH);

      reg = 1;
    } else if (delta < RANGE - HIST_RANGE) {
      digitalWrite(PELT_MOSFET, LOW);

      reg = 0;
    }

    digitalWrite(FP_MOSFET, HIGH);
  }
}
#endif

void updateEncoder() {
  digitalRead(ENCODER_A) ? SET_BIT(encoder, 1) : CLR_BIT(encoder, 1);   // Set current state wheel
  
  if (GET_BIT(encoder, 1) && GET_BIT(encoder, 1) != GET_BIT(encoder, 0)) {
    if (digitalRead(ENCODER_B) != GET_BIT(encoder, 1)) {
      if (desTemp <= 25)
        desTemp++;

      CLR_BIT(encoder, 2);
    } else {
      if (desTemp > 0)
        desTemp--;
        
      SET_BIT(encoder, 2);
    }
  }

  digitalRead(ENCODER_P) ? SET_BIT(encoder, 4) : CLR_BIT(encoder, 4);   // Press button state

  if (GET_BIT(encoder, 4) && GET_BIT(~encoder, 3)) {    // Encoder button passed from 0 to 1, toggle sleep mode
    if (reg == 2)
      reg = 0;
    else
      reg = 2;
  }

  GET_BIT(encoder, 1) ? SET_BIT(encoder, 0) : CLR_BIT(encoder, 0);      // Last = current wheel
  GET_BIT(encoder, 4) ? SET_BIT(encoder, 3) : CLR_BIT(encoder, 3);      // Last = current button
}

void updateScreen() {
  static byte last = 0;
  
  u8x8.setFont(u8x8_font_profont29_2x3_r);
  if (desTemp < 10) {
    u8x8.drawGlyph(3, 3, '0');
    u8x8.setCursor(5, 3);
  } else {
    u8x8.setCursor(3, 3);
  }
  u8x8.print(desTemp);
  
  u8x8.setFont(u8x8_font_chroma48medium8_r);
  u8x8.setCursor(8, u8x8.getRows() - 1);
  u8x8.print(temp);
      
  if (last != reg) {
    switch (reg) {
      case 0:
        drawImg(okay, 4, 4, 11, 2);
        break;
        
      case 1:
        drawImg(regu, 4, 4, 11, 2);
        break;
      
      default:
        drawImg(moon, 4, 4, 11, 2);
        break;
    }
      
    last = reg;
  }
}

void drawImg(const byte bitmap[], byte tWidth, byte tHeight, byte sx, byte sy) {
  byte tmp[8];
  
  for (byte x = 0; x < tWidth; x++) {
    for (byte y = 0; y < tHeight; y++) {
      
      for (byte i = 0; i < 8; i++) tmp[i] = pgm_read_byte(&bitmap[8 * tHeight * x + tHeight * i + y]);
      u8x8.drawTile(sx + x, sy + y, 1, tmp);
      
    }
  }
}

I should note that all of the different bitmaps can be shown multiple times without issues before the crash, and the same goes for temperature variations. I can also change my desired temperature. So, as I said, everything works, but only for a period of time, which is why I suspect a memory leak more than an overflow.
But I would be glad to get opinions from other people.

So thank you very much in advance!

Hi, @jubiler
Welcome to the forum.
Thanks for using code tags. :+1:

Thanks for the clear schematic. :+1:
Although inclusion of the display and MAX module would have been good.

What is your temperature sensor: PT100, thermocouple...?

Do you have a DMM?
How much current is going through the LM7805?
Is it getting hot?
Is your 5V stable?

What is your 12V power supply?
Do you have back EMF diodes on your pumps?

How are you powering the Peltier?

Can you please post some images of your project?
So we can see your component layout.

Thanks... Tom.. :smiley: :+1: :coffee: :australia:

On an 8-bit processor?

Hello,

Thanks for your quick answer!
Here are some pictures:



Here is the MAX module: https://www.adafruit.com/product/3328. Thus I do use a PT100 sensor.
The OLED display is similar to this one, however it is a Chinese clone using an SH1106 chip (which is an equivalent/replacement to the Adafruit one, it does not change anything code-wise).

Yes, I do have a DMM, however the current reading is not very meaningful since I have another external thermometer hooked to the regulator (for testing purposes; it is autonomous apart from its PSU and not related to the Arduino or anything else in the project). But no, it does not get hot at all, and the supply voltage is perfectly constant and stable (it is a genuine regulator from Mouser, not a Chinese one). I also checked it with a scope.

My 12V PSU is an SMPS that actually outputs a steady 15.4V, so that it can also power the Peltier modules. Here it is: https://www.digikey.com/en/products/detail/dcomponents-corporation/AMESP600-15SMNZ/17619977

Yes, I do use a flyback diode. It's not on the schematic since I soldered it close to the pumps and not on my prototype PCB. I use 1N5819 Schottky diodes I had lying around.

Thanks again.


Posting the rest of the images in this answer (I'm a new member, so I can't put more than 3 in a message).


Regarding your question, I do believe integers are stored in 2 bytes. However, it is true that it should at least be declared as unsigned, thank you. Or I should change the prescaler.
That said, the timer interrupt does work and refreshes the screen. I will still try changing the prescaler soon though.

I see now from the pictures that it is a classic Nano (8-bit). Yes. Make it unsigned (uint16_t).
Also, in "preferences" in the IDE, switch on compiler warnings if this option is not already enabled.
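
Why the timer kept firing even though 49911 does not fit in a signed 16-bit int can be checked on a desktop compiler. This is only a minimal sketch: the fixed-width types stand in for the AVR's 16-bit int, and the constant names are hypothetical, not taken from the sketch above.

```cpp
#include <cstdint>

// Preload for a 4 Hz Timer1 overflow: 65536 - 16 MHz / 256 / 4 Hz.
constexpr uint32_t kClockHz    = 16000000UL;
constexpr uint32_t kPrescaler  = 256;
constexpr uint32_t kOverflowHz = 4;
constexpr uint16_t kPreload =
    static_cast<uint16_t>(65536UL - kClockHz / kPrescaler / kOverflowHz);

// What a signed 16-bit variable (the size of `int` on the ATmega328P) holds
// for the same value. Out-of-range conversion is implementation-defined
// before C++20; mainstream compilers wrap in two's complement.
constexpr int16_t kAsSignedInt = static_cast<int16_t>(kPreload);

// Assigning that signed value to a 16-bit register such as TCNT1 converts it
// back modulo 65536, so the register receives the intended bit pattern.
constexpr uint16_t kBackInRegister = static_cast<uint16_t>(kAsSignedInt);
```

Under these assumptions the signed variable holds a negative number, yet the register still ends up with 49911, which would be consistent with the interrupt working before the type was changed. Declaring the constant as uint16_t simply removes the reliance on that wrap-around.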

I will try that. Thank you
I also got confirmation from the datasheet that this specific register is indeed 16 bits:

The Timer/Counter (TCNT1), output compare registers (OCR1A/B), and input capture register (ICR1) are all 16-bit registers.

However, it's not the case for example with timer 0 (TCNT0 is 8 bits).

You can see the maximum values which the various Arduino data types can represent here: Data Types in Arduino - SparkFun Learn

I hadn't seen the C++ ellipsis operator used in this situation before so I tried to see what it would do.

I'm now guessing that you have edited the code with your own symbols to avoid including large amounts of data in your post.

Yes it's kind of what I did but I looked at the ATMega328P datasheet instead.

Yes I removed them since they were pretty big.

I did try changing timer to unsigned, however it did not fix anything.
I also tried removing all the instructions from my updateScreen function, except for the display of the current temperature. It did not fix the problem, although it seemed to take longer for the Arduino to freeze. That is a bit difficult to say for sure, since the time it takes to crash is fairly random, usually from 1 to 5 minutes.

Hi,

Do you have any chance to get the console output when the crash happens? If so, do you get anything on the console, like a crash message or some weird output?

I'm not very familiar with what you hooked up to your Arduino, but I can code. Looking through your code, the static byte cnt seems odd. You create a static variable on every loop (about every 1 ms), why? I think this variable never gets deleted since it's static, so maybe that is the problem? Like you fill your RAM with that variable until there is no RAM left?

Could you explain what that variable is for and what it should do? It seems like it's probably not doing what it should.
What I think would be correct for your counter is to declare and initialize it like your float temp, then just do cnt++ at the beginning of every loop and reset it with cnt = 0 in your if statement like you currently do.

If all of this doesn't help, I would put Serial.println() all over my program and see where exactly it crashes.

That is not how static variables work.
A static local variable is created only once and keeps its value between function calls. It is used here to avoid creating a global variable, since globals should be limited as much as possible.
Thus, cnt gets incremented on each loop call and no new variable is created, until it exceeds CONTROL_SKIPS. Then cnt is reset.
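
The behaviour described above can be reproduced off-target. A minimal sketch, assuming a desktop compiler; the function name tick() is hypothetical and stands in for the counting part of loop():

```cpp
#include <cstdint>

constexpr uint8_t CONTROL_SKIPS = 128;  // same value as in the sketch

// Returns true on the calls where the control update would run.
bool tick() {
  static uint8_t cnt = 0;  // initialised once, keeps its value between calls
  cnt++;
  if (cnt > CONTROL_SKIPS) {
    cnt = 0;               // reset, exactly one byte of RAM is ever used
    return true;
  }
  return false;
}
```

Calling tick() repeatedly returns false for the first 128 calls and true on call 129, after which the cycle restarts. The counter occupies a single byte for the whole lifetime of the program, so it cannot exhaust RAM.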

Is it more stable when you power the Nano through the USB cable? You'd have to disconnect the +5V from the external regulator side but retain a common ground.

I did try powering it only through USB, however it did not help.
I checked the +5V stability and I don't think the problem comes from there.
Thanks nonetheless.

When I first started, I was connecting a 5V board to a 9V LiPo. After trying 30-40 times, when I plugged in the battery the lights first flashed quickly, then the board went out within 5-10 seconds and I couldn't upload code anymore. I then realized that the battery was broken, and the board was broken too.
The weirdest part was that it worked for a while as the voltage increased.

My advice:
I had the same problem as you describe: after the code ran for a while, my board shut off as if it had never been powered. One possible reason is that you have several sensors connected. Run the code on your board without any sensors attached (if the code is already uploaded, you don't need to upload it again), then connect your sensors one by one, and the Arduino will show you where you exceed the limit. Once the cause is clear, replace the board and battery, but this time stay within the limit, because once you exceed it, even a new battery and board will end up broken. If your problem is not related to the sensors, replace the board and battery anyway, but use less power this time, because a board can also fail if it draws a lot of power at short intervals (for example if you plug and unplug it all the time). I mention both because I ran into this and solved it before.

You probably wouldn't see a power problem unless you have an oscilloscope.
Some things to try.

  • Replace both the Peltier and the pumps with LEDs and 1k series resistors and see if it still crashes. If not, that indicates a hardware solution is required. Maybe:
  • Increase the value of C2 several-fold.
  • Cleanly separate the +12V circuit from the +5V circuit. That is, power the solid state relay LED from 5V and use an opto-coupler in the gate circuit of Q1. Then there is no common ground either. Again, power the Nano via a USB cable.

I do have a scope like I said in the post I linked. There isn't really any noise or voltage spikes/transients.

I do not think it comes from the 12V circuit or from the inductance of the pumps/fans either, since the Arduino still crashes when the 12V circuit is shut down (so the pumps, fans and Peltier modules have no voltage across them at all) and it is powered solely via USB.

Could I debug live if I had an ST-Link or similar JTAG debugger?

From what you have described it is then a software issue. I'd then suspect this:
Move the call to updateScreen() into the loop.

Also update the schematic with the screen and other missing components.


It does appear to work (still testing).
However, I did want to use interrupts in order not to overload the MCU with useless refreshes, since refreshing the screen about once every half second is more than enough for my use case.
Why would it cause the Arduino to crash?
Is Timer 1 used for something else? If so, then why does it freeze only after quite some time?

Do I really have to resort to using solely the loop, possibly with another counter to skip some more calls, or is there something that needs fixing in my timer interrupt?

I will update the schematic when I am back home, my KiCad projects are on my desktop computer.

Thanks.

It was a guess, because you are restricted in what you can do safely in an ISR. It could be that the screen library also uses interrupts, and these are inhibited within an ISR.

It is standard to use a construct within the loop() like:

static uint32_t lastScreenRefreshAtMs = 0 ;
if ( millis() - lastScreenRefreshAtMs > 500 ) {  // 500ms
  lastScreenRefreshAtMs = millis() ;
  updateScreen();
}

If you still have problems, then post your latest code.
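
The elapsed-time test in that construct relies on unsigned arithmetic, which keeps working when millis() rolls over after roughly 49.7 days. A minimal sketch of that property, assuming a desktop compiler; the helper name intervalElapsed is hypothetical and not part of any Arduino API:

```cpp
#include <cstdint>

// True when more than `interval` ms have passed since `last`.
// The subtraction wraps modulo 2^32, so it stays correct across a rollover.
constexpr bool intervalElapsed(uint32_t now, uint32_t last, uint32_t interval) {
  return static_cast<uint32_t>(now - last) > interval;
}

// A timestamp taken 296 ms before the 32-bit counter wraps to 0.
constexpr uint32_t kJustBeforeWrap = 4294967000UL;
```

With last = kJustBeforeWrap, a reading of now = 100 after the wrap corresponds to 396 ms elapsed, and now = 300 corresponds to 596 ms, so only the second one passes a 500 ms threshold. Comparing timestamps directly (now > last + 500) would not have this property.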

I checked, and none of the libraries uses the timers.
I will try using the ISR solely to set a global flag that decides whether a refresh is triggered in the loop.
If it still does not work, I will go ahead and use the traditional loop method.
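
The flag approach can be sketched as follows. This is a desktop simulation under stated assumptions, not the final sketch: onTimerOverflow() stands in for the body of ISR(TIMER1_OVF_vect), loopOnce() for one pass of loop(), and screenRefreshes for the call to updateScreen(); all three names are hypothetical. One plausible reason the original version froze, not confirmed in this thread, is that the OLED and the MAX31865 both use the hardware SPI bus, so drawing from the ISR could interrupt a temperature read in progress.

```cpp
#include <cstdint>

volatile bool refreshPending = false;  // written in the ISR, read in loop()
uint16_t screenRefreshes = 0;          // stand-in for calls to updateScreen()

// Body of the timer ISR on the Nano: reload TCNT1 there, then only set a flag.
// No SPI traffic and no drawing happen in interrupt context.
void onTimerOverflow() {
  refreshPending = true;
}

// One pass of loop(): the slow screen work runs here, outside the ISR.
void loopOnce() {
  if (refreshPending) {
    refreshPending = false;
    screenRefreshes++;  // updateScreen() would be called here
  }
}
```

If two overflows occur before loop() gets around to checking, the screen is still refreshed only once, which is harmless for a display. The flag is declared volatile so the compiler rereads it on every pass instead of caching it.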

Thanks.