Program loop Crashes randomly

Hiya!

So my code crashes randomly, each time at different parts in the loop.
It can take about an hour to crash, more often it takes a few seconds or minutes to crash.

After some googling, I figured it might have something to do with dynamic memory allocation. So I brought the global variables from 15% of dynamic memory, to 9% using F().

But it still crashes.
I´ve activated WDT, but even if it works fine, that’s a crappy solution.

So then, I´ve turned off my subloops one by one, but

I´d be quite happy if someone would point me to some resources that might help me figure out what´s going on. :o

PS. code exceeds 9000 characters that I´m allowed to post, so I’m including the code in attachments

a_setup.ino (6.27 KB)

b_loop.ino (458 Bytes)

d_cloudTimeSet.ino (1.04 KB)

e_temperatureSensors.ino (2.57 KB)

f_circulationFlow.ino (2.38 KB)

g_solenoidControl.ino (3.81 KB)

h_heatingCoilControl.ino (3.29 KB)

i_calcs.ino (2.7 KB)

j_cloudWrite.ino (3.66 KB)

SolarHeatingControl_Mega256_v0.4.ino (1.04 KB)

Is which Arduino board you are using a secret?

Paul

So you think someone is going to download 10 different parts of a sketch? How about you combine them all together and have 1 attachment? It will make more people willing to take a look at your code.

Paul_KD7HB:
Is which Arduino board you are using a secret?

Paul

Hi! Thanks for taking the time to reply.

Arduino Mega2560

blh64:
So you think someone is going to download 10 different parts of a sketch? How about you combine them all together and have 1 attachment? It will make more people willing to take a look at your code.

Well I suppose not even though each one of those is a tab which greatly enhances readability in arduino IDE.
Please find attached a combined version :slight_smile:

SolarHeatingControl_Mega256_v0.4_combined.ino (27.2 KB)

Well, there are three constants you are declaring as integers, but assigning values that exceed 32767.

david_2018:
Well, there are three constants you are declaring as integers, but assigning values that exceed 32767.

I see what you mean.
I´ll have to declare those long

const int minDetectionDuration = 40e3; //minimum duration of overheat temperature detected, for overheatStatus to trigger.
const int overheatMinProcessDuration = 20 * 10 * pow(3, 10); //minimum duration of a triggered overheatStatus
const int minHeatCycleDuration = 2 * 60 * 10e3; //minimum duration of coil heating. to keep contactors from vibrating.

You are also detaching and re-attaching an interrupt, when the only variable used from the interrupt is a byte. Normally you would just disable/enable interrupts, but that really isn't necessary with a byte value, since a single-byte operation can't be interrupted.

Generally with random-time crashes the first thing to look at is Strings. Wouldn't think that would be a big problem with over 7k of memory available, but I have not looked at the sketch enough to see how much memory the functions need, or if any of the libraries are using a lot.

david_2018:
You are also detaching and re-attaching an interrupt, when the only variable used from the interrupt is a byte. Normally you would just disable/enable interrupts, but that really isn't necessary with a byte value, since a single-byte operation can't be interrupted.

Generally with random-time crashes the first thing to look at is Strings. Wouldn't think that would be a big problem with over 7k of memory available, but I have not looked at the sketch enough to see how much memory the functions need, or if any of the libraries are using a lot.

The interrupts are part of a code for a hall sensor. I´m sure I can do without interrupts in there, then.

The String part is what I was thinking as well.
I´ve already used F() to keep the static chars away from memory.
I suppose I could move a bunch of globals to local functions.

The real problem may lie in the fact that I´m using a string to communicate over serial with a nodemcuESP, perhaps?
I got that from this guy: Serial Input Basics - Programming Questions - Arduino Forum

These crashes are often from overwriting an array bound, but I don’t see any arrays in your code. As david_2018 says, these Strings are likely the problem. Concatenation of cloudPackage is not required as it looks like you are sending with a start and end marker.

String sensorT1Val = String(sensorT1Value);
    String sensorT2Val = String(sensorT2Value);
    String sensorT3Val = String(sensorT3Value);
    String normalHeatingStat = String(normalHeatingStatus);
    String emergencyHeatingStat = String(emergencyHeatingStatus);
    String overheatStat = String(overheatStatus);
    String maintenanceStat = String(maintenanceStatus);
    String solarHeat = String(solarHeating);
    String cloudPackage = "<" + sensorT1Val + "," + sensorT2Val + "," + sensorT3Val + "," + normalHeatingStat + "," + emergencyHeatingStat + "," + overheatStat + "," + maintenanceStat + "," + solarHeat + ">";
    Serial.println(cloudPackage);
    Serial1.println(cloudPackage);
    Serial.println();

There are a few variables that you declared as global variables, then declare again inside of a function. Declaring the variable inside a function creates a new variable, that is only accessible inside the function, and ceases to exist when you return from the function.

//partial list of global variables//
int moneySpentDayTotal; //give the thing a name!
int electricityPrice; //in 4 decimals please The cost of electricity in R4/kWh should be read form the google sheet, from the value of a particular cell.
int overheatStatusOn; //give the thing a name!
int maintenanceStatusOn;  //give the thing a name!
int solenoidStatusON; //give the thing a name!
byte dumpStatusCalc;  //if last cloud dump failed, this is 0.
float totalSessionFlow; //in liters
int waterSpeed; //in liters/minute
int deltaT; //median difference in temperature between T1 and T2.
float solarKWH; //solar energy aquired in kWh
void calcs()
{
  normalHeatingCumulative();  //minutes that normalHeatingStatus has been on, since the last cloudWriteInterval
  emergencyHeatingStatusON; //minutes that emergencyHeatingStatus has been on, since the last cloudWriteInterval
  float heatingStatusON = normalHeatingStatusON + emergencyHeatingStatusON; //total minutes that HeatingCoils were on, since the last cloudWriteInterval
  electricKWH = (heatingStatusON / 60) * (6000 / 1000); //Energy use, Day total, in Kilo Watt Hour
  int moneySpentDayTotal = electricKWH * electricityPrice; //Money spent, Day total, on heating coil electricity, in R$. The cost of electricity in R4/kWh should be read form the google sheet, from the value of a particular cell.
  int overheatStatusOn; //minutes that overheatStatus was on, since the last cloudWriteInterval
  int maintenanceStatusOn; //minutes that maintenanceStatus was on, since the last cloudWriteInterval
  int solenoidStatusON = overheatStatusOn + maintenanceStatusOn; //total minutes that any of the Solenoids were powered , since the last cloudWriteInterval
  int solarKWH; //when timeStopCoil, store measurement sensorT1Value. when timeStartCoilHour, store measurement sensorT1Value. Subtract 1 from other.
  circulationFlow();
  circulationDT();
  energyInflux();

}

Thanks for looking into it cattledog.

By avoiding concatenation, you mean there is a better way to send data over serial to the nodeMCU, right?

I did some searches and came up with this, as a substitution for the lines you mentioned:

    Serial1.print("<");
    Serial1.print(sensorT1Value);
    Serial1.print(",");
    Serial1.print(sensorT2Value);
    Serial1.print(",");
    Serial1.print(sensorT3Value);
    Serial1.print(",");
    Serial1.print(normalHeatingStatus);
    Serial1.print(",");
    Serial1.print(emergencyHeatingStatus);
    Serial1.print(",");
    Serial1.print(overheatStatus);
    Serial1.print(",");
    Serial1.print(maintenanceStatus);
    Serial1.print(",");
    Serial1.print(solarHeating);
    Serial1.print(">");
    Serial.println();

As far as I can see the outcome is the same, but lots less complexity, and lower memory use ditching all those Strings. Thanks!

Oh my you guys are great!

david_2018:
There are a few variables that you declared as global variables, then declare again inside of a function. Declaring the variable inside a function creates a new variable, that is only accessible inside the function, and ceases to exist when you return from the function.

You've solved another mystery situation in this cobbled-together-raft I built of code.

As far as I can see the outcome is the same, but lots less complexity, and lower memory use ditching all those Strings. Thanks!

Has it solved the random crashes?

nope, it has not solved the crashes unfortunately.

It now crashes within 20 minutes each time.
I’m attaching an updated, combined version of the program.
It has the edits you guys suggested so far.

SolarHeatingControl_Mega256_v0.5_combined.ino (23.1 KB)

Are the "crashes" still random? What do your serial prints tell you about where it hangs? If you disable the watchdog, and print when you enter different functions you should be able to tell were things stop.

EDIT: If you disconnect the solenoids and heaters, do you still get the crashes?

A few things that show up if you turn on all the compiler warnings:

Line 14 - not an error, but [12] is needlessly long, 5 is sufficient to allow for terminating null and á which is not standard ascii

char daysOfTheWeek[7][12] = {"Dom", "Seg", "Ter", "Qua", "Qui", "Sex", "Sáb"};

Line 129: collectorsEmpty is already declared as a global variable of type boolean, you are declaring again in setup()
also, in the rest of the code you set its value, and do comparisons, using 0 and 1 instead of true/false.
Will work, but confusing.

  byte collectorsEmpty = 0;

Line 163: calculation is done as integer, exceeds maximum value of integer.
Change the 24 to either (long)24 or 24l to force equation to use a long.

 if (chronoCloudTimeSet.elapsed() >= (24 * 60 * 60 * pow(10, 3)))

Line 202: sensorTInterval should be unsigned int to match the type of the number it is being compared to later
not really a problem, I just don’t like seeing the compiler warning

int sensorTInterval = 5e3;

Line 446: Statement does nothing, comment it out

 emergencyHeatingStatusON; //minutes that emergencyHeatingStatus has been on, since the last cloudWriteInterval

Line 450, 451, 453 - same as above

 overheatStatusOn; //minutes that overheatStatus was on, since the last cloudWriteInterval
 maintenanceStatusOn; //minutes that maintenanceStatus was on, since the last cloudWriteInterval
 solarKWH; //when timeStopCoil, store measurement sensorT1Value. when timeStartCoilHour, store measurement sensorT1Value. Subtract 1 from other.

Starting at Line 494
Last line, you have a global variable solarKWH, by declaring it again here you are throwing away the result.
sHWater and heatAdded should be declared as float, using byte will only save the whole number, the fractional part gets dropped.

//calculated, archived, and reset every 5 minutes?
void energyInflux()
{
  //Specific Heat is the amount of energy required per unit mass to raise it one degree Celsius.
  //The specific heat of water is known. It is 1 calorie/gram °C = 4.186 joule/gram °C
  byte sHWater = 4.186;
  //Q(heatAdded)=c(specific heat)*m(mass)*deltaT(temperature difference)
  byte heatAdded = (sHWater * (totalSessionFlow / 1000) * deltaT);  //
  byte solarKWH = (heatAdded / (3.6 * pow(10, 6)));
}

Something else I notice, on line 46, do you really want 20 * 10 * 3^10 (three to the 10th power, or 59049)? Also, pow() returns a floating point number, so it may get rounded off slightly in the conversion to integer.

const long overheatMinProcessDuration = 20 * 10 * pow(3, 10); //minimum duration of a triggered overheatStatus

Do you have a wiring diagram? I'm a bit confused over the comments about multiplexing the analog inputs for the temperature sensors, from your code it doesn't really appear that you are trying to read more than one sensor per analog input.

cattledog:
EDIT: If you disconnect the solenoids and heaters, do you still get the crashes?

Could very well be the relays causing problems, I had not noticed there were relays yesterday, was concentrating on all the warnings the compiler was throwing out.