PID Library Output goes NaN randomly

I've been trying to use the popular PID library for a simple device to heat a piece of copper. I'm using a mechanical relay for switching, so I started from the library's example code for relay output. The only changes I've made are initializing an Adafruit thermocouple breakout, setting a fixed setpoint of 37, and adding a Serial print to monitor the current input and output values.

Initially, things seemed to be working fine, and I've just been charting the temperature and gradually increasing Kp while leaving Ki and Kd at 0, which I think is fairly standard practice for manually tuning a new PID system. Eventually though, as Kp got up around 100, the Output started going NaN and the relay would just stay on, so the temperature skyrocketed well past the setpoint. The Input print would also show NaN occasionally, but the Output goes NaN more frequently.

The regular behavior of the PID when the NaN error isn't occurring also seems a bit off. Even with a Kp as high as 80 the controller won't quite reach the 37C setpoint, and it takes 5 minutes or more to go from room temp to 30C. At no point does the "on" portion of the 5 second window exceed 800 - 900 ms; it is at its highest on launch and immediately begins tapering down.

I'm not posting a schematic here because I know the wiring is fine - I'm comfortable with that side of things, I just don't do a lot of programming and I'm not sure how to go about debugging further.

#include <SPI.h>
#include "Adafruit_MAX31855.h"
#include <PID_v1.h>

#define PIN_INPUT 0
#define RELAY_PIN 13

#define MAXDO   3
#define MAXCS   4
#define MAXCLK  5

Adafruit_MAX31855 thermocouple(MAXCLK, MAXCS, MAXDO);

//Define Variables we'll be connecting to
double Setpoint, Input, Output;

//Specify the links and initial tuning parameters
double Kp=30, Ki=0, Kd=0;
PID myPID(&Input, &Output, &Setpoint, Kp, Ki, Kd, DIRECT);

int WindowSize = 5000;
unsigned long windowStartTime;

void setup()
{
  windowStartTime = millis();

  //initialize the variables we're linked to
  Setpoint = 37;

  //tell the PID to range between 0 and the full window size
  myPID.SetOutputLimits(0, WindowSize);

  //turn the PID on
  myPID.SetMode(AUTOMATIC);

  pinMode(RELAY_PIN, OUTPUT);

  Serial.begin(9600);
}

void loop()
{
  Input = thermocouple.readCelsius();
  myPID.Compute();


  if (millis() - windowStartTime > WindowSize)
  { //time to shift the Relay Window
    windowStartTime += WindowSize;
    Serial.print("Out: ");
    Serial.println(Output); 
    Serial.print("In: ");
    Serial.println(Input);
  }
  if (Output < millis() - windowStartTime) digitalWrite(RELAY_PIN, LOW);
  else digitalWrite(RELAY_PIN, HIGH);

}

An Example of Output/Input values with Kp 30 and setpoint 37 from the print console (prints every 5s):

Out: 187.50
In: 30.75
Out: 187.50
In: 30.75
Out: 187.50
In: 30.75
Out: 187.50
In: 30.75
Out: 187.50
In: 30.75
Out: 187.50
In: 30.50
Out: 195.00
In: 30.50
Out: 195.00
In: 30.50
Out: 195.00
In: 30.50
Out: 195.00
In: 30.50
Out: 195.00
In: 30.50
Out: 202.50
In: 30.25
Out: 195.00
In: 30.50
Out: 202.50
In: 30.25
Out: 202.50
In: 30.25
Out: 202.50
In: 30.25
Out: 202.50
In: 30.25
Out: 202.50
In: 30.25
Out: 210.00
In: 30.00
Out: 210.00
In: 30.00
Out: 210.00
In: 30.00
Out: 210.00
In: 30.00
Out: 210.00
In: 30.00
Out: 210.00
In: 30.00
Out: 210.00
In: 30.00
Out: 217.50
In: 29.75
Out: 217.50
In: 29.75
Out: 217.50
In: 29.75
Out: 217.50
In: 29.75

As you can see the performance is pretty flaccid with a plenty powerful heater and what I would think is a relatively high Kp. Doubling the Kp, when things aren't NaN'ing out, only improves this modestly.

I would suspect that the problem is the input value.

The PID library shouldn't be dividing by any variable that could be zero (I haven't checked this).

The thermocouple library does return NAN when there is a problem.

Try checking and printing the thermocouple measurement before updating the PID controller.

Pieter
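
To illustrate why a single NaN reading could produce the symptoms described (Output stuck at NaN, relay held on), here is a minimal sketch. It is not the PID_v1 source: clampedP, integrate and relayOn are hypothetical stand-ins, and the running integral sum is an assumption about how a typical PID implementation stores state. Only relayOn mirrors a line from the posted sketch.

```cpp
#include <math.h>

// Hypothetical P-only step with ordinary clamping (not the library's code).
double clampedP(double setpoint, double input, double kp,
                double outMin, double outMax) {
  double output = kp * (setpoint - input);  // NaN input gives NaN output
  if (output > outMax) output = outMax;     // comparison is false for NaN
  if (output < outMin) output = outMin;     // comparison is false for NaN
  return output;                            // so NaN passes through the clamp
}

// Assumed running integral sum: even with ki = 0, 0 * NaN is NaN,
// so one bad reading would poison the sum from then on.
double integrate(double sum, double ki, double error) {
  return sum + ki * error;
}

// Same test as the relay line in the posted loop(): the relay is switched
// off only when output < elapsedMs, which is never true for NaN.
bool relayOn(double output, unsigned long elapsedMs) {
  return !(output < elapsedMs);
}
```

If the library does keep a sum like this, it would also explain why Output shows NaN more often than Input: the input recovers on the next good reading, but the stored state does not.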

Thanks for the input - I will try that out. If by chance I'm getting random NaNs from the thermocouple, perhaps from polling it too frequently, what is the preferred way to check for NaNs and prevent them from being fed into the PID function?

Would something like this be the preferred way to do it:

double check = thermocouple.readCelsius();

if (isnan(check)) // isnan() is declared in math.h
{
  // some response to the error
}
else
{
  Input = check;
}

Also - is it suspicious to you that the output starts so low and begins dropping almost immediately? In other applications a Kp of 80 would be pretty aggressive. Here I see heating for at most 800 ms of the 5000 ms window, and this pretty quickly drops to half that and never even reaches setpoint.

Yes, that is a good way to check for NaN. However, you could also check for a valid range: if (t >= -20.0 && t <= 100.0). Any ordered comparison involving NaN (<, <=, >, >=) evaluates to false, as does ==, so a NaN reading fails the range test automatically.
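
A minimal sketch combining both checks, in case it helps. isValidReading and filterReading are hypothetical helper names, and the -20 to 100 limits are just the example range above, not values required by either library.

```cpp
#include <math.h>

// Hypothetical helper: true only for a non-NaN reading inside [lo, hi].
bool isValidReading(double t, double lo, double hi) {
  if (isnan(t)) return false;    // explicit, although the range test
  return t >= lo && t <= hi;     // below would also reject NaN
}

// Returns the new reading if it is valid, otherwise the last good value,
// so a bad sample never reaches the controller.
double filterReading(double reading, double lastGood, double lo, double hi) {
  return isValidReading(reading, lo, hi) ? reading : lastGood;
}
```

In the posted sketch this could be used as Input = filterReading(thermocouple.readCelsius(), Input, -20.0, 100.0); just before myPID.Compute(). Holding the last good value is only one possible response; switching the relay off after several bad readings in a row would be a safer choice for a heater.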

It all depends on your system's dynamics and the error you have. With a binary output, it's going to be pretty hard to tune.
A simple controller with hysteresis might be easier and more appropriate than a PID controller.
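
For reference, an on/off controller with hysteresis can be as small as the sketch below. hysteresisStep and the halfBand parameter are hypothetical names for illustration, and turning the heater off on a NaN reading is an assumed fail-safe choice rather than anything discussed above.

```cpp
#include <math.h>

// Hypothetical on/off controller with a dead band around the setpoint.
// Returns the new heater state given the current temperature and state.
bool hysteresisStep(double temp, double setpoint, double halfBand,
                    bool heaterOn) {
  if (isnan(temp)) return false;                   // assumed fail-safe: off
  if (temp <= setpoint - halfBand) return true;    // too cold: turn on
  if (temp >= setpoint + halfBand) return false;   // too hot: turn off
  return heaterOn;                                 // inside the band: no change
}
```

With a setpoint of 37 and a halfBand of 0.5, the heater would switch on at or below 36.5 and off at or above 37.5, and hold its state in between, which keeps a mechanical relay from chattering.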

I'm not sure how to go about debugging further.

The best approach is to liberally sprinkle the code with Serial.print statements, to see the values of ALL the variables, intermediate calculations and results during operation, not just at timed intervals.

is it suspicious to you that the output starts so low and almost immediately starts dropping even?

Yes, which is why you put in Serial.print statements.

takes 5 minutes or more to go from room temp to 30C

What happens if the heater is fully on? You have left out all the important circuit details, required to understand if you have even constructed the system correctly.

How are you isolating the extremely sensitive electrical thermocouple connection from the heater and the copper?

At least the Output of the PID looks correct. For example:
Setpoint = 37.0;
Input = 30.0;
error = 7.0;
Kp = 30.0;
Output = 210.0 . (= 30.0 * 7.0)
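
The same arithmetic as a small sketch, which also suggests why the on-time never got past 800 - 900 ms at Kp = 80. This is a simplified proportional-only model (Ki = Kd = 0), not the library's code; pOnlyOutput and dutyFraction are hypothetical names, and the 27C starting temperature in the comment is an assumed room temperature.

```cpp
// Simplified proportional-only model, clamped to the output limits.
double pOnlyOutput(double setpoint, double input, double kp,
                   double outMin, double outMax) {
  double output = kp * (setpoint - input);
  if (output > outMax) return outMax;
  if (output < outMin) return outMin;
  return output;
}

// Fraction of the relay window that the heater is on.
double dutyFraction(double outputMs, double windowMs) {
  return outputMs / windowMs;
}

// pOnlyOutput(37.0, 30.0, 30.0, 0.0, 5000.0)  -> 210.0 (matches the log)
// pOnlyOutput(37.0, 27.0, 80.0, 0.0, 5000.0)  -> 800.0 (assumed 27C start)
// pOnlyOutput(37.0, 27.0, 500.0, 0.0, 5000.0) -> 5000.0 (clamped, full on)
```

Under this model the output is largest at startup, when the error is largest, and shrinks as the plate warms. With no integral term it settles below the setpoint, at whatever error makes Kp times the error match the heat loss.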

jremington:
The best approach is to liberally sprinkle the code with Serial.print statements, to see the values of ALL the variables, intermediate calculations and results during operation, not just at timed intervals.
Yes, which is why you put in Serial.print statements.
What happens if the heater is fully on? You have left out all the important circuit details, required to understand if you have even constructed the system correctly.

How are you isolating the extremely sensitive electrical thermocouple connection from the heater and the copper?

I suppose I could print each loop or each time Input is set to a new value, but since most of the work is happening behind the scenes in the library, and presumably, considering its popularity, the problem isn't within the library, I'm not sure where exactly I should be sprinkling in prints. If the only terms changing are input and output it seems all I could do is print those two more frequently?

When the relay is on continuously the plate being warmed can reach the setpoint in a little over a minute, and can double the setpoint of 37C in ~3 or 4 minutes.

The thermocouple and amp have had decoupling caps added, and receive clean power from a separate power supply from the heater's. The heaters are 28V DC Kapton thin-film heaters from Omega, so not terribly powerful or noisy relative to other more serious elements. The thermocouple is a fine-wire thermocouple attached to, and insulated from, the copper substrate with Kapton tape (the bead does not make direct contact with the copper, as I am aware that would screw up the reading). Again, I've worked with all the elements of this system quite a bit except the library, as I generally just use off-the-shelf PIDs from Omega and the like. I don't see any indication of a problem with the circuit: the relay switches reliably, the heater heats, the power supplies put out clean regulated power, and the thermocouple provides stable enough readings that do not fluctuate when power is applied to or removed from the heater.

johnwasser:
At least the Output of the PID looks correct. For example:
Setpoint = 37.0;
Input = 30.0;
error = 7.0;
Kp = 30.0;
Output = 210.0 . (= 30.0 * 7.0)

Everything was behaving more or less as expected until I started getting up in the 90-100 range for Kp, which kicked off all the NaN stuff. I thought maybe there was an unwritten upper limit to Kp, like maybe some calculation behind the scenes overflows with a value that large?

You have the source for the library, so put print statements "behind the scenes" and find out.

Putting in a check to prevent NaN input values seems to have cleared up the issue now, and it would also seem my Kp really does need to be that high. I'm up around 500 now and finally in the ballpark.

@jremington: Should it become necessary later, and because it sounds like a useful tidbit, how would one go about adding new print statements to the library? Is it just a matter of opening a file in the library and inserting print statements (I've not ever dug into a library, just added them with the "manage libraries" feature in the IDE and ran...)? Will they print straight to the console without any call or extra code from the main loop?

cosmicbackground:
Is it just a matter of opening a file in the library and inserting print statements (I've not ever dug into a library, just added them with the "manage libraries" feature in the IDE and ran...)? Will they print straight to the console without any call or extra code from the main loop?

There's nothing magical about libraries. You can change them just as you please and as long as there's a Serial.begin somewhere in your code, you can use Serial.print to debug.

It's usually worth taking a copy of the library so you can restore the pristine version once you're done debugging.

The PID_v1 library is very simple. Don't hesitate to take a look at the code and insert Serial.prints wherever you like.

In PID_v1.cpp, all the calculations are done as floats (same as double on AVR Arduinos), so I don't expect that the value of Kp is causing the problem.

Excellent - thanks all for your input! This has all been very helpful. Glad to know that digging in and modifying libraries won't require any special, additional compilers or anything like that.