Comparing two bytes - weird behaviour

Hello,

I have spent the last two days trying to resolve the following issue (the solution is probably trivial, but I'm stuck, and googling it brought up zilch):

I have a program reading data in packets of bytes from the serial port. The last byte in the packet is a simple XOR checksum.

      byte checksum = input[1] ^ input[2] ^ input[3];

      if (checksum == input[4])
      {
              /* .... */
      }

Now, the above comparison seems to fail occasionally. Not every time, just about 20% of the time. So, I have put some code into the "else" block to print out both the checksum and the input[4] to an LCD, and they look identical in decimal.

So, I have replaced the comparison with:

      if ((checksum & input[4]) == checksum)

and this seems, bizarrely, to work every time.

WHY? Any ideas welcome. Please save my sanity.

Arut

aruta:
and this seems, bizarrely, to work every time

You have an interesting definition for the word “work”. The “fix” will most certainly not detect checksum failures.

  1. Getting help on the forum
    Post your complete sketch (program code)! If you don’t you waste time while people ask you to do that.

A wild guess - you are referencing the array starting at index 1 but C/Arduino starts array indices at zero. If you declared the array to be "byte input[4];" then the element input[4] isn't valid. If that is the case, store the data in input[0], input[1], input[2] and [3] and compare input[0]^input[1]^input[2] with input[3].

Pete

aruta: WHY?

There is something wrong with your code - probably a part that you didn't show us.

aruta: Now, the above comparison seems to fail occasionally. Not every time, just about 20% of the time.

Probably 20% of the time the first byte (index 0) is zero, thus not affecting the checksum.

cheksum & input[4] can only equal checksum if input[4] is all 1's.

cheksum & input[4] can only equal checksum if input[4] is all 1's.

Not true. For example checksum = 3 and input[4] = 7. checksum & input[4] is then 3 (==checksum). As long as all the bits that are set in checksum are also set in input[4] the test will succeed.

Pete

PeterH:
There is something wrong with your code - probably a part that you didn’t show us.

Ok, granted, it wasn’t the clearest of questions. Also, I did realize today that the “alternative” code I wrote was actually wrong (or at least not completely equivalent, as it wouldn’t catch all checksum errors).

To clarify:

  • array is defined as input[5]
  • input[0] is a fixed value marker byte, and is not included in the checksum. Its value is checked separately, as a first condition, before checksumming [1], [2] and [3]. Byte input[4] contains the checksum

I have played with this again and it does look like it fails only part of the time, almost like arduino is not comparing bytes, but upscaling them to words or something, and then it depends if the upper byte of the word happens to be equal or not. I have again tested this by sending both bytes (checksum and input[4]) to the lcd AND back through the Serial, as well as output from the following test code:

  if (checksum==input[4])
  {
    test = 0;
  }
  else
  {
    test = 255;
  }

.

In all cases I get two identical values back, plus the test value equal to 255 part of the time. It works the same way on both Uno and Mega.

I’m not using interrupts explicitly. As libraries, I’m using OneWire, DallasTemperature, PID_v1 and LiquidCrystal. Is there any possibility something may be influencing the comparison?

Please help!

array is defined as input[5]

That won't compile, so what's the problem with posting code?

AWOL:

array is defined as input[5]

That won’t compile, so what’s the problem with posting code?

No problem, just wanted to extract the relevant bits, plus it’s longer than allowed max post length. Clearly wasn’t enough. If you’d be so kind and have a look at the below…

#include <OneWire.h>
#include <DallasTemperature.h>
#include <PID_v1.h>
#include <LiquidCrystal.h>

//Definitions
#define FAN 11           // Output pin for fan
#define ONE_WIRE_BUS 9  // Temperature Input is on Pin 9
#define LCD_BL 3

#define MODE_STOP 0
#define MODE_RUN 1
#define MODE_MAX 2
#define MODE_CRITICAL 255

#define COL2 10

#define MAXCPUUPDATEINT 10000
#define SCREENUPDATEINT 2000

volatile unsigned int encoder0Pos = 0;  //Encoder value for ISR

LiquidCrystal lcd(2, 8, 4, 5, 6, 7);  //set up LCD

//Setup Temperature Sensor
OneWire oneWire(ONE_WIRE_BUS);
DallasTemperature sensors(&oneWire);

double Setpoint = 50;

double fanRad, fanCase;

double tempCase = 28;
double tempExhaust = 30;
double tempCPU = 90;

//Setup variables
double minRadFan = 70; //%
double maxRadFan = 100; //%
double minCaseFan = 70; //%
double maxCaseFan = 100; //%
double criticalCpuTemp = 80; //degC

unsigned long lastTempRequest = 0;
int  delayInMillis = 0;
DeviceAddress tempDeviceAddress;

unsigned long lastScreenUpdate = 0;
unsigned long lastCpuUpdate = 0;

const double aggKp=4, aggKi=0.2, aggKd=1;                                      //original: aggKp=4, aggKi=0.2, aggKd=1, Aggressive Turning,50,20,20
const double consKp=2, consKi=0.05, consKd=0.25;                                    //original consKp=1, consKi=0.05, consKd=0.25, Conservative Turning,20,10,10
double setKp, setKi, setKd;

PID myPID(&tempCPU, &fanRad, &Setpoint, consKp, consKi, consKd, REVERSE);  //Initialize PID

char input[5];

byte mode = MODE_RUN;
byte lastMode = MODE_RUN;

//DEBUGGING
int errors1 = 0;
int errors2 = 0;
int errors3 = 0;
byte lastChecksum, lastResult, test;
byte numBytes;

void setup()
{  
  // start serial port for temperature readings
  Serial.begin(9600);
  Serial.println("Start");

  //Temperature Setup
  sensors.begin();                    //Start Library
  //  sensors.getAddress(tempDeviceAddress, 0);
  //  sensors.setResolution(tempDeviceAddress, resolution);
  sensors.setWaitForConversion(false);
  sensors.requestTemperatures();

  delayInMillis = 750 / (1 << (12 - 9)); 
  lastTempRequest = millis();

  //PID Setup
  myPID.SetMode(AUTOMATIC);
  myPID.SetOutputLimits((int)(minRadFan * 255.0 / 100.0),255);

  TCCR1B = TCCR1B & 0b11111000 | 0x01;  //adjust the PWM Frequency

  //Setup Pins
  pinMode(FAN, OUTPUT);                   // Output for fan speed, 0 to 255

  setKp = aggKp;
  setKi = aggKi;
  setKd = aggKd;

  pinMode(LCD_BL, OUTPUT);
  digitalWrite(LCD_BL, HIGH);

  //Setup LCD 16x4 and display startup message
  lcd.begin(20, 4);
  lcd.noAutoscroll();
  lcd.print("Fan Controller");
  lcd.setCursor(0,1);
  lcd.print("Starting Up");
  delay(1000);
  lcd.clear();
}  

void loop()
{ 
  numBytes = Serial.available();

  ProcessSerial();

  //Get temperature
  if (millis() - lastTempRequest >= delayInMillis) // waited long enough??
  {
    tempCase=sensors.getTempCByIndex(0);
    sensors.requestTemperatures(); 
    lastTempRequest = millis(); 
  }
  tempExhaust=Thermistor(analogRead(0));       // read ADC and  convert it to Celsius  

  if (millis() - lastCpuUpdate >= MAXCPUUPDATEINT) // waited long enough??
  {
    tempCPU = 99;
  }

  if (millis() - lastScreenUpdate >= SCREENUPDATEINT) // waited long enough??
  {
    //print out info to LCD
    /* code removed for brevity */

    lastScreenUpdate = millis();
  }
  //Compute PID value
  double gap = abs(Setpoint-tempCPU); //distance away from setpoint
  if(gap<5)
  {  
    //Close to Setpoint, be conservative
    myPID.SetTunings(consKp, consKi, consKd);
  }
  else
  {
    //Far from Setpoint, be aggresive
    myPID.SetTunings(setKp, setKi, setKd);
  } 
  myPID.Compute();

  //analogWrite(FAN,255);

  //Write PID output to fan if not critical
  if (tempCPU < criticalCpuTemp)
  {
    mode = lastMode;

    switch(mode)
    {
    case MODE_STOP:
      analogWrite(FAN,0);
      break;
    case MODE_RUN:
      analogWrite(FAN,fanRad);
      break;
    case MODE_MAX:
      analogWrite(FAN,255);
      break;

      //    analogWrite(FAN,Setpoint*2.55);
    }
  }
  else
  {
    lastMode = mode;
    mode = MODE_CRITICAL;
    analogWrite(FAN,255);
  }

}

float Thermistor(int RawADC) {
  const float pad = 10040;                       // balance/pad resistor value, set this to
  // the measured resistance of your pad resistor
  const float thermr = 10000;                   // thermistor nominal resistance

  long Resistance;  
  float Temp;  // Dual-Purpose variable to save space.

  Resistance=((1024 * pad / RawADC) - pad); 
  Temp = log(Resistance); // Saving the Log(resistance) so not to calculate  it 4 times later
  Temp = 1 / (0.001129148 + (0.000234125 * Temp) + (0.0000000876741 * Temp * Temp * Temp));
  Temp = Temp - 273.15;  // Convert Kelvin to Celsius                      

  // Uncomment this line for the function to return Fahrenheit instead.
  //temp = (Temp * 9.0)/ 5.0 + 32.0;                  // Convert to Fahrenheit
  return Temp;                                      // Return the Temperature
}

void ProcessSerial()
{
  int count;

  if (Serial.available() < 5)
  {
    return;
  }

  //Read buffer
  count = Serial.readBytes(input, 5);

  if (count == 5)
  {
    //Check for start of Message
    if(input[0] == 0x47)
    {
      //verify checksum
      byte checksum = input[1] ^ input[2] ^ input[3];

      if ((checksum && 0xFF) == (input[4] & 0xFF))
      {
        //Detect Command type
        switch (input[1]) 
        {
        case 0:
          //Get CPU temp
          tempCPU = GetDouble(input[2], input[3]);
          lastCpuUpdate = millis();
          break;
        case 1:
          SendState(); 
          break;                                    //Send state

        /* code removed for brevity */

        case 7:
          minRadFan = GetDouble(input[2], input[3]);
          myPID.SetOutputLimits((int)(minRadFan * 255.0 / 100.0),(int)(maxRadFan * 255.0 / 100.0));
          break;
        case 8:
          minCaseFan = GetDouble(input[2], input[3]); 
          break;

        case 9:
          maxRadFan = GetDouble(input[2], input[3]);
          myPID.SetOutputLimits((int)(minRadFan * 255.0 / 100.0),(int)(maxRadFan * 255.0 / 100.0));
          break;
        case 10:
          maxCaseFan = GetDouble(input[2], input[3]); 
          break;

        case 100:
          SetMode(MODE_STOP); 
          break;                              //Mode Stop
        case 101:
          SetMode(MODE_RUN); 
          break;                              //Mode Run
        case 102:
          SetMode(MODE_MAX); 
          break;                              //Mode Max

        case 127:
          //Say hello
          Serial.print("ARDUINO");
          Serial.flush();
          break;
        }

        //Clear Message byte
        input[0] = 0;
      }
      else
      {
        //checksum error
        errors1++;
        lastChecksum = input[4];
        lastResult = checksum;
        
        if (checksum == input[4])
        {
          test = 0;
        }
        else
        {
          test = 255;
        }
      }
    }
    else
    {
      //first byte not recognized
      errors2++;
    }
  }
  else
  {
    //message too short
    errors3++;
  }
}

void SendDouble ( double value )
{
  Serial.write((word)(value*100) & 0xFF );
  Serial.write(((word)(value*100) & 0xFF00) >> 8);
}

double GetDouble ( byte b1, byte b2 )
{
  return (double)((int)b1 + (int)b2 * 256.0) / 100.0;
}

void SendState()
{
  SendDouble(tempCase);
  SendDouble(tempExhaust);
  SendDouble((double)fanRad / 2.55);
  SendDouble((double)fanCase / 2.55);
}

void SetMode(byte newMode)
{
  if (mode != MODE_CRITICAL)
  {
    mode = newMode;
  }

  lastMode = newMode;
}
if ((checksum && 0xFF) == (input[4] & 0xFF))

What? Bitwise AND vs. logical AND

checksum is a byte, but input[4] is a char.

Try working with all the same data type - i.e., have the checksum as a char.

Btw., a byte is an unsigned char, so your input can be -128 to 127, but the checksum can be 0 to 255.

Clearly, I was just making a dog's dinner out of it, not thinking any more (like mixing logical and bitwise or).

It was, indeed, just a case of mixing unsigned with signed types, and the ca 20% of the time when it went awry was indeed the times when the checksum exceeded the value of 127.

Sincere thanks to all who helped.

I have found that the compiler does screwy things with bytes when you put them in formulas in if statements. Sometimes it converts them to ints. However, it works OK when I compare a byte variable with a constant.

Since the checksum is a straight xor, and xoring something with itself results in a zero, you could try this:

byte checkSum = input[1] ^ input[2] ^ input[3] ^ input[4];

if (checkSum == 0)  // Does the checksum match?
{
...
}

Now do you see why we ask you to post all your code as early as possible?

If you compare a signed and an unsigned value of the same base type (char and unsigned char) the compiler will up-scale the variables to the next size up for the comparison. That is so that both values will fit into the same type. The smallest variable type that will take the numbers -128, 127, 0 and 255 is an int. So, it converts them both to int and compares the two int values.

This is called "variable promotion".

C++ Programming Wikibook:

A numeric promotion is the conversion of a value to a type with a wider range that happens whenever a value of a narrower type is used. Values of integral types narrower than int (char, signed char, unsigned char, short int and unsigned short) will be promoted to int if possible, or unsigned int if int can't represent all the values of the source type. Values of bool type will also be converted to int, and in particular true will get promoted to 1 and false to 0.