Master - Slave SPI stops running after a while

I found some great code on a very useful and informative website, and set about trialling it for my project. I’ve only been programming arduino for a few weeks, and so far I’ve found code either works 100% or requires correction.

However, this is different. The master-slave code runs on two arduino nanos for anything from less than 100 iterations to several 1000 iterations. Then it mysteriously stops, and the only message in serial monitor is ovf, which I assume means overflow. If I hit the slave reset button it runs again but then stops.

How do I begin to work out which bit is overflowing, why, and what needs changing to make it run forever?

Master:

// http://www.gammon.com.au/forum/?id=10892
// master

#include <SPI.h>
#include "SPI_anything.h"

// create a structure to store the different data values:
struct myStruct
{
  float a;
  float b;
  unsigned long c;
  float d;
  float e;
  float f;
  float g;
  float h;
};

myStruct foo;

void setup ()
  {
  SPI.begin ();
  // Slow down the master a bit
  SPI.setClockDivider(SPI_CLOCK_DIV8);

  foo.a = -4.27;
  foo.b = 150.89;
  foo.c = 0;
  foo.d = -12.84;
  foo.e = -500.32;
  foo.f = 1000.23;
  foo.g = -333.33;
  foo.h = -0.0001;
  }  // end of setup

void loop () 
  { 
  digitalWrite(SS, LOW);    // SS is pin 10
  SPI_writeAnything (foo);
  digitalWrite(SS, HIGH);
  delay (100);  // for testing  
  
  foo.c++;
  }  // end of loop

Slave:

// http://www.gammon.com.au/forum/?id=10892
// slave

#include <SPI.h>
#include "SPI_anything.h"

// create a structure to store the different data values:
struct myStruct
{
  float a;
  float b;
  unsigned long c;
  float d;
  float e;
  float f;
  float g;
  float h;
};

volatile myStruct foo;
volatile bool haveData = false;

void setup ()
  {
  Serial.begin (115200);   // debugging

  // have to send on master in, *slave out*
  pinMode(MISO, OUTPUT);
  
  // turn on SPI in slave mode
  SPCR |= _BV(SPE);

  // now turn on interrupts
  SPI.attachInterrupt();
  
  }  // end of setup

void loop () 
  { 
  if (haveData)
     {
     Serial.println (foo.a);
     Serial.println (foo.b);
     Serial.println (foo.c);
     Serial.println (foo.d);
     Serial.println (foo.e);
     Serial.println (foo.f);
     Serial.println (foo.g);
     Serial.println (foo.h, 4);
     Serial.println ();
     haveData = false;
     }
  }  // end of loop

// SPI interrupt routine
ISR (SPI_STC_vect)
  {
  SPI_readAnything_ISR (foo);
  haveData = true;
  }  // end of interrupt routine SPI_STC_vect

SPI_anything.h Library:

#include <Arduino.h>

template <typename T> unsigned int SPI_writeAnything (const T& value)
  {
    const byte * p = (const byte*) &value;
    unsigned int i;
    for (i = 0; i < sizeof value; i++)
          SPI.transfer(*p++);
    return i;
  }  // end of SPI_writeAnything

template <typename T> unsigned int SPI_readAnything(T& value)
  {
    byte * p = (byte*) &value;
    unsigned int i;
    for (i = 0; i < sizeof value; i++)
          *p++ = SPI.transfer (0);
    return i;
  }  // end of SPI_readAnything
  
  
template <typename T> unsigned int SPI_readAnything_ISR(T& value)
  {
    byte * p = (byte*) &value;
    unsigned int i;
    *p++ = SPDR;  // get first byte
    for (i = 1; i < sizeof value; i++)
          *p++ = SPI.transfer (0);
    return i;
  }  // end of SPI_readAnything_ISR

Then it mysteriously stops, and the only message in serial monitor is ovf,

If you stopped with the anonymous printing, then "the only message in serial monitor is ovf" would NOT be the case. Perhaps a clue just might present itself.

Thanks for your reply but unfortunately I don't understand your suggestion.

Is there a way to enable a more verbose version of error messages in the serial monitor that will give me more than 'ovf'?

Or do I have to write some code to try and output something else that will help me see what is going wrong?

I'm afraid I don't understand what 'anonymous printing' means in this context, so I don't know what I have to do to remove the anonymity and get some insight.

Or do I have to write some code to try and output something else that will help me see what is going wrong?

Can you see the difference between

     Serial.println (foo.a);

and

     Serial.print("foo.a = ");
     Serial.println (foo.a);

?

The first is anonymous printing. Some value appears in the serial monitor, but you have not a clue what the value corresponds to.

The second is not anonymous printing. Some value appears in the serial monitor, preceded by enough information to know what the value corresponds to.

If you change all of your printing to not be anonymous, but still only ovf appears in the serial monitor, then you will know that the data appear because some other print() call made it happen.

Why/when/where the print() happens provides valuable information.

Thanks for explaining that. I've modified my slave code accordingly.

  if (haveData)
     {
     Serial.print("foo.a = ");
     Serial.println (foo.a);
     Serial.print("foo.b = ");
     Serial.println (foo.b);
     Serial.print("foo.c = ");
     Serial.println (foo.c);
     Serial.print("foo.d = ");
     Serial.println (foo.d);
     Serial.print("foo.e = ");
     Serial.println (foo.e);
     Serial.print("foo.f = ");
     Serial.println (foo.f);
     Serial.print("foo.g = ");
     Serial.println (foo.g);
     Serial.print("foo.h = ");
     Serial.println (foo.h, 4);
     Serial.println ();
     haveData = false;
     }

It is all running as expected, until here (I've included one iteration of good output):

foo.a = -4.27 foo.b = 150.89 foo.c = 552 foo.d = -12.84 foo.e = -500.32 foo.f = 1000.23 foo.g = -333.33 foo.h = -0.0001

foo.a = ovf foo.b = ovf foo.c = 398698154 foo.d =

Those last four lines of serial print took several minutes to appear, initially with delays of 5 seconds between characters, and then getting slower and slower until no more was printed.

So what to do? Is there some more diagnostic code I can run to try and find out where the problem is?

Those last four lines of serial print took several minutes to appear, initially with delays of 5 seconds between characters

Between characters? As in, the f printed, and the 5 seconds later, the o printed, and then another 5 seconds before the second 0, so that foo.a took 25 seconds to print?

I found some great code on a very useful and informative website

I see this is from Nick’s page here. I’m a little confused, because one shouldn’t loop inside the ISR, transferring bytes. All the other code is careful to return after one byte, except this template routine:

template <typename T> unsigned int SPI_readAnything_ISR(T& value)
  {
         ...
    for (i = 1; i < sizeof value; i++)
          *p++ = SPI.transfer (0);

These transfers can take a long time, to a CPU. Instead, you need to transfer just one byte and return. You can use a byte counter (like i) that persists across ISRs. Use it and increment it each time one byte is transferred:

template <typename T> unsigned int SPI_readAnything_ISR(T& value,unsigned int &i)
 {
    byte b = SPI.transfer (0);

    if (i < sizeof(value)) { // Is there room?
      byte * p = (byte*) &value;
      p[i] = b;
    }
    i++;
 }

And in use:

unsigned int fooIndex = 0;

void loop ()
{
   if (haveData) {
     Serial.println (foo.a);
        ...
}

// SPI interrupt routine
ISR (SPI_STC_vect)
 {
   SPI_readAnything_ISR (foo,fooIndex); // index passed in...

   if (fooIndex == sizeof(foo))
     haveData = true;
 }  // end of interrupt routine SPI_STC_vect

But there is also a problem with atomicity. Because these interrupts can happen at any time, you must disable them when you operate on data that is used by the ISR: haveData, fooIndex and foo must be guarded:

void loop()
{
   myStruct safeFoo;

   cli(); // disable interrupts for a *short* time
   bool gotData = haveData; // copy the flag
   if (gotData) {
     haveData = false; // clear the flag
     safeFoo  = foo;  // copy the struct
     fooIndex = 0; // reset the index
   }
   sei(); // enable interrupts!

   if (gotData) { // only use the safe flag!
     Serial.println (safeFoo.a); // only use the safe copy!
        ...
     Serial.println ();
   }
}

This will also make loop run as fast as the Serial printing will allow. You can even test if fooIndex overflowed while you were printing the previous foo:

void loop()
{
   myStruct safeFoo;
   unsigned int safeFooIndex;

   cli(); // disable interrupts for a *short* time
   bool gotData = haveData; // copy the flag
   if (gotData) {
     haveData     = false; // clear the flag
     safeFoo      = foo;  // copy the struct
     safeFooIndex = fooIndex;
     fooIndex     = 0; // reset the index
   }
   sei(); // enable interrupts!

   if (gotData) { // only use the safe flag!

     if (safeFooIndex > sizeof(foo)) {
       Serial.print( F("Overrun: dropped ") );
       Serial.print( safeFooIndex - sizeof(foo) );
       Serial.println( F(" bytes!") );
     }

Aaaaaand you have a message framing problem, too. :slight_smile:

Perhaps Nick can chime in here and clarify…

Cheers,
/dev

PaulS: Between characters? As in, the f printed, and the 5 seconds later, the o printed, and then another 5 seconds before the second 0, so that foo.a took 25 seconds to print?

Yes, it all went super slow after the last good output. Initially around five seconds between characters appearing, and then getting even slower for the values.

I pressed the reset button on the slave and it ran for a while, then the same happened again, super slow serial printing and ovf or incorrect values. This time I left it for longer, and it kept printing incorrect values and ovf outputs very slowly. It has been going so slowly that at the time of my last post I thought it had stopped, when in reality it was taking ages to print a value.

The lines below have taken over one hour to print (one good, normal speed iteration is shown, followed by three very slow, bad iterations).

foo.a = -4.27 foo.b = 150.89 foo.c = 2809 foo.d = -12.84 foo.e = -500.32 foo.f = 1000.23 foo.g = -333.33 foo.h = -0.0001

foo.a = 0.00 foo.b = 0.00 foo.c = 2001731 foo.d = 252329984.00 foo.e = ovf foo.f = ovf foo.g = -0.00 foo.h = ovf

foo.a = -0.00 foo.b = 0.00 foo.c = 6180419 foo.d = 252329984.00 foo.e = ovf foo.f = ovf foo.g = -0.00 foo.h = ovf

foo.a = -0.00 foo.b = 0.00 foo.c = 10359619 foo.d = 252329984.00 foo.e = ovf foo.f = ovf foo.g = -0.00 foo.h = ovf

/dev: Cheers, /dev

Thank you very much for your reply. I will have a look through what you've suggested, and see if I can get it to work.

/dev:
Cheers,
/dev

I’ve attempted to implement the changes you suggested, and it compiles, although I had to remove ‘fooIndex’ from the call to ‘SPI_readAnything_ISR’ because it was only expecting ‘foo’.

My problem now is that the slave serial output is completely blank. It will probably take me days to trace through and try and find the problem, if I ever manage to find it.

The updated code is below:

Master

// master

#include <SPI.h>
#include "SPI_anything_02.h"

// create a structure to store the different data values:
struct myStruct
{
  float a;
  float b;
  unsigned long c;
  float d;
  float e;
  float f;
  float g;
  float h;
};

myStruct foo;

void setup ()
  {
  SPI.begin ();
  // Slow down the master a bit
  SPI.setClockDivider(SPI_CLOCK_DIV8);

  foo.a = -4.27;
  foo.b = 150.89;
  foo.c = 0;
  foo.d = -12.84;
  foo.e = -500.32;
  foo.f = 1000.23;
  foo.g = -333.33;
  foo.h = -0.0001;
  }  // end of setup

void loop () 
  { 
  digitalWrite(SS, LOW);    // SS is pin 10
  SPI_writeAnything (foo);
  digitalWrite(SS, HIGH);
  delay (100);  // for testing  
  
  foo.c++;
  }  // end of loop

Slave

// slave

#include <SPI.h>
#include "SPI_anything_02.h"

// create a structure to store the different data values:
struct myStruct
{
  float a;
  float b;
  unsigned long c;
  float d;
  float e;
  float f;
  float g;
  float h;
};

volatile byte fooIndex;

volatile myStruct foo;
volatile bool haveData = false;


void setup ()
  {
  Serial.begin (115200);   // debugging

  // have to send on master in, *slave out*
  pinMode(MISO, OUTPUT);
  
  // turn on SPI in slave mode
  SPCR |= _BV(SPE);

  // now turn on interrupts
  SPI.attachInterrupt();
  
  }  // end of setup

void loop () 
  { 
    
volatile myStruct safefoo;

   cli(); // disable interrupts for a *short* time
   bool gotData = haveData; // copy the flag
   if (gotData) {
     haveData = false; // clear the flag
     safefoo.a  = foo.a;  // copy the struct
     safefoo.b  = foo.b;  // copy the struct
     safefoo.c  = foo.c;  // copy the struct
     safefoo.d  = foo.d;  // copy the struct
     safefoo.e  = foo.e;  // copy the struct
     safefoo.f  = foo.f;  // copy the struct
     safefoo.g  = foo.g;  // copy the struct
     safefoo.h  = foo.h;  // copy the struct
     
     fooIndex = 0; // reset the index
   }
   sei(); // enable interrupts!

   if (gotData) { // only use the safe flag!
//     Serial.println (safeFoo.a); // only use the safe copy!     {
     Serial.print("foo.a = ");
     Serial.println (safefoo.a);
     Serial.print("foo.b = ");
     Serial.println (safefoo.b);
     Serial.print("foo.c = ");
     Serial.println (safefoo.c);
     Serial.print("foo.d = ");
     Serial.println (safefoo.d);
     Serial.print("foo.e = ");
     Serial.println (safefoo.e);
     Serial.print("foo.f = ");
     Serial.println (safefoo.f);
     Serial.print("foo.g = ");
     Serial.println (safefoo.g);
     Serial.print("foo.h = ");
     Serial.println (safefoo.h, 4);
     Serial.println ();
     haveData = false;
     }
  }  // end of loop

// SPI interrupt routine
ISR (SPI_STC_vect)

 {
   SPI_readAnything_ISR (foo); // index passed in...

   if (fooIndex == sizeof(foo))
     haveData = true;
 }  // end of interrupt routine SPI_STC_vect

Library SPI_anything_02

#include <Arduino.h>

template <typename T> unsigned int SPI_writeAnything (const T& value)
  {
    const byte * p = (const byte*) &value;
    unsigned int i;
    for (i = 0; i < sizeof value; i++)
          SPI.transfer(*p++);
    return i;
  }  // end of SPI_writeAnything

template <typename T> unsigned int SPI_readAnything(T& value)
  {
    byte * p = (byte*) &value;
    unsigned int i;
    for (i = 0; i < sizeof value; i++)
          *p++ = SPI.transfer (0);
    return i;
  }  // end of SPI_readAnything
  
  
template <typename T> unsigned int SPI_readAnything_ISR(T& value)
  {
    byte * p = (byte*) &value;
    unsigned int i;
    *p++ = SPDR;  // get first byte
    byte b = SPI.transfer (0);

    if (i < sizeof(value)) { // Is there room?
      byte * p = (byte*) &value;
      p[i] = b;
    }
    i++;
    return i;
  }  // end of SPI_readAnything_ISR

For the slave code, you shouldn’t clear the haveData flag the second time, and safeFoo should not be volatile:

// slave

#include <SPI.h>
#include "SPI_anything_02.h"

// create a structure to store the different data values:
struct myStruct
{
  float a;
  float b;
  unsigned long c;
  float d;
  float e;
  float f;
  float g;
  float h;
};

volatile byte fooIndex;

volatile myStruct foo;
volatile bool haveData = false;


void setup ()
  {
  Serial.begin (115200);   // debugging

  // have to send on master in, *slave out*
  pinMode(MISO, OUTPUT);
  
  // turn on SPI in slave mode
  SPCR |= _BV(SPE);

  // now turn on interrupts
  SPI.attachInterrupt();
  
  }  // end of setup

void loop () 
  { 

   myStruct safefoo;

   cli(); // disable interrupts for a *short* time
   bool gotData = haveData; // copy the flag
   if (gotData) {
     haveData = false; // clear the flag
     safefoo.a  = foo.a;  // copy the struct
     safefoo.b  = foo.b;  // copy the struct
     safefoo.c  = foo.c;  // copy the struct
     safefoo.d  = foo.d;  // copy the struct
     safefoo.e  = foo.e;  // copy the struct
     safefoo.f  = foo.f;  // copy the struct
     safefoo.g  = foo.g;  // copy the struct
     safefoo.h  = foo.h;  // copy the struct
     
     fooIndex = 0; // reset the index
   }
   sei(); // enable interrupts!

   if (gotData) { // only use the safe flag!
     Serial.print("foo.a = ");
     Serial.println (safefoo.a); // only use the safe copy!
     Serial.print("foo.b = ");
     Serial.println (safefoo.b);
     Serial.print("foo.c = ");
     Serial.println (safefoo.c);
     Serial.print("foo.d = ");
     Serial.println (safefoo.d);
     Serial.print("foo.e = ");
     Serial.println (safefoo.e);
     Serial.print("foo.f = ");
     Serial.println (safefoo.f);
     Serial.print("foo.g = ");
     Serial.println (safefoo.g);
     Serial.print("foo.h = ");
     Serial.println (safefoo.h, 4);
     Serial.println ();
     }
  }  // end of loop

// SPI interrupt routine
ISR (SPI_STC_vect)

 {
   SPI_readAnything_ISR (foo); // index passed in...

   if (fooIndex == sizeof(foo))
     haveData = true;
 }  // end of interrupt routine SPI_STC_vect

But regarding the ISR, I think someone else will have to suggest what really happens here:

template <typename T> unsigned int SPI_readAnything_ISR(T& value)
  {
    byte b = SPDR;  // get one byte from SPI device
    // byte b = SPI.transfer (0);    This doesn't seem right...

    // Save the byte if there's room
    if (fooIndex < sizeof(value)) {
      byte *p = (byte*) &value;
      p[fooIndex] = b; // save one byte
    }
    fooIndex++; // get ready for next byte

    return fooIndex; // why?  Not used by caller.  This could be a void function?

  }  // end of SPI_readAnything_ISR

I think SPI.transfer(0) would block, so that can’t be done in an ISR… Any SPI folks reading this?

Cheers,
/dev