Random 1us Delay in SPI Slave Transfers

We’re working on a project that has a SAM E54 microcontroller acting as a SPI master to an Arduino Uno. The Uno controls a CleO50 touchscreen, while the E54 controls all of the math and several outputs. Our goal is to have the user enter all of their parameters into the touchscreen, before pressing a “Go” button and starting the machine process. On pressing the button, the Arduino switches from master to slave mode (this was a fun project in itself), and sends a handful of bytes to the E54. The E54 then performs the necessary tasks using this information before sending a single byte back to the Uno. The return byte is either “the task is finished, let the operator use the screen again” or “there were some errors, display these codes.”

We have 2 errors showing up: one is random, the other is consistent.

The random error occurs during the initial transfer. Roughly 53% of the time, one or more of the bytes has 1 or more bits that hang exactly 1us longer than the clock cycle, causing a misread (e.g. a 61’s second-to-last bit hangs and causes it to be read as 60). We ran the transfer a couple hundred times and found that any given byte has a 4% chance of experiencing one of these glitches.
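As a rough sanity check on those two percentages, the chance of at least one bad byte per transfer can be worked out from the per-byte rate. This is only a sketch: it assumes the glitches hit bytes independently, which has not been established, and the function name is made up for illustration.

```c
/* Probability that at least one of n_bytes is corrupted, ASSUMING each byte
 * fails independently with probability p_byte (an assumption, not a
 * measured property of this link). */
static double p_any_glitch(double p_byte, unsigned n_bytes)
{
    double p_all_ok = 1.0;
    for (unsigned i = 0; i < n_bytes; i++) {
        p_all_ok *= (1.0 - p_byte);   /* every byte must survive */
    }
    return 1.0 - p_all_ok;
}
```

For the 12-byte tx_data buffer, `p_any_glitch(0.04, 12)` comes out at roughly 0.39, somewhat below the observed 53%. That gap might mean the glitches cluster rather than occur independently, or that the two figures were taken from different runs.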

The consistent error occurs on the returned byte. No matter what value we send from the E54, the byte always arrives bit-shifted left once, plus one (e.g. a 1 shows up as a 3 on the logic analyzer). Since it’s consistent, we can mask it on the Arduino side, but it’s obviously preferable if the data is right from the beginning.
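The "shifted left once, plus one" symptom can be written down as plain bit arithmetic, which makes it easier to predict what any given return byte should look like on the analyzer. This sketch only models the symptom as described; it says nothing about the cause, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Model of the described symptom: the byte appears shifted left by one bit
 * with the least significant bit set. The original MSB falls off the top. */
static uint8_t observed_from_sent(uint8_t sent)
{
    return (uint8_t)((sent << 1) | 1u);
}

/* Inverse of the model: recovers the low 7 bits of the sent byte.
 * The sent MSB cannot be recovered because it was shifted out. */
static uint8_t sent_from_observed(uint8_t observed)
{
    return (uint8_t)(observed >> 1);
}
```

Under this model a sent 1 is observed as 3, matching the analyzer. A sent 0x80 would be observed as 0x01, since the top bit is lost, so undoing the shift on the Arduino side cannot restore the value in that case.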

As for attempted fixes: we just ran a test to see if using shielded cable gets rid of the problem; no significantly long test has shown any sort of pattern to the glitch; all of the SPI settings on both boards match; we’ve tried 2 different Unos; and since we are getting good data from most bytes, the clock doesn’t seem to be an issue (it’s also only running at 30kHz).

The thing that’s throwing us off is the delayed bit is always off by exactly the same amount every time. A clean 1us. It’s making me think the issue is program based, but given how short the code is and how random the occurrence is, it’s hard to pin down.
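To put the 1us in proportion, it helps to compare it with the bit period and with the CPU clock. The sketch below assumes the 30kHz SCK quoted above and the standard 16MHz Uno clock; the function names are made up for illustration.

```c
/* Length of one SPI bit in microseconds for a given SCK frequency in Hz. */
static double bit_period_us(double sck_hz)
{
    return 1e6 / sck_hz;
}

/* Number of CPU cycles that fit into a time span given in microseconds. */
static double cpu_cycles_in(double span_us, double f_cpu_hz)
{
    return span_us * f_cpu_hz / 1e6;
}
```

At 30kHz a bit lasts about 33.3us, so a 1us delay is only about 3% of a bit, and on a 16MHz Uno 1us corresponds to 16 CPU cycles. It may also be worth checking whether 1us is simply the sample resolution of the logic analyzer, since a capture at 1MS/s cannot show anything finer.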

I’ll provide the relevant blocks of code from both controllers below, as well as some screenshots from the logic analyzer showing the glitch in question.

Arduino Code:

#include <SPI.h>

#define SS_HMI 7

// SPI.CPP uses SS, MOSI, and SCK to define pins. However the declarations don't seem to have pin values
// defined so we define them to make sure everything works.
#define SS 10
#define MOSI 11
#define MISO 12
#define SCK 13

uint8_t tx_data[12], rx_data;
volatile uint8_t pos, done = 0, error = 0;

bool mode;
  
void setup()
{
  noInterrupts();
  Serial.begin(9600);
  
/* Initialize NerO as master for SPI. Note the following about the SPCR
 * SPIE - Enables the SPI interrupt when 1
 * SPE - Enables the SPI when 1
 * DORD - Sends data least Significant Bit First when 1, most Significant Bit first when 0
 * MSTR - Sets the Arduino in master mode when 1, slave mode when 0
 * CPOL - Sets the data clock to be idle when high if set to 1, idle when low if set to 0
 * CPHA - Samples data on the trailing edge of the data clock when 1, leading edge when 0
 * SPR1 and SPR0 - Sets the SPI speed, 00 is fastest (4MHz), 11 is slowest (125kHz) on a 16MHz Uno
 * 
 * To enable part of the register use the following syntax SPCR |= _BV(MSTR);
 * To disable part of the register use the following syntax SPCR &= ~_BV(MSTR);
 * Replace MSTR in above examples with appropriate bit.
 * 
 * SPI.begin() uses SPCR |= _BV(SPE) and SPCR |= _BV(MSTR) to enable SPI and enable Master mode respectively.
 * it also defines the MOSI, SS, and SCK pins as outputs. See SPI.CPP for more details.
 */
  SPI.begin();
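  // _BV(bit) is already the mask (1 << bit), so it is OR-ed in directly instead of being used as a shift count.
  // Clear DORD, CPHA and SPR1; set CPOL and SPR0.
  SPCR = (SPCR & ~(_BV(DORD) | _BV(CPHA) | _BV(SPR1))) | _BV(CPOL) | _BV(SPR0);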
  pos = 0;
  mode = true;
  pinMode(4, OUTPUT);
  digitalWrite(4, LOW);

  // test button to simulate HMI. remove in final version.
  pinMode(6,INPUT_PULLUP);

  interrupts();
}

// Interrupt subroutine for SPI transmission. This will fill in the rx buffer and send the tx buffer.
// We will never receive more than one byte from the 54 so rx_data is not an array.
ISR(SPI_STC_vect){
  
  rx_data = SPDR;
  if(rx_data & 0x11){
    done = 1;
  }


    if (!(rx_data & 0x0)){
      SPDR = tx_data[pos++];
        if(pos>11){
          pos = 0;
        }
    }
}

void activate_slave(){
  //disable SPI while changing everything
  SPCR &= ~_BV(SPE);
  
  SPCR &= ~_BV(MSTR); //remove master mode flag
  SPCR |= _BV(SPIE);  //enable interrupts for receiving buffer
  pinMode(SS,INPUT);
  pinMode(MOSI, INPUT);
  pinMode(SCK, INPUT);
  pinMode(MISO, OUTPUT);

  SPCR |= _BV(SPE);

  mode = false;
  Serial.println("slave mode active");
}

void activate_master(){
  //disable SPI while changing everything
  SPCR &= ~_BV(SPE);
  
  SPCR |= _BV(MSTR); //enable master mode flag
  SPCR &= ~_BV(SPIE); //disable interrupts
  pinMode(SS,OUTPUT);
  pinMode(MOSI, OUTPUT);
  pinMode(SCK, OUTPUT);
  pinMode(MISO, INPUT); 

  SPCR |= _BV(SPE);
  mode = true;

  Serial.println("master mode active");
}

void machine_running(){
/* Insert code here to display "machine running" screen */


/* Grab numbers from file and put into tx_buffer */
  tx_data[0] = 1; // value for testing only
  tx_data[1] = 11; // value for testing only
  tx_data[2] = 21; // value for testing only
  tx_data[3] = 31; // value for testing only
  tx_data[4] = 41; // value for testing only
  tx_data[5] = 51; // value for testing only
  tx_data[6] = 61; // value for testing only
  tx_data[7] = 71; // value for testing only
  tx_data[8] = 81; // value for testing only
  tx_data[9] = 91; // value for testing only
  tx_data[10] = 101; // value for testing only
  tx_data[11] = 111; // value for testing only

// Activate SPI slave mode because we will need to send the punch data to the 54
  if(mode){
    activate_slave();
    // preload buffer
//    SPDR = tx_data[0];
  }

// tell 54 we are ready for transfer by sending the "ready signal"
  digitalWrite(4,HIGH);

  while(!error && !done){}

  if (done){
    Serial.println("machine done");
    done = 0;
  }

  if(error){
    Serial.println("error happened");
    error = 0;
  }

  digitalWrite(4,LOW);
}


void loop()
{
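  // wait until pin 6 reads LOW (test button pressed, given INPUT_PULLUP)
  do{
  }while(digitalRead(6));
  machine_running();
  // wait until pin 6 reads HIGH again (button released) before the next run
  do{
  }while(!digitalRead(6));
}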
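
A note on the SPCR setup in setup(): `_BV(bit)` expands to `(1 << bit)`, so it is already a bit mask, and using it as a shift count moves the value by the mask rather than by the bit number. The host-side sketch below (not AVR code) shows the difference for CPOL and SPR0. The bit positions are those of the ATmega328P SPCR register; `BV` stands in for avr-libc's `_BV`, and the function names are made up for illustration.

```c
#include <stdint.h>

/* SPCR bit positions on the ATmega328P. */
enum { SPR0 = 0, SPR1 = 1, CPHA = 2, CPOL = 3, MSTR = 4, DORD = 5, SPE = 6, SPIE = 7 };

/* Stand-in for avr-libc's _BV(): a mask with only `bit` set. */
#define BV(bit) (1 << (bit))

/* Bits OR-ed into the 8-bit register when the masks are combined directly. */
static uint8_t spcr_bits_mask_form(void)
{
    return (uint8_t)(BV(CPOL) | BV(SPR0));
}

/* Bits OR-ed into the 8-bit register when 1 is shifted BY the masks:
 * 1 << BV(CPOL) is 1 << 8, which does not fit in 8 bits, and
 * 1 << BV(SPR0) is 1 << 1, which is the SPR1 bit. */
static uint8_t spcr_bits_shift_form(void)
{
    return (uint8_t)((1 << BV(CPOL)) | (1 << BV(SPR0)));
}

/* SPI mode number (0..3) encoded by the CPOL and CPHA bits of an SPCR value. */
static unsigned spi_mode(uint8_t spcr)
{
    return (((spcr >> CPOL) & 1u) << 1) | ((spcr >> CPHA) & 1u);
}
```

Starting from the 0x50 that SPI.begin() leaves in SPCR (SPE and MSTR set), the mask form gives 0x59, which is SPI mode 2, while the shift form gives 0x52, which is SPI mode 0 with a different clock divider. Whichever mode ends up on the Uno has to match the mode configured on the E54.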

E54 Code:

while (true)
 {
 wdt_feed(&WDT_0);

 if(!machine_status)
 {
 //Wait for HMI "Go"
 if(gpio_get_pin_level(TESTIN) == true)//Run) // for testing purposes. Use "Run" in final version.
 {
 Run = 0;
 delay_ms(1000);
 //Get new punch values from HMI
 ReceiveHMI(&Profile);
 
 
 //Initiate a hatch cycle
// Cycle(&Profile);
// FeedCyc++;
// do
// {
// wdt_feed(&WDT_0);
// } while (gpio_get_pin_level(MotorDone) == 0);

 machine_status |= 0x80;
// if(Safety == 0 && FeedCyc >= Profile.fcyc){
 //Index the media
// Feed(&Profile);
// FeedCyc = 0;
// }
 delay_ms(250);
 }
 }
 
 // if done bit is set, send information to HMI and clear the bit to wait for next "go" press.
 if (machine_status & 0x80)
 {
 uint8_t buf[1] = {0x80};
 struct io_descriptor *io;
 spi_m_sync_get_io_descriptor(&SPI_1, &io);
 //
 spi_m_sync_enable(&SPI_1);

 //
 gpio_set_pin_level(SPARE16,false);
 ////
 io_write(io,buf,1);
 //
 delay_us(10);
 gpio_set_pin_level(SPARE16,true);
 //
 spi_m_sync_disable(&SPI_1);
 
 machine_status &= ~0x80;
 }
 
// if(machine_status & 0x7F){ //machine status & 0111 1111. machine will send stop command if any errors are present.
 //Clear all outputs and stop machine
// Stop();
// }



 }

Receive HMI Function on E54:

int ReceiveHMI(struct Punch *prof)
{
 // Buffer to receive
 uint8_t read_buf[13]={0}, tx_buf[1]={'a'};
 uint32_t length = 13;
 
 struct io_descriptor *io;
 spi_m_sync_get_io_descriptor(&SPI_1, &io);
 
 spi_m_sync_enable(&SPI_1);
// spi_m_sync_set_mode(&SPI_1,3);
// spi_m_sync_set_data_order(&SPI_1,SPI_DATA_ORDER_MSB_1ST);
 
 //Transfer punch parameters to the E54
 gpio_set_pin_level(SPARE16,false);

 io_read(io, read_buf, length);
 
 // delay so SPI has time to recognize last byte sent.
 delay_us(10);
 gpio_set_pin_level(SPARE16,true);
 
 // this is a band-aid to get things working. The slave is adding 0b1000 0000 to each of the transmitted numbers. I don't know why.
 // this for loop effectively masks the erroneous bit and gives us the values we want.
 for (uint8_t i=0;i<length;i++)
 {
 read_buf[i] &= 0x7F; // clear bit 7 (0b1000 0000); 0xEF would clear bit 4 instead
 }
 
 prof->OD = read_buf[1];
 prof->theta = read_buf[2];
 prof->X = read_buf[3]; 
 prof->BTI = read_buf[4];
 prof->BTH = read_buf[5];
 prof->NTI = read_buf[6];
 prof->NTH = read_buf[7];
 prof->MF = read_buf[8];
 prof->fcyc = read_buf[9];
 
 spi_m_sync_disable(&SPI_1);
 return 0;
}
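
The mask arithmetic behind the band-aid loop in ReceiveHMI() can be checked on a host machine. To clear only bit 7 (0b1000 0000) the AND mask is 0x7F, whereas 0xEF clears bit 4. A minimal sketch, with a helper name made up for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: returns `value` with the given bit (0..7) cleared. */
static uint8_t clear_bit(uint8_t value, unsigned bit)
{
    return (uint8_t)(value & (uint8_t)~(1u << bit));
}
```

Here `clear_bit(0xFF, 7)` is 0x7F and `clear_bit(0xFF, 4)` is 0xEF. With the test values from the Arduino side, clearing bit 4 would turn 21 into 5, while clearing bit 7 leaves every test value from 1 to 111 unchanged.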

We’re working on a project that has a SAM E54 microcontroller acting as a SPI master to an Arduino Uno. The Uno controls a CleO50 touchscreen, while the E54 controls all of the math and several outputs. Our goal is to have the user enter all of their parameters into the touchscreen, before pressing a “Go” button and starting the machine process. On pressing the button, the Arduino switches from master to slave mode (this was a fun project in itself), and sends a handful of bytes to the E54. The E54 then performs the necessary tasks using this information before sending a single byte back to the Uno. The return byte is either “the task is finished, let the operator use the screen again” or “there were some errors, display these codes.”

Horrible setup. I hope you were forced to build such a “machine”, as the architecture is a good example of how not to do it.

The consistent error occurs on the returned byte. No matter what value we send from the E54, the byte always arrives bit-shifted left once, plus one (e.g. a 1 shows up as a 3 on the logic analyzer). Since it’s consistent, we can mask it on the Arduino side, but it’s obviously preferable if the data is right from the beginning.

In this case the naming of the signals in the logic analyzer is wrong. MOSI is SS, MISO is MOSI and SCKL is SCKL. Is that correct? If I am wrong, post a wiring diagram of your setup where the wires are labelled correctly (according to the names in the logic analyzer).

The thing that’s throwing us off is the delayed bit is always off by exactly the same amount every time. A clean 1us. It’s making me think the issue is program based, but given how short the code is and how random the occurrence is, it’s hard to pin down.

My guess at the moment is a hardware problem, but I’m not sure I have a correct picture of your setup.

And post more code from the E54 side. ReceiveHMI() is missing, for example.