UNO-to-UNO SPI Performance Question

I have two UNOs connected via SPI. The clock is configured at 4MHz (fosc/4). I have to insert a delayMicroseconds(20) between bytes in order for this code to work. The ATmega328P datasheet states the max slave speed is 4MHz, but that is only SCK during bit transfer. A byte is transmitted in 2us (8*0.25us per bit), so having to pause 20us between bytes is absurd; that's a 10x performance reduction. Clearly I'm missing something obvious. How can I increase my SPI transfer rate to a consistent 500kB/s?

Thanks in advance. Here is my very simple code.

Master code:

#include "SPI.h"
void setup() {
  Serial.begin(115200);
  SPI.begin();
  // MISO as input with the internal pull-up enabled
  pinMode(MISO, INPUT);
  digitalWrite(MISO, HIGH);
}
byte buf[16];
void loop() {
  SPI.beginTransaction(SPISettings(4000000, MSBFIRST, SPI_MODE0));
  digitalWrite(SS, LOW);
  // queue the first byte in the slave register
  SPI.transfer(0);
  delayMicroseconds(20);
  for (int i = 0; i < 16; ++i) { // read 16 bytes back (incl. queued byte)
    buf[i] = SPI.transfer(1);
    delayMicroseconds(20);
  }
  digitalWrite(SS, HIGH);
  SPI.endTransaction();
  for (int i = 0; i < 16; ++i) { // dump to screen
    Serial.print(buf[i], HEX);
    Serial.print('-');
  }
  Serial.println();
  delay(500);
}

Slave code (note the cli()/sei() are vital otherwise I see massive corruption):

#include "SPI.h"
byte buf[16];
int i;
void setup() {
  // not sharing the bus
  pinMode(MISO, OUTPUT);
  SPCR |= _BV(SPE);
  SPI.attachInterrupt();
  for (i = 0; i < 16; ++i) { // create an xmit buffer
    buf[i] = 0xa0 + i;
  }
  i = 0;
}
ISR(SPI_STC_vect) {
  cli();
  byte c = SPDR;
  if (c == 0) {
    i = 0;
  }
  SPDR = buf[i];
  ++i;
  if (i >= 16) {
    i = 0;
  }
  sei();
}
void loop() {
}

It may be worth trying SPI without using the library. It is actually very simple.
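
As a rough, untested sketch of what I mean, assuming the ATmega328P register names from the datasheet: the master side comes down to one write to SPCR and a three-line transfer function. The names spiMasterInit() and spiTransferByte() are my own and not part of any library, and the #else branch is only a host stand-in so the snippet also compiles on a PC.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>   // real SPCR, SPSR, SPDR on the ATmega328P
#else
// Host stand-ins (illustration only) so the snippet also compiles on a PC.
// Bit positions follow the ATmega328P datasheet.
static volatile uint8_t SPCR = 0, SPSR = 0x80, SPDR = 0;
#define _BV(bit) (1 << (bit))
#define SPE  6   // SPI enable (SPCR)
#define MSTR 4   // master select (SPCR)
#define SPIF 7   // transfer-complete flag (SPSR)
#endif

// Hypothetical helper: enable SPI as master, mode 0, MSB first.
// SCK, MOSI and SS must already be configured as outputs.
static inline void spiMasterInit() {
  // SPR1:0 = 00, with SPI2X left at its reset value of 0, selects fosc/4.
  SPCR = _BV(SPE) | _BV(MSTR);
}

// Hypothetical helper: shift one byte out and return the byte shifted in.
static inline uint8_t spiTransferByte(uint8_t out) {
  SPDR = out;                       // writing SPDR starts the transfer
  while (!(SPSR & _BV(SPIF))) { }   // wait until the hardware sets SPIF
  return SPDR;                      // reading SPSR, then SPDR, clears SPIF
}
```

This only shows how little has to happen per byte on the master. It does not by itself say anything about whether the slave can keep up.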

...R

"It may be worth trying SPI without using the library. It is actually very simple."

Super helpful. Thanks. /s

You don't have to (and shouldn't?) put cli() and sei() in your ISR since that is done automatically by the interrupt hardware.
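
On AVR the global interrupt flag is cleared when the ISR is entered and set again by the return-from-interrupt instruction, so calling sei() inside the handler re-enables interrupts before it has returned. A minimal sketch of the same buffer logic without either call, pulled into a plain function so it can be checked off-target (nextReplyByte(), txBuf and TX_LEN are names I made up for illustration):

```cpp
#include <stdint.h>

static const uint8_t TX_LEN = 16;
static uint8_t txBuf[TX_LEN];   // transmit buffer, filled once in setup()
static uint8_t txIndex = 0;     // only touched from the ISR

// Given the byte just received, return the byte to preload for the
// next transfer. A received 0 restarts the sequence at txBuf[0].
static inline uint8_t nextReplyByte(uint8_t received) {
  if (received == 0) {
    txIndex = 0;
  }
  uint8_t out = txBuf[txIndex];
  txIndex = (txIndex + 1 < TX_LEN) ? txIndex + 1 : 0;   // wrap around
  return out;
}

// On the ATmega328P the handler would then be just:
//   ISR(SPI_STC_vect) { SPDR = nextReplyByte(SPDR); }
// with no cli()/sei() inside it.
```

Whether this cures the corruption you saw I can't say. It only removes the nested-interrupt risk.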

pjt:
"It may be worth trying SPI without using the library. It is actually very simple."

Super helpful. Thanks. /s

You are welcome.

On the other hand if you did not understand what I was suggesting, or how to do it, all you have to do is ask.

...R

Saleae time.

http://jeelabs.org/2011/12/24/trying-to-improve-on-the-spi-bus/

mrburnette:
http://jeelabs.org/2016/09/diving-deep-into-spi/

Mind your head. It is not very deep :)

...R

Why is the SPI library bad? What did YOU learn while you were exploring this issue? Have you even explored what I'm trying on your own? Share your knowledge, and if you have none, why say anything?

There's very, very little code in the SPI.cpp/.h files (~400 lines), and most of it is good comments. I've read through it and would do exactly the same thing if I were starting from scratch. It's basically a little bit of control register setup, and then a bunch of overloads for sequencing bytes out to the data register. The SPI sections in the ATmega328P datasheet outline SPI and the timing characteristics on the IO side, but don't say whether there are any issues with accessing the control register too quickly from the MCU side.

Maybe there's a limit on how fast the MCU can write to MMIO, or maybe there's a bit I need to check on the host-side SPI controller that indicates when it is ready to queue the next byte, and I am overflowing because I'm not waiting for it to drain... So the question remains: why is a ~20us delay needed when writing bytes to the SPI data register?
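
To put numbers on it, here is a small calculation sketch. It assumes the stock UNO's 16MHz clock (which is what fosc/4 = 4MHz implies) and ignores loop and function-call overhead; all names in it are mine.

```cpp
#include <stdint.h>

// Assumptions: 16 MHz CPU clock (stock UNO), SCK = fosc/4, 8 bits per byte.
constexpr uint32_t kCpuHz = 16000000UL;
constexpr uint32_t kSckHz = kCpuHz / 4;   // 4 MHz
constexpr uint32_t kBitsPerByte = 8;

// CPU clock cycles that elapse while one byte is on the wire.
constexpr uint32_t cyclesPerByte() {
  return kBitsPerByte * (kCpuHz / kSckHz);
}

// Time on the wire for one byte, in nanoseconds.
constexpr uint32_t byteTimeNs() {
  return kBitsPerByte * (1000000000UL / kSckHz);
}

// Bytes per second when every byte is followed by gapUs microseconds idle.
constexpr uint32_t bytesPerSecond(uint32_t gapUs) {
  return 1000000000UL / (byteTimeNs() + gapUs * 1000UL);
}

static_assert(cyclesPerByte() == 32, "one byte spans 32 CPU cycles");
static_assert(byteTimeNs() == 2000, "one byte takes 2 us on the wire");
```

Under these assumptions bytesPerSecond(0) is 500000 and bytesPerSecond(20) is 45454, so the delay costs roughly 11x. A byte also spans only 32 CPU cycles, which may or may not be enough for the slave's ISR; I haven't measured it.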

pjt:
Why is the SPI library bad?

I have no reason to think it is bad. But I had the impression from your Original Post that it is not doing what YOU want.

...R