I have a custom PCB, already fabricated, with traces in place to route signals between an Arduino (as a slave device) and a non-Arduino master. Because the design already exists and I hadn't thought of using SPI earlier, these traces are, by bad luck, all attached to pins other than digital pins 10, 11, 12, 13* of the Arduino (an Uno, in the form of an ATmega328P). (*Those are Arduino pin numbers; on the board the traces do not go to the DIL chip pins that correspond to those Arduino pins.) Also, I use 10, 11, 13 for their PWM function, so I couldn't spare them for SPI anyway.
But I know that SPI can be bit-banged. According to Gammon's site it can get down to about 52 µs per byte with an example library he wrote (the faster version at Gammon Forum : Electronics : Microprocessors : SPI - Serial Peripheral Interface - for Arduino). This library of his can do SPI on any four pins. I am not worried by this loss of speed compared to what the hardware SPI interface can do, though I would rather not get much slower than this. But I think his library is only appropriate for Arduinos as SPI masters rather than as slaves. Is there any way to use this kind of implementation to have an Arduino act as an SPI slave to another device, bit-banging SPI on 4 pins of my choice rather than requiring 10, 11, 12, 13 for the ATmega328P's hardware interface?
Forget libraries if you need top speed. Even with highly optimized code you probably won't reach more than about 100kHz SPI speed. Do you have full control over that other MCU, so that you can limit the SPI frequency to that value? And keep in mind that the ATmega won't do anything else while it communicates using a bit-banged SPI slave interface, as it will be busy handling all the interrupts and the bits going out and arriving.
Entering an interrupt handler needs about 3.5µs on a 16MHz AVR, the exit from that handler needs another 2µs, so you need 5.5µs without having handled any information yet. Getting the bit handling and the port settings/readings into another 6µs is challenging enough (so digitalWrite() is a no-go, for example), and even then you have reached only about 100kHz.
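As a rough check of the arithmetic above, here is a minimal sketch that turns the per-bit time budget into a maximum SPI clock. It assumes one interrupt per clock period; the function name max_spi_clock_khz is made up for illustration. With the figures quoted (3.5µs entry, 2µs exit, 6µs of work) it gives about 87kHz, which is in line with the "about 100kHz" estimate.

```c
#include <assert.h>

/* Highest SPI clock (in kHz) a bit-banged slave can follow, assuming every
   bit costs one interrupt with the given entry, exit and handling times.
   All times are in microseconds. Hypothetical helper, not a library call. */
double max_spi_clock_khz(double entry_us, double exit_us, double work_us)
{
    double per_bit_us = entry_us + exit_us + work_us;
    assert(per_bit_us > 0.0);
    /* 1000 us per ms: a period of N us corresponds to 1000/N kHz. */
    return 1000.0 / per_bit_us;
}
```

Calling max_spi_clock_khz(3.5, 2.0, 6.0) uses a period of 11.5µs; squeezing the handling down to 4.5µs would be needed to reach a full 100kHz.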
pylon:
Entering an interrupt handler needs about 3.5µs on a 16MHz AVR, the exit from that handler needs another 2µs, so you need 5.5µs without having handled any information yet.
I am particularly curious to know how this figure of 3.5µs comes about. In my opinion, the worst-case time to arrive at the ISR should be (approximately):
Time to finish the current instruction: 2 cycles
Time to push the return address onto stack : 2 cycles
Time to jump to the vector address: 2 cycles
Time to get from the redirection at the vector to the actual ISR: 2 cycles
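To put the two estimates side by side, the small sketch below converts between clock cycles and microseconds. The helper names are made up for illustration, and the 16MHz clock is the one mentioned in the thread. The four items listed above add up to 8 cycles, i.e. 0.5µs, whereas 3.5µs corresponds to 56 cycles.

```c
#include <assert.h>

/* Convert a cycle count to microseconds for a given CPU clock in Hz. */
double cycles_to_us(unsigned cycles, double f_cpu_hz)
{
    assert(f_cpu_hz > 0.0);
    return (double)cycles * 1e6 / f_cpu_hz;
}

/* Convert a time in microseconds to CPU cycles for a given clock in Hz. */
double us_to_cycles(double us, double f_cpu_hz)
{
    assert(f_cpu_hz > 0.0);
    return us * f_cpu_hz / 1e6;
}
```

For example, cycles_to_us(2 + 2 + 2 + 2, 16e6) covers the four steps in the list, and us_to_cycles(3.5, 16e6) shows how many cycles the quoted figure leaves to account for.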
If you disable other interrupts, the worst-case interrupt latency will be low. If you write the ISR in ASM and possibly use dedicated registers, you can make it very fast. Hundreds of kHz, surely. It all depends on how many resources you are willing to sacrifice, and on whether the communication may be blocking. If it must be non-blocking and other interrupts are active (e.g. millis), it will be slow, and it will be very difficult to find a worst-case frequency that will reliably work.
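For illustration only, here is a sketch of the per-edge work such a slave handler would have to do, whichever way it is triggered. It assumes SPI mode 0 and MSB-first transfers; all names (soft_spi_slave, soft_spi_select, soft_spi_clock_edge, simulate_master_transfer) are hypothetical and not taken from any existing library. Pin access is left out on purpose: on the ATmega328P the caller would read the pin levels from the port input register and copy the miso field to the output pin, and the last function is only a host-side stand-in for the master so the logic can be tried on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* State of a hypothetical bit-banged SPI slave (mode 0, MSB first). */
typedef struct {
    uint8_t rx_shift;   /* bits received so far in the current byte */
    uint8_t tx_shift;   /* reply byte; its MSB is the bit on MISO */
    uint8_t bit_count;  /* rising edges seen in the current byte */
    uint8_t rx_byte;    /* last complete byte received */
    bool    rx_ready;   /* true once rx_byte is valid */
    bool    miso;       /* level the slave should drive on MISO */
} soft_spi_slave;

/* Call when SS goes low: load the reply and present its MSB on MISO. */
void soft_spi_select(soft_spi_slave *s, uint8_t reply)
{
    assert(s != NULL);
    s->rx_shift = 0;
    s->bit_count = 0;
    s->rx_ready = false;
    s->tx_shift = reply;
    s->miso = (reply & 0x80u) != 0;
}

/* Call on every SCK edge while SS is low. */
void soft_spi_clock_edge(soft_spi_slave *s, bool sck_high, bool mosi)
{
    assert(s != NULL);
    if (sck_high) {
        /* Rising edge: sample MOSI into the receive shift register. */
        s->rx_shift = (uint8_t)((s->rx_shift << 1) | (mosi ? 1u : 0u));
        if (++s->bit_count == 8) {
            s->rx_byte = s->rx_shift;
            s->rx_ready = true;
            s->bit_count = 0;
        }
    } else {
        /* Falling edge: move on to the next outgoing bit. */
        s->tx_shift = (uint8_t)(s->tx_shift << 1);
        s->miso = (s->tx_shift & 0x80u) != 0;
    }
}

/* Host-side stand-in for the master, used only to exercise the logic. */
uint8_t simulate_master_transfer(soft_spi_slave *s, uint8_t out)
{
    uint8_t in = 0;
    for (int i = 7; i >= 0; --i) {
        bool mosi = ((out >> i) & 1u) != 0;
        /* The master samples MISO as it raises SCK. */
        in = (uint8_t)((in << 1) | (s->miso ? 1u : 0u));
        soft_spi_clock_edge(s, true, mosi);
        soft_spi_clock_edge(s, false, mosi);
    }
    return in;
}
```

Keeping the edge handling free of any pin or timing code means the same two functions could be called from a pin-change ISR or from a blocking polling loop; which of the two is fast enough is exactly the question being argued in this thread.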
OK, so getting away for a moment from the exact calculation of speed limits, can anyone suggest a software SPI library which can cope with being the slave on a bus? I pointed out Gammon's master example, and if what I end up with as a slave turns out to be somewhat slower, that's not too tragic. Can anyone suggest one which would work as a slave too?
Wrong, it's simply the time you need if you use the standard C interrupt interface (which saves many registers), as most people using the Arduino IDE do. I agree that you might get faster responses if you code your handler in assembler and optimize it to use only one or two registers, but that's not feasible for coding an SPI slave emulation library. I doubt that anyone would invest that much effort in such a solution instead of just switching hardware.