I have a function which bit-bangs an SPI-like protocol. It currently takes 8 us to complete one clocking cycle of "rise, high, drop, low, rise" on the CLK_pin_set pin; I have verified this with an oscilloscope.
I want to get it to run faster, but commenting out the delayMicroseconds() calls completely didn't change the speed.
uint8_t shiftBitMask=0b10000000;
for(uint8_t i=0; i<TotalSendLength;i++){//iterates over bytes in message
  shiftBitMask=0b10000000;
  for(uint8_t j=0; j<8; j++){//iterates over bits in byte, most significant bit first
    FastDWrite(MOSI_pin_set,(shiftBitMask & SendingArray[i])!=0 ); //write MOSI high (1) if relevant bit is high, write it low otherwise
    //delayMicroseconds(1);//this delay might be unnecessary
    FastDWrite(CLK_pin_set,1);//slave will read MOSI at this instant, and start setting MISO just after this instant
    //delayMicroseconds(us_delay);//adjust length for minimum to still be reliable, gives slave time to switch MISO to correct state to reply
    if(FastDRead(MISO_pin_set)){//if MISO reads high
      RecdInternal[i]=RecdInternal[i] | (shiftBitMask);
    }else{
      RecdInternal[i]=RecdInternal[i] & (~shiftBitMask);//ANDed with NOT of shiftBitMask
    }
    FastDWrite(CLK_pin_set,0);
    shiftBitMask=shiftBitMask>>1;//shift to the next less significant bit
  }
}
Functions used:
struct PinInfo {
  volatile uint8_t *reg;
  uint8_t mask;
};
const PinInfo unoPins[] = {
  {&PORTD, (uint8_t) 0b11111110},
  {&PORTD, (uint8_t) 0b11111101},
  {&PORTD, (uint8_t) 0b11111011},
  {&PORTD, (uint8_t) 0b11110111},
  {&PORTD, (uint8_t) 0b11101111},
  //and so on to save space in this forum post
};
void FastDWrite(uint8_t Pin,uint8_t Val){
  if (Val == 0) {
    *(unoPins[Pin].reg) &= unoPins[Pin].mask;
  } else {
    *(unoPins[Pin].reg) |= ~unoPins[Pin].mask;
  }
}
uint8_t FastDRead(uint8_t Pin){
  switch (Pin){
    case 0:
      return (PIND & _BV (0)) != 0; // digitalRead (0); nonzero when the pin is high
      break;//these breaks won't actually run due to the return
    case 1:
      return (PIND & _BV (1)) != 0; // digitalRead (1);
      break;
    //and so on to save space in this forum post
  }
  return 0;//pin number not handled above
}
I have already done tests which show that:
while(1){
  FastDWrite(4,1);
  FastDWrite(4,0);
}
can run at 2.6 MHz, so how might I get my more complex bit-banging up to a higher speed? For various reasons I don't want to use the hardware SPI pins for this: the protocol isn't quite SPI, and those pins are already in use for other things. I'm interested in understanding how to make this bit-banging faster, if at all possible.
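One direction that might be worth trying, shown here only as a sketch under assumptions: fix the pin numbers at compile time so each pin access has a constant mask, and keep the byte being sent and the byte being received in local variables instead of indexing the arrays on every bit. The bit positions (MOSI_BIT, MISO_BIT, CLK_BIT), the helper names (setHigh, setLow, isHigh) and transferFast are all hypothetical, and all three pins are assumed to sit on port D. Off-target, plain variables stand in for PORTD and PIND so the logic can be compiled and checked on a PC.

```cpp
#include <stdint.h>
#include <stddef.h>

#if defined(__AVR__)
#include <avr/io.h>   // real PORTD / PIND registers on the ATmega328P
#else
// Host-side stand-ins (assumption) so the sketch compiles off-target.
static volatile uint8_t PORTD = 0;
static volatile uint8_t PIND = 0;
#endif

// ASSUMPTION: hypothetical pin assignment, all three pins on port D.
constexpr uint8_t MOSI_BIT = 2;
constexpr uint8_t MISO_BIT = 3;
constexpr uint8_t CLK_BIT  = 4;

// The bit number is a template argument, so the mask is a compile-time
// constant and no table lookup or pointer dereference is needed at run time.
template <uint8_t Bit> static inline void setHigh() { PORTD |= (uint8_t)(1u << Bit); }
template <uint8_t Bit> static inline void setLow()  { PORTD &= (uint8_t)~(1u << Bit); }
template <uint8_t Bit> static inline bool isHigh()  { return (PIND & (1u << Bit)) != 0; }

// Clocks out len bytes from send (MSB first) and stores the reply in recd.
void transferFast(const uint8_t *send, uint8_t *recd, size_t len) {
    for (size_t i = 0; i < len; i++) {
        const uint8_t out = send[i];  // read the outgoing byte once
        uint8_t in = 0;               // accumulate the reply locally
        for (uint8_t mask = 0x80; mask != 0; mask >>= 1) {
            if (out & mask) setHigh<MOSI_BIT>(); else setLow<MOSI_BIT>();
            setHigh<CLK_BIT>();       // slave samples MOSI on this edge
            if (isHigh<MISO_BIT>()) in |= mask;
            setLow<CLK_BIT>();
        }
        recd[i] = in;                 // one store per byte
    }
}
```

Called as transferFast(SendingArray, RecdInternal, TotalSendLength), this performs the same sequence of pin operations per bit as the loop in the question. Whether it is actually faster on the hardware would still need to be confirmed on the oscilloscope, and any delay the slave needs between the clock edge and the MISO read would have to be added back.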
I had a go at looking at the disassembly of the .elf for my code, but it was tricky to find which part of the assembly corresponds to this section of code (only some functions and variables seem to get labelled there).
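A possible way to make the section easier to find, sketched under assumptions: move the bit loop into its own function and mark it with GCC's noinline attribute, with C linkage so the name is not mangled. The function should then be emitted separately and appear under its own label in the listing. The name clockOutByte, the fakePort variable and the bit positions are hypothetical; on the Uno the port would be a real register such as PORTD.

```cpp
#include <stdint.h>

// ASSUMPTION: stand-in for a real port register so this compiles anywhere.
static volatile uint8_t fakePort = 0;

// noinline asks GCC to keep this as a separate function rather than merging
// it into the caller; extern "C" keeps the symbol name unmangled.
extern "C" __attribute__((noinline))
uint8_t clockOutByte(uint8_t value) {
    uint8_t pulses = 0;
    for (uint8_t mask = 0x80; mask != 0; mask >>= 1) {
        // Data line on hypothetical bit 2, MSB first.
        if (value & mask) fakePort |= 0x04; else fakePort &= (uint8_t)~0x04;
        fakePort |= 0x10;            // clock high (hypothetical bit 4)
        fakePort &= (uint8_t)~0x10;  // clock low
        pulses++;
    }
    return pulses;  // number of clock pulses generated (8 per byte)
}
```

Searching the listing for clockOutByte should then lead to the loop body. Forcing the function out of line adds call overhead, so this is a debugging aid and not a speed-up in itself.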
But my fast digital read and write calls take only about 100 ns each (measured by seeing how much each added call slowed down that while(1) example), so I struggle to see why each pass of the j=0; j<8 for loop should take a whole 80 times that 100 ns (in other terms, 130 or so ATmega328P clock cycles).
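The figures above can be turned into CPU cycles with a little arithmetic. This is a sketch that assumes a 16 MHz core clock, as on a stock Uno; the names CPU_HZ and nsToCycles are hypothetical.

```cpp
#include <stdint.h>

// ASSUMPTION: 16 MHz core clock, as on a stock Arduino Uno.
constexpr uint32_t CPU_HZ = 16000000UL;

// Convert a duration in nanoseconds to CPU cycles, rounded to nearest.
constexpr uint32_t nsToCycles(uint32_t ns) {
    return (uint32_t)(((uint64_t)ns * CPU_HZ + 500000000ULL) / 1000000000ULL);
}

// Measured: 8 us per clock period in the full loop.
constexpr uint32_t CYCLES_PER_BIT = nsToCycles(8000);
// Measured: roughly 100 ns per read or write call in the test loop.
constexpr uint32_t CYCLES_PER_CALL = nsToCycles(100);
// Measured: 2.6 MHz toggle rate in the while(1) test (integer division).
constexpr uint32_t CYCLES_PER_TOGGLE_LOOP = CPU_HZ / 2600000UL;

static_assert(CYCLES_PER_BIT == 128, "8 us is 128 cycles at 16 MHz");
static_assert(CYCLES_PER_CALL == 2, "100 ns rounds to 2 cycles at 16 MHz");
static_assert(CYCLES_PER_TOGGLE_LOOP == 6, "2.6 MHz is about 6 cycles per loop");
```

Six cycles per pass of the test loop would be consistent with two 2-cycle single-bit port instructions plus a 2-cycle jump, which suggests the calls with literal pin numbers were reduced to almost nothing by the compiler. If the pin numbers in the real loop are not compile-time constants, the same reduction may not be happening there, but only the disassembly can confirm that.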
Thank you