Attached is a software SPI library I wrote (just for the practice) and would be grateful if anybody who tries it out post the details of if/how it performed and one what platform, what device they were talking to, details like bit order, clock speed & data mode.
Being software based mean you should be able to use any/most pins for CLK, MOSI & MISO so when the class is defined you specify the pins required.
It has all the same functions as the hardware SPI library so should work as a drop in replacement. The only limitation is the clock divider. Being software based I only support clock speeds of SPI_CLOCK_DIV32, SPI_CLOCK_DIV64 & SPI_CLOCK_DIV128
I have currently only tested on a 23LC1024 SRAM chip (worked fine)
MSBFIRST
SPI_CLOCK_DIV32
SPI_MODE0
EDIT: v0.3 I have done some improvements to speed up the code and it's attached here. I have also attached logic analyzer that shows transfer with the 23LC1024 SRAM chip.
think you should explicitly set output lines to LOW I think
// To keep the port transfer stable and fast the bits are always rotated out in the same direction
// so if the bit order needs changing then do it now. *** Needs a faster routine
if(_bitOrder == MSBFIRST) { // Need to swap bit order?
tmpByte = 0;
for(i = 0; i < 8; i++){
tmpByte |= ((_data >> i) & 1) << (7 - i);
}
_data = tmpByte;
}
// Call relevent routine to transfer data. Written as separate routines to keep speed up
// as a single routine would slow transfer down with extra checking.
if that is so Why not write a dedicated routines for MSBFIRST and LSBFIRST instead of reversing the byte?
That said, writing such a lib is not trivial and it is easier to react on it than to write one oneself, so stay challenged to improve it !
(in other words: well done
think you should explicitly set output lines to LOW I think
Have done.
// To keep the port transfer stable and fast the bits are always rotated out in the same direction
// so if the bit order needs changing then do it now. *** Needs a faster routine
if(_bitOrder == MSBFIRST) { // Need to swap bit order?
tmpByte = 0;
for(i = 0; i < 8; i++){
tmpByte |= ((_data >> i) & 1) << (7 - i);
}
_data = tmpByte;
}
There is a whole discussion here - http://forum.arduino.cc/index.php?topic=203098.0 - with way faster code
(you should not be afraid of a "bit magic"
personal favorite
byte reverseX(byte x){
x = (((x&0xaa)>>1) + ((x&0x55)<<1));
x = (((x&0xcc)>>2) + ((x& 0x33)<<2));
return((x>>4) + (x<<4));
}//reverse
to understand take paper and pencil and be the processor...
I had a remmed out version of the 16 byte lookup table in the code but had not gotten round to testing it (pulled from www)
robtillaart:
// Call relevent routine to transfer data. Written as separate routines to keep speed up
// as a single routine would slow transfer down with extra checking.
if that is so Why not write a dedicated routines for MSBFIRST and LSBFIRST instead of reversing the byte?
Yes it does peeve me a bit that MSBFIRST takes a double hit (before and after transfer) but it's the age old trade-off between speed and compact code. It would be nice to know how many bytes highCPHA & lowCPHA functions consume to see if the byte overhead of writing 2 extra routines is worth it.
I had also considered writing data out one way and reading back the opposite so one bit swap was needed but for both MSB & LSB (balancing the current time difference).
robtillaart:
// Call relevent routine to transfer data. Written as separate routines to keep speed up
// as a single routine would slow transfer down with extra checking.
if that is so Why not write a dedicated routines for MSBFIRST and LSBFIRST instead of reversing the byte?
Yes it does peeve me a bit that MSBFIRST takes a double hit (before and after transfer) but it's the age old trade-off between speed and compact code. It would be nice to know how many bytes highCPHA & lowCPHA functions consume to see if the byte overhead of writing 2 extra routines is worth it.
I had also considered writing data out one way and reading back the opposite so one bit swap was needed but for both MSB & LSB (balancing the current time difference).
From a user point of view I prefer the library to be equally fast for MSBFIRST and LSBFIRST. There is no (known to me) argument which of the two should be the fastest.
If you really want squeeze performance one could make 2 libraries one MSB shiftout and one LSB shiftout. I expect 99% of the sketches will only use one of the two.
robtillaart:
From a user point of view I prefer the library to be equally fast for MSBFIRST and LSBFIRST. There is no (known to me) argument which of the two should be the fastest.
I have decided to pull the library for further work. I will address the MSBFIRST time hit by rotating in in the other direction or adding separate functions.
IIRC that gives the compiler some means to optimize it (in line) before linking,
You should keep Serial.print() statements like this out of the library as they can mess up the serial communication in a program.
Or make the conditional by wrapping a #ifdef around it.
#ifdef DEBUG
Serial.println("SPI_MODE0");
#endif
in the top of the file you state #define DEBUG 1 to enable debug
Catch runtime errors in the switches!
void SPIsoft::setClockDivider(uint8_t clockDivider){
switch (clockDivider){
case SPI_CLOCK_DIV32:
_clkDelay = 1;
break;
case SPI_CLOCK_DIV64:
_clkDelay = 2;
break;
case SPI_CLOCK_DIV128:
_clkDelay = 3;
break;
default: // all wrong values are mapped upon SPI_CLOCK_DIV32
_clkDelay = 1;
break;
}
}
if (*_misoPort & _misoMask){ // Read the miso port bit
tmpByte |= j;
}
else{
tmpByte &= ~j; // isn't ~j just 0 ?
}
this if then else can be done simpler I guess as only the if part does something
if (*_misoPort & _misoMask){ // Read the miso port bit
tmpByte |= j;
}
robtillaart:
You should keep Serial.print() statements like this out of the library as they can mess up the serial communication in a program.
Or make the conditional by wrapping a #ifdef around it.
#ifdef DEBUG
Serial.println("SPI_MODE0"); #endif
Mistake on my behalf, forgot to remove them after testing. :blush:
Catch runtime errors in the switches!
void SPIsoft::setClockDivider(uint8_t clockDivider){
switch (clockDivider){
case SPI_CLOCK_DIV32:
_clkDelay = 1;
break;
case SPI_CLOCK_DIV64:
_clkDelay = 2;
break;
case SPI_CLOCK_DIV128:
_clkDelay = 3;
break;
default: // all wrong values are mapped upon SPI_CLOCK_DIV32
_clkDelay = 1;
break;
}
}
When the library is defined it sets a default of SPI_CLOCK_DIV64. As passing an unknown value should have no effect I prefer to not alter a current speed setting. IMO this could encourage sloppy programming if passing an unknown value gives predictable changes, better to not change anything or return an error.
if (*_misoPort & _misoMask){ // Read the miso port bit
tmpByte |= j;
}
else{
tmpByte &= ~j; // isn't ~j just 0 ?
}
this if then else can be done simpler I guess as only the if part does something
But then you alter the timings between 0 & 1 bits.
you could precalculate the _inverted_mosiMask = ~_mosiMask; making the expressions time more equally (to be tested)
I have not bothered trying to look at machine code level but assumed the ~ would equate to a single clock cycle instruction.
2 cents
Thanks for your insights and suggestions.
When the library is defined it sets a default of SPI_CLOCK_DIV64. As passing an unknown value should have no effect I prefer to not alter a current speed setting. IMO this could encourage sloppy programming if passing an unknown value gives predictable changes, better to not change anything or return an error.
OK, good choice too,
you can make that explicit in the code with an empty default and the above comment why.
Hi, i've made an small modification on transmit method. I wanted to control how many bits it send on transmit.
An example, i want to send 0x06 or only "110" bits trought SPI. Using MSB (default method) I can do this:
byte a = 0x06;
a = a<<5;
SPI.transmit(a,3);
And it send's to me only the three bits i want.
Not sure if its ok, but worked for me. If you want to revise my code, im sharing it here.