Yet another software SPI library - Beta testers needed

Attached is a software SPI library I wrote (just for the practice) and would be grateful if anybody who tries it out post the details of if/how it performed and one what platform, what device they were talking to, details like bit order, clock speed & data mode.

Being software based mean you should be able to use any/most pins for CLK, MOSI & MISO so when the class is defined you specify the pins required.

#include "SPIsoft.h"

#define cs 13
#define miso 6
#define mosi 5
#define clk 7
SPIsoft SPI(miso, mosi, clk);

It has all the same functions as the hardware SPI library so should work as a drop in replacement. The only limitation is the clock divider. Being software based I only support clock speeds of SPI_CLOCK_DIV32, SPI_CLOCK_DIV64 & SPI_CLOCK_DIV128

I have currently only tested on a 23LC1024 SRAM chip (worked fine)
MSBFIRST
SPI_CLOCK_DIV32
SPI_MODE0

EDIT: v0.3 I have done some improvements to speed up the code and it's attached here. I have also attached logic analyzer that shows transfer with the 23LC1024 SRAM chip.

SPIsoft_beta_v0.3a.zip (3.17 KB)

Attached is a sample sketch using the library.

#include "SPIsoft.h"
//#include <SPI.h>

// Tested uning a 23LC1024 sRam chip in a 8 pin DIP package
// Test connections are...
// Chip UNO NAME
//  1   13  SS    (Hardware SS Pin (10 or 53) needs to remain output no matter what other pin you may for SS)
//  2   6   MISO
//  3   NC  
//  4   GND Vss
//  5   5   MOSI
//  6   7   SCK
//  7   +5V ~HOLD
//  8   +5V Vcc

#define cs 13
#define miso 6
#define mosi 5
#define clk 7

#define WRITE 2
#define READ  3

SPIsoft SPI(miso, mosi, clk);

const uint32_t ramSize = 0x1FFFF;           // 128K x 8 bit

char buffer1[] = {"ABCDE"};
char buffer2[sizeof(buffer1)];

void setup(){
  Serial.begin(115200);
  SPI.begin();
  delay(2000);
  Serial.println("Setup Stage 1");
  digitalWrite(cs,HIGH);
  pinMode(cs,OUTPUT);
  //SPI.setBitOrder(LSBFIRST);
  Serial.println("\r\nFill Memory with Buffer.");
  digitalWrite(cs,LOW);
  sRam_writeBuffer(0, buffer1, sizeof(buffer1) - 1);
  digitalWrite(cs,HIGH);
}

void loop(){
  delay(2000);
  Serial.println("\r\nRead Buffer.");
  digitalWrite(cs,LOW);
  sRam_readBuffer(0,buffer2,sizeof(buffer2) - 1);
  digitalWrite(cs,HIGH);
  for(byte i = 0; i < sizeof(buffer2) - 1; i++){
    Serial.print((byte)buffer1[i],BIN);
    Serial.print("\t");
    Serial.println((byte)buffer2[i],BIN);
  }
  Serial.println(buffer1);
  Serial.println(buffer2);
}

void sRam_readBuffer(uint32_t address, char *buffer, uint32_t length) {
  setAddressMode(address, READ);
  byte x;
  for (uint32_t i = 0; i < length; i++) buffer[i] = SPI.transfer(0xAA);
}

void sRam_writeBuffer(uint32_t address, char *buffer, uint32_t length) {
  setAddressMode(address, WRITE);
  for (uint32_t i = 0; i < length; i++) SPI.transfer(buffer[i]);
}

void setAddressMode(uint32_t address, byte mode) {
  SPI.transfer(mode);
  SPI.transfer((byte)(address >> 16));
  SPI.transfer((byte)(address >> 8));
  SPI.transfer((byte)address);
}
void SPIsoft::end(){
  // Do nothing
}

think you should explicitly set output lines to LOW I think

  // To keep the port transfer stable and fast the bits are always rotated out in the same direction
  // so if the bit order needs changing then do it now. *** Needs a faster routine
  if(_bitOrder == MSBFIRST) {                   // Need to swap bit order?
    tmpByte = 0;
    for(i = 0; i < 8; i++){
      tmpByte |= ((_data >> i) & 1) << (7 - i);
    }
    _data = tmpByte;
  }

There is a whole discussion here - byte mirror code don't work - Programming Questions - Arduino Forum - with way faster code
(you should not be afraid of a "bit magic"

personal favorite

byte reverseX(byte x){
  x = (((x&0xaa)>>1) + ((x&0x55)<<1));
  x = (((x&0xcc)>>2) + ((x& 0x33)<<2));
  return((x>>4) + (x<<4));
}//reverse

to understand take paper and pencil and be the processor...

// Call relevent routine to transfer data. Written as separate routines to keep speed up
// as a single routine would slow transfer down with extra checking.

if that is so Why not write a dedicated routines for MSBFIRST and LSBFIRST instead of reversing the byte?

That said, writing such a lib is not trivial and it is easier to react on it than to write one oneself, so stay challenged to improve it !
(in other words: well done :wink:

robtillaart:

void SPIsoft::end(){

// Do nothing
}



think you should explicitly set output lines to LOW I think
Have done.



// To keep the port transfer stable and fast the bits are always rotated out in the same direction
  // so if the bit order needs changing then do it now. *** Needs a faster routine
  if(_bitOrder == MSBFIRST) {                   // Need to swap bit order?
    tmpByte = 0;
    for(i = 0; i < 8; i++){
      tmpByte |= ((_data >> i) & 1) << (7 - i);
    }
    _data = tmpByte;
  }



There is a whole discussion here - http://forum.arduino.cc/index.php?topic=203098.0 - with way faster code 
(you should not be afraid of a "bit magic"

personal favorite


byte reverseX(byte x){
  x = (((x&0xaa)>>1) + ((x&0x55)<<1));
  x = (((x&0xcc)>>2) + ((x& 0x33)<<2));
  return((x>>4) + (x<<4));
}//reverse



to understand take paper and pencil and be the processor...
I had a remmed out version of the 16 byte lookup table in the code but had not gotten round to testing it (pulled from www)

robtillaart:
// Call relevent routine to transfer data. Written as separate routines to keep speed up
// as a single routine would slow transfer down with extra checking.

if that is so Why not write a dedicated routines for MSBFIRST and LSBFIRST instead of reversing the byte?

Yes it does peeve me a bit that MSBFIRST takes a double hit (before and after transfer) but it's the age old trade-off between speed and compact code. It would be nice to know how many bytes highCPHA & lowCPHA functions consume to see if the byte overhead of writing 2 extra routines is worth it.
I had also considered writing data out one way and reading back the opposite so one bit swap was needed but for both MSB & LSB (balancing the current time difference).

Riva:

robtillaart:
// Call relevent routine to transfer data. Written as separate routines to keep speed up
// as a single routine would slow transfer down with extra checking.

if that is so Why not write a dedicated routines for MSBFIRST and LSBFIRST instead of reversing the byte?

Yes it does peeve me a bit that MSBFIRST takes a double hit (before and after transfer) but it's the age old trade-off between speed and compact code. It would be nice to know how many bytes highCPHA & lowCPHA functions consume to see if the byte overhead of writing 2 extra routines is worth it.
I had also considered writing data out one way and reading back the opposite so one bit swap was needed but for both MSB & LSB (balancing the current time difference).

From a user point of view I prefer the library to be equally fast for MSBFIRST and LSBFIRST. There is no (known to me) argument which of the two should be the fastest.

If you really want squeeze performance one could make 2 libraries one MSB shiftout and one LSB shiftout. I expect 99% of the sketches will only use one of the two.

robtillaart:
From a user point of view I prefer the library to be equally fast for MSBFIRST and LSBFIRST. There is no (known to me) argument which of the two should be the fastest.

I have decided to pull the library for further work. I will address the MSBFIRST time hit by rotating in in the other direction or adding separate functions.

The library is back as v0.3 (see first post in this thread.)

some remarks for next version :wink:

Small functions like

  void SPIsoft::setBitOrder(uint8_t bitOrder){
    _bitOrder = bitOrder;
  }

can also be encoded in the .h file

IIRC that gives the compiler some means to optimize it (in line) before linking,


You should keep Serial.print() statements like this out of the library as they can mess up the serial communication in a program.
Or make the conditional by wrapping a #ifdef around it.

#ifdef DEBUG
     Serial.println("SPI_MODE0");
#endif

in the top of the file you state #define DEBUG 1 to enable debug


Catch runtime errors in the switches!

  void SPIsoft::setClockDivider(uint8_t clockDivider){
    switch (clockDivider){
      case  SPI_CLOCK_DIV32:
      _clkDelay = 1;
      break;
      case  SPI_CLOCK_DIV64:
      _clkDelay = 2;
      break;
      case  SPI_CLOCK_DIV128:
      _clkDelay = 3;
      break;
      default:  // all wrong values are mapped upon SPI_CLOCK_DIV32
     _clkDelay = 1;
     break;      
    }
  }

      if (*_misoPort & _misoMask){                // Read the miso port bit
        tmpByte |= j;
      }
      else{
        tmpByte &= ~j;  // isn't ~j just 0 ?
      }

this if then else can be done simpler I guess as only the if part does something

      if (*_misoPort & _misoMask){                // Read the miso port bit
        tmpByte |= j;
      }

      if (data & i){
        *_mosiPort |= _mosiMask;                  // Set mosi pin
      }
      else{
        *_mosiPort &= ~_mosiMask;                 // Clear mosi pin
      }

you could precalculate the _inverted_mosiMask = ~_mosiMask; making the expressions time more equally (to be tested)

      if (data & i){
        *_mosiPort |= _mosiMask;                  // Set mosi pin
      }
      else{
        *_mosiPort &= _inverted_mosiMask;                 // Clear mosi pin
      }

2 cents

robtillaart:
You should keep Serial.print() statements like this out of the library as they can mess up the serial communication in a program.
Or make the conditional by wrapping a #ifdef around it.

#ifdef DEBUG

Serial.println("SPI_MODE0");
#endif



Mistake on my behalf, forgot to remove them after testing. :blush:

Catch runtime errors in the switches!



void SPIsoft::setClockDivider(uint8_t clockDivider){
    switch (clockDivider){
      case  SPI_CLOCK_DIV32:
      _clkDelay = 1;
      break;
      case  SPI_CLOCK_DIV64:
      _clkDelay = 2;
      break;
      case  SPI_CLOCK_DIV128:
      _clkDelay = 3;
      break;
      default:  // all wrong values are mapped upon SPI_CLOCK_DIV32
     _clkDelay = 1;
     break;     
    }
  }



When the library is defined it sets a default of SPI_CLOCK_DIV64. As passing an unknown value should have no effect I prefer to not alter a current speed setting. IMO this could encourage sloppy programming if passing an unknown value gives predictable changes, better to not change anything or return an error.



if (*_misoPort & _misoMask){                // Read the miso port bit
        tmpByte |= j;
      }
      else{
        tmpByte &= ~j;  // isn't ~j just 0 ?
      }



this if then else can be done simpler I guess as only the if part does something
But then you alter the timings between 0 & 1 bits.



if (data & i){
        *_mosiPort |= _mosiMask;                  // Set mosi pin
      }
      else{
        *_mosiPort &= ~_mosiMask;                 // Clear mosi pin
      }



you could precalculate the _inverted_mosiMask = ~_mosiMask; making the expressions time more equally (to be tested)
I have not bothered trying to look at machine code level but assumed the ~ would equate to a single clock cycle instruction.

2 cents
Thanks for your insights and suggestions.

When the library is defined it sets a default of SPI_CLOCK_DIV64. As passing an unknown value should have no effect I prefer to not alter a current speed setting. IMO this could encourage sloppy programming if passing an unknown value gives predictable changes, better to not change anything or return an error.

OK, good choice too,
you can make that explicit in the code with an empty default and the above comment why.

Hi, i've made an small modification on transmit method. I wanted to control how many bits it send on transmit.
An example, i want to send 0x06 or only "110" bits trought SPI. Using MSB (default method) I can do this:

byte a = 0x06;
a = a<<5;
SPI.transmit(a,3);

And it send's to me only the three bits i want.

Not sure if its ok, but worked for me. If you want to revise my code, im sharing it here.

SPIsoft_beta_T.zip (3.38 KB)