fast shiftOut!

Hi All

Looking around, I did not find a faster replacement for the shiftOut procedure (http://www.arduino.cc/en/Reference/ShiftOut) that is also compatible with the existing procedure. So I wrote a new procedure:

  • Same interface as shiftOut
  • 3-4x faster
  • Restrictions: Does not turn off PWM and does not modify port directions

These restrictions mean that you have to configure the clock and data pins correctly before calling the replacement.

digitalWrite(clockPin, LOW);
digitalWrite(dataPin, LOW);

should be fine for this.

I did some performance tests with the SpaceTrash game from the DOGM128 Library (Arduino Uno, DOGS102 Display):
DOUBLE_MEM, shiftOut: 9 FPS (frames per second)
DOUBLE_MEM, shiftOutFast: 31 FPS
DOUBLE_MEM, Hardware SPI: 42 FPS

Here is the code:

#include "wiring_private.h"
#include "pins_arduino.h"

void shiftOutFast(uint8_t dataPin, uint8_t clockPin, uint8_t bitOrder, uint8_t val)
{
  uint8_t cnt;
  uint8_t bitData;
  uint8_t bitClock;
  volatile uint8_t *outData;
  volatile uint8_t *outClock;
  uint8_t dataLow, dataHigh, clockLow, clockHigh;

  outData = portOutputRegister(digitalPinToPort(dataPin));
  outClock = portOutputRegister(digitalPinToPort(clockPin));
  bitData = digitalPinToBitMask(dataPin);
  bitClock = digitalPinToBitMask(clockPin);

  dataHigh = *outData;
  dataHigh |=  bitData;
  bitData ^= 0x0ff;
  dataLow = dataHigh;
  dataLow &= bitData;
	
  clockHigh = *outClock;
  clockHigh |=  bitClock;
  bitClock ^= 0x0ff;
  clockLow = clockHigh;
  clockLow &= bitClock;

  cnt = 8;
  if (bitOrder == LSBFIRST)
  {
    do
    {
      if ( val & 1 )
        *outData = dataHigh;
      else
        *outData = dataLow;
      
      *outClock = clockHigh;
      val >>= 1;
      *outClock = clockLow;
      cnt--;
    } while( cnt != 0 );
  }
  else
  {
    do
    {
      if ( val & 128 )
        *outData = dataHigh;
      else
        *outData = dataLow;
      
      *outClock = clockHigh;
      val <<= 1;
      *outClock = clockLow;
      cnt--;
    } while( cnt != 0 );
  } 
}

Oliver

Oliver,

I don't see where you allow for the clock and data to be in the same port (clock byte setting overwrites data byte), or where you deal with an interrupt handler trying to change another bit in either port during your shift.
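To make the same-port problem concrete, here is a small host-side model (plain C++, not Arduino code). A single byte stands in for one AVR output register that carries both pins; the function names and the `received` return value are made up for this illustration. The first function writes whole-port bytes computed before the loop, as in the code above; the second sets and clears only the bits it owns.

```cpp
#include <stdint.h>

// Hypothetical model: `port` plays the role of one output register
// (for example PORTB) shared by the data and clock pins.

// Whole-port bytes are computed once and written back unchanged.
// Returns the data bits as seen on each rising clock edge, MSB first.
uint8_t shiftSnapshot(uint8_t &port, uint8_t dataMask, uint8_t clockMask, uint8_t val)
{
  uint8_t dataHigh  = static_cast<uint8_t>(port | dataMask);
  uint8_t dataLow   = static_cast<uint8_t>(dataHigh & ~dataMask);
  uint8_t clockHigh = static_cast<uint8_t>(port | clockMask);
  uint8_t clockLow  = static_cast<uint8_t>(clockHigh & ~clockMask);
  uint8_t received = 0;

  for (uint8_t i = 0; i < 8; i++) {
    port = (val & 0x80) ? dataHigh : dataLow;
    port = clockHigh;  // also rewrites the data bit with its stale value
    received = static_cast<uint8_t>((received << 1) | ((port & dataMask) ? 1 : 0));
    port = clockLow;
    val = static_cast<uint8_t>(val << 1);
  }
  return received;
}

// Read-modify-write: only the data bit and the clock bit are touched.
uint8_t shiftReadModifyWrite(uint8_t &port, uint8_t dataMask, uint8_t clockMask, uint8_t val)
{
  uint8_t received = 0;

  for (uint8_t i = 0; i < 8; i++) {
    if (val & 0x80)
      port |= dataMask;
    else
      port &= static_cast<uint8_t>(~dataMask);
    port |= clockMask;
    received = static_cast<uint8_t>((received << 1) | ((port & dataMask) ? 1 : 0));
    port &= static_cast<uint8_t>(~clockMask);
    val = static_cast<uint8_t>(val << 1);
  }
  return received;
}
```

With data on bit 0 and clock on bit 1 of the same port, the snapshot version clocks out whatever the data bit was before the call, regardless of `val`, while the read-modify-write version clocks out `val` and leaves the other port bits alone. The same stale-byte effect would also undo a change made to the port by an interrupt handler during the shift.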

Here is a version that I implemented for the GLCD port for the pcd8544 (https://github.com/MikeSmith/glcd_pcd8544). It deals with both of those issues, and gave a roughly 5x frame rate improvement. Obviously for LSB first you would replace the loop starting value and shift direction.

...setup...
	// Configure display interface pins
	pinMode(GLCD_SCLK, OUTPUT);
	_sclk.reg = portOutputRegister(digitalPinToPort(GLCD_SCLK));
	_sclk.bit = digitalPinToBitMask(GLCD_SCLK);

	pinMode(GLCD_SDIN, OUTPUT);
	_sdin.reg = portOutputRegister(digitalPinToPort(GLCD_SDIN));
	_sdin.bit = digitalPinToBitMask(GLCD_SDIN);
...


void
glcd_Device::_shiftOut(uint8_t data)
{
	register uint8_t mask;
	register uint8_t sdin_high = _sdin.bit;
	register uint8_t sdin_low = ~sdin_high;
	register uint8_t sclk_high = _sclk.bit;
	register uint8_t sclk_low = ~sclk_high;

	// must be volatile as they may alias
	register volatile uint8_t *sdin = _sdin.reg;
	register volatile uint8_t *sclk = _sclk.reg;

	// mask interrupts while we shift the byte out
	uint8_t sreg = SREG;
	cli();

	for (mask = 0x80; mask; mask >>=1) {
		if (data & mask) {
			*sdin |= sdin_high;
		} else {
			*sdin &= sdin_low;
		}
		*sclk |= sclk_high;
		*sclk &= sclk_low;
	}

	// restore interrupt state
	SREG = sreg;
}
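As a sketch of the LSB-first remark above: only the mask's starting value and shift direction change. The host-side model below (plain C++, not Arduino code, with hypothetical function names) records the order in which the bits would be placed on the data line instead of driving a pin.

```cpp
#include <stdint.h>
#include <string>

// MSB first: mask starts at 0x80 and moves right, as in _shiftOut above.
std::string bitOrderMsbFirst(uint8_t data)
{
  std::string sent;
  for (uint8_t mask = 0x80; mask; mask >>= 1)
    sent += (data & mask) ? '1' : '0';
  return sent;
}

// LSB first: mask starts at 0x01 and moves left. The loop ends when the
// set bit is shifted out of the 8-bit mask and the mask becomes 0.
std::string bitOrderLsbFirst(uint8_t data)
{
  std::string sent;
  for (uint8_t mask = 0x01; mask; mask <<= 1)
    sent += (data & mask) ? '1' : '0';
  return sent;
}
```

For example, 0x0D (binary 00001101) goes out as 00001101 with the first loop and as 10110000 with the second.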

= Mike

Hi Mike

Good point. You are right. Here is a corrected version (however FPS dropped to 30):

void shiftOutFast(uint8_t dataPin, uint8_t clockPin, uint8_t bitOrder, uint8_t val)
{
  uint8_t cnt;
  uint8_t bitData, bitNotData;
  uint8_t bitClock, bitNotClock;
  volatile uint8_t *outData;
  volatile uint8_t *outClock;
 
  outData = portOutputRegister(digitalPinToPort(dataPin));
  outClock = portOutputRegister(digitalPinToPort(clockPin));
  bitData = digitalPinToBitMask(dataPin);
  bitClock = digitalPinToBitMask(clockPin);

  bitNotClock = bitClock;
  bitNotClock ^= 0x0ff;

  bitNotData = bitData;
  bitNotData ^= 0x0ff;

  cnt = 8;
  if (bitOrder == LSBFIRST)
  {
    do
    {
      if ( val & 1 )
        *outData |= bitData;
      else
        *outData &= bitNotData;
      
      *outClock |= bitClock;
      *outClock &= bitNotClock;
      val >>= 1;
      cnt--;
    } while( cnt != 0 );
  }
  else
  {
    do
    {
      if ( val & 128 )
        *outData |= bitData;
      else
        *outData &= bitNotData;
      
      *outClock |= bitClock;
      *outClock &= bitNotClock;
      val <<= 1;
      cnt--;
    } while( cnt != 0 );
  } 
}

Thanks,
Oliver

Hi Oliver,

Thanks very much for sharing your code.

Would you consider specifying a license for sharing the code?

As indicated in the Arduino FAQ, the Arduino C/C++ microcontroller libraries are under the LGPL. It would be great if you would consider sharing your shiftOutFast code under the LGPL or a compatible license (e.g., BSD or MIT).

Thanks again!

Cheers,
Christian