Fast digital I/O and software SPI with C++ templates

20-Jul-2012

This library is no longer supported. It has been replaced by the DigitalPin library.

See this post http://arduino.cc/forum/index.php/topic,86931.0.html.

The code looks good. In fact, non-type template parameters are exactly how I have implemented my library. I'm working on the HAL now to allow it to generate port mappings for any combination of pins (optimising related pins).

How many bytes of overhead do the pin_map_t arrays produce?

Since the pin_map_t array is const and the pin number is a compile-time constant, no flash is used to store the array. The compiler optimizes it away, along with almost all of the other statements. That's one advantage of using a template.

For pin 8 on a 328, this C++ statement

    sck.write(HIGH);

results in the two-byte instruction

    sbi  0x05, 0

This C++ statement

    sck.mode(OUTPUT);

results in this two-byte instruction

    sbi  0x04, 0

The second example compiles to 498 bytes using Arduino 0022. This empty sketch

void setup() {}
void loop() {}

compiles to 450 bytes, so that is only 48 additional bytes.
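To make the point about the const array concrete, here is a minimal desktop sketch of the idea. It is not the library's actual code: `PinMapEntry`, `kPinMap`, `FastPinSketch` and the `fake...` registers are hypothetical names, and only pins 8 to 13 of a 328 (PB0 to PB5) are modeled. Because the pin number is a template argument, every table lookup uses a constant index that the compiler can resolve at build time.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the AVR DDRB and PORTB registers, so the
// sketch can run on a desktop instead of a real 328.
static volatile uint8_t fakeDDRB = 0;
static volatile uint8_t fakePORTB = 0;

// One entry per pin: the direction register, the output register, the bit.
struct PinMapEntry {
  volatile uint8_t* ddr;
  volatile uint8_t* port;
  uint8_t bit;
};

// Arduino pins 8..13 are PB0..PB5 on a 328; only those are modeled here.
static const PinMapEntry kPinMap[] = {
  {&fakeDDRB, &fakePORTB, 0},  // pin 8
  {&fakeDDRB, &fakePORTB, 1},  // pin 9
  {&fakeDDRB, &fakePORTB, 2},  // pin 10
  {&fakeDDRB, &fakePORTB, 3},  // pin 11
  {&fakeDDRB, &fakePORTB, 4},  // pin 12
  {&fakeDDRB, &fakePORTB, 5},  // pin 13
};

template <uint8_t Pin>
class FastPinSketch {
  static_assert(Pin >= 8 && Pin <= 13, "only pins 8..13 are modeled");

 public:
  // Pin is a constant, so kPinMap[Pin - 8] is a constant-index lookup.
  static void mode(bool output) { setBit(kPinMap[Pin - 8].ddr, output); }
  static void write(bool high) { setBit(kPinMap[Pin - 8].port, high); }

 private:
  // Set or clear this pin's bit in the given register.
  static void setBit(volatile uint8_t* reg, bool value) {
    const uint8_t mask = static_cast<uint8_t>(1u << kPinMap[Pin - 8].bit);
    *reg = value ? static_cast<uint8_t>(*reg | mask)
                 : static_cast<uint8_t>(*reg & ~mask);
  }
};
```

A call such as `FastPinSketch<8>::write(true);` touches only bit 0 of the fake port register. With the real registers in place of the stand-ins, this is the kind of statement the compiler can reduce to the single `sbi` shown above.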

Hi fat16lib,

I have used your code and I have had great success.

I was initially trying to re-create the fastDigitalWrite macros as a template library so I could add in extra features. But I hit a wall: no matter how much code I can write, I really don't know that much about the Arduino's internals. That's where your code comes in.

My original test code is an ST7920 128x64 library running in 8-bit parallel, clearing and setting the entire screen in a loop.

3666 bytes using digital write (shiftOut) to control shift registers: 0.7 ~ 0.9 frames per second.
3450 bytes using digital write through my parallel interface: 1.2 ~ 1.4 frames per second.

I have only implemented the fast specialisations into my shift out class.

3666 down to 3152 bytes once your fast I/O was implemented.
2562 bytes once I fully specialised my shift library: slightly more than 6 frames per second.

I imagine the parallel interface would be faster still.

Your code has been designed in such a way that I can implement my ideas straight into it.
I'm in the middle of adding 2-, 3-, 4-, 5-, 6-, 7- and 8-pin specialisations that take more than one pin (up to 8) and write the pins that share a port together, minimising the total number of write operations needed. This would be another huge increase in speed for my library, as it can group like pins.
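As an illustration of the multi-pin idea only (the poster's actual specialisations are not shown in this thread), the sketch below ORs the bit masks of several pins at compile time and applies them with one read-modify-write of the port. `PortMask`, `writePins` and `fakePORTB` are hypothetical names, and all pins are assumed to sit on the same port (Arduino pins 8 to 13, PB0 to PB5).

```cpp
#include <cstdint>

// Hypothetical stand-in for the AVR PORTB register so this runs on a desktop.
static volatile uint8_t fakePORTB = 0;

// Bit mask of a single pin; only Arduino pins 8..13 (PB0..PB5) are modeled.
constexpr uint8_t pinMask(uint8_t pin) {
  return static_cast<uint8_t>(1u << (pin - 8));
}

// Compile-time OR of the masks of every pin in the pack.
template <uint8_t... Pins>
struct PortMask;

template <>
struct PortMask<> {
  static constexpr uint8_t value = 0;
};

template <uint8_t First, uint8_t... Rest>
struct PortMask<First, Rest...> {
  static_assert(First >= 8 && First <= 13, "only pins 8..13 are modeled");
  static constexpr uint8_t value = pinMask(First) | PortMask<Rest...>::value;
};

// Set or clear all listed pins with a single read-modify-write of the port.
template <uint8_t... Pins>
void writePins(bool high) {
  const uint8_t mask = PortMask<Pins...>::value;
  fakePORTB = high ? static_cast<uint8_t>(fakePORTB | mask)
                   : static_cast<uint8_t>(fakePORTB & ~mask);
}
```

For example, `writePins<8, 10, 13>(true);` sets three bits in one port access instead of three. One trade-off to keep in mind: `sbi` and `cbi` change a single bit, so a multi-bit update like this is a read-modify-write sequence rather than one instruction.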

Your code at the bottom of my hardware access system has already given me a 5 to 7 times increase in speed. And I hope to squeeze even more out of it.

Great Stuff!

Wow, this looks really good.

Is it possible to define arrays of pins, so that you can access them with myPin[i].write(HIGH)?
Thanks,
David

Thanks so much for this! I've used it successfully in my experiments with a mega1284p (thanks for including the pin maps for this processor!!!), since digitalWriteFast doesn't work there.
I get stuck when I try to use FastDigitalIO inside a library (in my case LiquidCrystal).
Does anyone know how to put instances inside an existing library?
I've tried this:

void LiquidCrystal::init(uint8_t fourbitmode, unsigned char rs, unsigned char rw, unsigned char enable,
			 unsigned char d0, unsigned char d1, unsigned char d2, unsigned char d3,
			 unsigned char d4, unsigned char d5, unsigned char d6, unsigned char d7)
{
  _rs_pin = rs;
  _rw_pin = rw;
  _enable_pin = enable;
  
  _data_pins[0] = d0;
  _data_pins[1] = d1;
  _data_pins[2] = d2;
  _data_pins[3] = d3; 
  _data_pins[4] = d4;
  _data_pins[5] = d5;
  _data_pins[6] = d6;
  _data_pins[7] = d7; 

  FastDigitalIO<_rs_pin> rsPin;
  
  FastDigitalIO<_enable_pin> enablePin;
....

but I got
error: 'LiquidCrystal::_rs_pin' cannot appear in a constant-expression

OK, just to understand how to use this library inside another library (which is why I've used digitalWriteFast a lot in the past), I've created a simple and silly library called blink (it just sets a pin HIGH or LOW).

blink.cpp

#if ARDUINO >= 100
#include "Arduino.h"
#else
#include "WProgram.h"
#endif
#include "blink.h"

#include <../FastDigitalIO/FastDigitalIO.h>

blink::blink(unsigned char pinA){
	FastDigitalIO<pinA> blinkPin;
	setup(pinA);
}

void blink::setup(unsigned char pinA) {
  _thePin = pinA;
  blinkPin.mode(OUTPUT);
}

void blink::blinking(uint8_t d) {
	if (d == 1){
		blinkPin.write(HIGH);
	} else {
		blinkPin.write(LOW);
	}
}

blink.h

#ifndef BLINK_H
#define BLINK_H
#include <inttypes.h>

class blink {
  public:
	blink(unsigned char pinA);
    void setup(unsigned char pinA);
    void blinking(uint8_t d);
    
  private:
    unsigned char _thePin;   
};

#endif

blink.ino

#include <blink.h>

blink led(1);


void setup() {
  // put your setup code here, to run once:

}

void loop() {
  // put your main code here, to run repeatedly: 
  led.blinking(1);
  delay(1000);
  led.blinking(0);
  delay(1000);
}

Compiling this produces the following errors:

In file included from C:\M\arduino-1.0.1\libraries\blink\blink.cpp:8:
C:\M\arduino-1.0.1\libraries\blink/../FastDigitalIO/FastDigitalIO.h: In member function 'void FastDigitalIO::mode(uint8_t)':
C:\M\arduino-1.0.1\libraries\blink/../FastDigitalIO/FastDigitalIO.h:37: error: 'digitalPinMap' was not declared in this scope
....
C:\M\arduino-1.0.1\libraries\blink\blink.cpp:11: error: 'pinA' cannot appear in a constant-expression
C:\M\arduino-1.0.1\libraries\blink\blink.cpp:11: error: template argument 1 is invalid
C:\M\arduino-1.0.1\libraries\blink\blink.cpp:11: error: invalid type in declaration before ';' token
C:\M\arduino-1.0.1\libraries\blink\blink.cpp: In member function 'void blink::setup(unsigned char)':
C:\M\arduino-1.0.1\libraries\blink\blink.cpp:17: error: 'blinkPin' was not declared in this scope
...

Any help?

Firstly:
FastDigitalIO.h:37: error: 'digitalPinMap'

You need to look at line 37 in FastDigitalIO.h.

Secondly, something seems wrong about this line to me. I have never seen it before. Maybe it is right. :S
FastDigitalIO<pinA> blinkPin;

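The problem is that template arguments must be compile-time constants: either a literal number, a #define, or something declared const. The only way around it is to make the pin you're feeding into the FastDigitalIO template a template argument of the class itself. The pin object also has to be a class member rather than a local variable in the constructor, and the member functions of a class template have to be defined in the header.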

blink.h:

#ifndef BLINK_H
#define BLINK_H
#if ARDUINO >= 100
#include "Arduino.h"
#else
#include "WProgram.h"
#endif
#include <inttypes.h>
#include <../FastDigitalIO/FastDigitalIO.h>

// The pin is a template parameter, so it is a compile-time constant.
template <unsigned char pinA>
class blink {
  public:
    blink() { setup(); }
    void setup() { blinkPin.mode(OUTPUT); }
    void blinking(uint8_t d) {
      if (d == 1) {
        blinkPin.write(HIGH);
      } else {
        blinkPin.write(LOW);
      }
    }

  private:
    // A member, not a local in the constructor, so every method can use it.
    FastDigitalIO<pinA> blinkPin;
};

#endif

blink.cpp

Not needed any more: everything is defined in blink.h. The sketch then declares the object as blink<1> led; instead of blink led(1);

... or something like that

FastDigitalIO has evolved into DigitalPin http://arduino.cc/forum/index.php/topic,86931.0.html.

You must use constant pin numbers with any of the fast digital I/O libraries. That's the only way the compiler can optimize to the fast sbi and cbi instructions. This means you may need to use templates to use a fast I/O library in another library.
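For the LiquidCrystal case asked about earlier, the same approach might look like the sketch below: the library class takes its pins as template parameters instead of constructor arguments. This is only an illustration that runs on a desktop. `MockFastPin`, `pinLevel` and `LcdControlSketch` are hypothetical names, and the mock records pin levels in an array instead of touching hardware.

```cpp
#include <cstdint>

// Hypothetical stand-in for a fast pin class: it stores the level in an
// array instead of writing to a port register.
static bool pinLevel[20] = {};

template <uint8_t Pin>
struct MockFastPin {
  static_assert(Pin < 20, "pin out of range for this sketch");
  static void write(bool level) { pinLevel[Pin] = level; }
};

// A library class whose pins are template parameters, so every pin number
// is a compile-time constant inside the class.
template <uint8_t RsPin, uint8_t EnablePin>
class LcdControlSketch {
 public:
  void selectData() { rs_.write(true); }      // RS high: data register
  void selectCommand() { rs_.write(false); }  // RS low: command register
  void setEnable(bool level) { enable_.write(level); }

 private:
  MockFastPin<RsPin> rs_;
  MockFastPin<EnablePin> enable_;
};
```

A sketch would then declare `LcdControlSketch<12, 11> lcd;` rather than passing 12 and 11 to a constructor. The cost of this design is that each pin combination is a separate class, and the member functions have to live in the header.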

I will include an example software SPI library based on DigitalPin in the next version of DigitalPin.

I am using DigitalPin and SoftSPI in the 20120719 version of SdFat (Google Code Archive). The files for DigitalPin and SoftSPI are in the utility folder of SdFat. I will soon post these files as a standalone library.

Thanks Fat16Lib! I will wait for your examples! Thanks for sharing this!!!
Winezed, thanks a lot for helping me. I have never used templates, and it's a great opportunity to study your solution!

Hi, thanks for this fine library! I wrote a DCF77 decoder library (for the atomic time broadcast by the DCF77 radio station; see this thread), where I want to quickly read and discard small pulses. I think your library may help perform peak rejection more efficiently.

  • I was wondering, though, is the read out speed limited by processor cycles, or by the hardware? Have you been able to measure that? I can imagine as a test: connect a pulse generator to pin 1, a scope to pin 2. Read the pin 1 value and write it to 2.
  • As I understand, the standard Arduino implementation needs many more instructions. Are those extra instructions in any way useful? Or, in other words, does your library also have drawbacks?

pYro_65:
I was initially trying to re-create the fastDigitalWrite macros as a template library so I could add in extra features. But I hit a wall: no matter how much code I can write, I really don't know that much about the Arduino's internals. That's where your code comes in.

This is the only thread that I found mentioning templated digitalWriteFast variants, so I'm posting here.

There is already a very cool templated fast-pins header here: avrpins.h in felis/USB_Host_Shield_2.0 on GitHub.

However, by default its syntax is different/object oriented, so a blink example would look like this:

#include <avrpins.h>

void setup() {
  P13::SetDirWrite();
}

void loop() {
  P13::Set();
  delay(1000);
  P13::Clear();
  delay(1000);
}

With a bit of preprocessor help and partial template class specialization, it's easy to translate digitalWrite syntax to this syntax without creating any runtime overhead. Call this header digitalWriteFast3.h:

#include <avrpins.h>

// Primary template: any nonzero N (OUTPUT, HIGH) selects output mode / set.
template <class T, int N>
struct fast3_wrap_impl
{
    static void setmode() {
        T::SetDirWrite();
    }
    static void write() {
        T::Set();
    }
};

// Partial specialization for N == 0 (INPUT, LOW).
template <class T>
struct fast3_wrap_impl<T, 0>
{
    static void setmode() {
        // Not exercised by the blink example; check the name against avrpins.h.
        T::SetDirInput();
    }
    static void write() {
        T::Clear();
    }
};

template <class T, int N>
void fast3_wrap_setmode()
{
    fast3_wrap_impl<T, N>::setmode();
}

template <class T, int N>
void fast3_wrap_write() {
    fast3_wrap_impl<T, N>::write();
}

template <class T>
int fast3_wrap_read() {
    return T::IsSet();
}

// No trailing semicolons, so the macros behave like ordinary function calls
// and digitalReadFast3 can be used inside an expression.
#define pinModeFast3(a, b) fast3_wrap_setmode< P##a , b >()
#define digitalWriteFast3(a, b) fast3_wrap_write< P##a , b >()
#define digitalReadFast3(a) fast3_wrap_read< P##a >()

After that, rewriting the loop example:

#include <digitalWriteFast3.h>

void setup() {
  pinModeFast3(13, OUTPUT);
}

void loop() {
  digitalWriteFast3( 13, HIGH );
  delay(1000);
  digitalWriteFast3( 13 , LOW ) ;
  delay(1000);
}

Feel free to modify and improve. I have not tested this beyond the simple blink, but I trust that as long as avrpins.h does things correctly, it should be fairly safe to use.

I have verified that these digitalWriteFast3 calls resolve into single cbi/sbi instructions.

Maybe the author of avrpins.h could add this template stuff in. As a user, you would then be able to use either syntax without thinking about it and still get the same result.

The library in this thread didn't work well for use in other bit-bang libraries so I have rewritten it. One of the main reasons for fast digital I/O is bit-bang for protocols like SPI.

Please see DigitalPinBeta20120804.zip (Google Code Archive).

Also see this topic http://arduino.cc/forum/index.php/topic,117356.0.html.

I suspect the templates kert describes above will have the same problems.

I still include template classes for simple use in sketches but I base these classes on static functions with constant arguments.

These static functions are easier to use in other bit-bang libraries like a software SPI library which is included as an example in the new library.
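A rough desktop sketch of that arrangement follows. It is a plain function taking pin numbers, called from a bit-bang SPI routine with constant arguments. The real DigitalPin function names are not shown in this thread, so `fastWriteSketch`, `softSpiSend`, `pinLevel` and `receivedByte` are hypothetical. The mock write function also plays the SPI slave, sampling MOSI on each rising clock edge so the result can be checked.

```cpp
#include <cstdint>

// Hypothetical stand-ins: pin levels live in an array, and a fake slave
// shifts in MOSI on every rising edge of SCK (SPI mode 0).
static const uint8_t kMosiPin = 11;
static const uint8_t kSckPin = 13;
static bool pinLevel[20] = {};
static uint8_t receivedByte = 0;

// Plain function with the pin as an argument; callers pass constants.
inline void fastWriteSketch(uint8_t pin, bool level) {
  if (pin == kSckPin && level && !pinLevel[kSckPin]) {
    // Rising clock edge: the fake slave samples the MOSI line.
    receivedByte = static_cast<uint8_t>((receivedByte << 1) |
                                        (pinLevel[kMosiPin] ? 1 : 0));
  }
  pinLevel[pin] = level;
}

// Bit-bang one byte, most significant bit first. The pins are template
// parameters, so every call to fastWriteSketch sees a constant pin number.
template <uint8_t MosiPin, uint8_t SckPin>
void softSpiSend(uint8_t data) {
  static_assert(MosiPin < 20 && SckPin < 20, "pin out of range for this sketch");
  for (uint8_t bit = 0; bit < 8; ++bit) {
    fastWriteSketch(MosiPin, (data & 0x80) != 0);  // present the next bit
    fastWriteSketch(SckPin, true);                 // slave samples here
    fastWriteSketch(SckPin, false);
    data = static_cast<uint8_t>(data << 1);
  }
}
```

Calling `softSpiSend<11, 13>(0xA5);` clocks out eight bits and leaves the clock low. On real hardware the write function would need to be inlined with its constant argument for each call to become a single instruction.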

Thanks, you should add your libraries here: Arduino Playground - LibraryList
There were so many different fast i/o threads and half-solution libraries around .. gaah.

Anyway, I just realized my posted code is crap, as you can't use variables as pin indexes, only numeric constants :)
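The reason is the token pasting in the macros: `P##a` glues the literal text of the argument onto `P`, so it only works when that text is a number. A small desktop sketch of the effect is below, with a hypothetical `P13` stand-in in place of the real avrpins.h type and a hypothetical `PIN_TYPE` macro.

```cpp
// Hypothetical stand-in for one of the pin types from avrpins.h; it stores
// the level in a variable instead of driving a port.
static bool p13Level = false;

struct P13 {
  static void Set() { p13Level = true; }
  static void Clear() { p13Level = false; }
};

// Token pasting builds the type name from the literal text of the argument.
#define PIN_TYPE(n) P##n

inline void demoLiteralPin() {
  PIN_TYPE(13)::Set();  // expands to P13::Set()

  // With a variable, the macro pastes the variable's name, not its value:
  //   int led = 13;
  //   PIN_TYPE(led)::Set();  // expands to Pled::Set(), which does not exist
}
```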