I would like to share what I think is a very slick programming technique that
both the Arduino development team and Arduino users may well be
interested in: achieving very high performance pin I/O. This issue is not
new, and searches of the Forum have turned up quite a few hits
on the subject, so I hope and expect this post will draw some interest.
My take on Arduino I/O is that it is unnecessarily slow for the following
reasons:
- The Arduino environment uses some on-chip memory to store look-up
  tables that map an Arduino pin to a port output register, port
  direction register, pin input register, and a port bit number.
  These on-chip lookup tables incur runtime overhead because the Arduino
  functions pinMode(), digitalRead(), and digitalWrite() access them
  during every function call.
- Although AVR processors have ideal instructions for high performance
  manipulation of I/O PORT registers (i.e. the sbi/cbi instructions), the
  compiler seems to never issue them for these functions, presumably
  because the port and bit are only known at run time.
- The actual code for the functions pinMode(), digitalRead(), and
  digitalWrite() appears (to me) to be bloated with more functionality
  than is necessary. Specifically, interrupts are turned off/on and the
  system status register is saved/restored during every call, which
  slows things down even more.
- On-chip lookup tables consume chip resources (flash program memory in
  the stock core, since the tables are declared PROGMEM) that take away
  from what a user can use. This is a resource overhead issue rather
  than a performance issue.
The programming technique that I am experimenting with has the following
characteristics (compare them with the above points):
- No on-chip lookup tables are required, which means there is no
  run time overhead.
- The technique issues atomic sbi/cbi instructions that don't require
  turning off/on interrupts and saving/restoring the system status
  register.
- The technique issues atomic sbi/cbi instructions that replace
  entire function calls, so this overhead is also eliminated (as far as
  I know, all I/O PORT registers are within the addressing range of the
  cbi/sbi instructions).
- No on-chip lookup tables are used, so more on-chip resources are
  available to users.
- This technique can co-exist with the current Arduino environment but
  it is also capable of eliminating the need to call pinMode(),
  digitalRead(), and digitalWrite().
So, what is this technique? I'm glad you asked.
The technique is based on a clever use of MACROS to store and access the pin
mapping table. Macros can be defined to hold a lookup table (rows and columns
of information) and other macros are defined to extract and use specific
pieces of that information. Since macros are expanded during the preprocessing
stage of the compiler there is no need to store this lookup table on chip.
The best way of explaining how all this works is to illustrate these macros
and provide an example sketch that uses them. This example is specific to the
ATmega328 but is easily generalized to other AVR architectures. Consider the
following blinking LED sketch (feel free to copy and paste it to the Arduino
IDE and compile/run it):
// Blinking LED sketch demonstration

#define _LB5 0x05, 0x04, 0x03, PORTB, DDRB, PINB, 5  // table entry
#define _D13 _LB5  // Arduino Pin 13 is an alias to bit 5 of PORTB

#define __PIN_O( o, d, i, O, D, I, B ) asm( "sbi " #d ", " #B )  // set DDR
#define __PIN_T( o, d, i, O, D, I, B ) asm( "sbi " #i ", " #B )  // toggle PORT

#define _PIN_CONFIG_OUT( P ) __PIN_O( P )  // equiv to pinMode(P,OUTPUT)
#define _PIN_TOGGLE( P )     __PIN_T( P )  // no equivalent in Arduino

#define LED _D13  // user alias to Arduino pin 13 called 'LED'

void setup() {
    _PIN_CONFIG_OUT( LED );  // equivalent to pinMode(13,OUTPUT)
}

void loop() {
    for (;;) {               // toggle loop
        _PIN_TOGGLE( LED );  // flip state of LED
        delay( 500 );
    }
}
'#define _LB5' is one row of the lookup table. For this example I show one
row, but the 'A328_PINS.h' file that I am attaching has table entries for all
I/O port pins (again, this is for the ATmega328 only). This macro expands to a
7-element comma-separated argument list consisting of the address of the port B
output register (0x05), the address of the port B data direction register
(0x04), the address of the port B input register (0x03), the assembler names
for these registers (PORTB, DDRB, and PINB), and the bit position (5) within
this port.
'#define _D13' defines an alias name for table entry '_LB5'. Table entries do
not need to be referenced directly. As will be seen, the macros will work just
fine with as many aliased names to table entries as needed.
'#define __PIN_O( o, d, i, O, D, I, B )' is the low level macro that extracts
the needed information from a table row and generates the machine instruction
that sets the data direction register appropriately. Note that even though 7
arguments are provided, the macro references only 2 of them (d and B).
'#define __PIN_T( o, d, i, O, D, I, B )' is the low level macro that extracts
the needed information from a table row and generates the machine instruction
that toggles/flips the state of the output port. It works because, on the
ATmega328, writing a 1 to a bit of the PIN input register toggles the
corresponding bit of the PORT output register. Note that this functionality
is not accessible from the Arduino environment even though the architecture
supports it.
'#define _PIN_CONFIG_OUT( P )' is a user level wrapper macro around the low
level __PIN_O macro. It is there to control the order of expansion: the
argument P is fully expanded into the 7 elements of its table row before
__PIN_O is invoked. Calling __PIN_O( LED ) directly would hand it a single
argument instead of 7.
'#define _PIN_TOGGLE( P )' is a user level wrapper macro around the low
level __PIN_T macro and it controls the expansion order of __PIN_T in the
same way.
'#define LED' is a user level declaration that creates an alias to Arduino pin
13 and calls it 'LED'.
We will now trace the expansion of the macro call _PIN_CONFIG_OUT(LED) in the
function setup(). The expansion follows the rules of the compiler
preprocessor:
- _PIN_CONFIG_OUT(LED) expands to:
- __PIN_O(LED) but LED is itself a macro so it is expanded to:
- __PIN_O(_D13) but _D13 is itself a macro so it is expanded to:
- __PIN_O(_LB5) but _LB5 is itself a macro so it is expanded to:
- __PIN_O(0x05, 0x04, 0x03, PORTB, DDRB, PINB, 5) which expands to:
- asm( "sbi " #0x04 ", " #5 ) which expands to:
- asm( "sbi " "0x04" ", " "5" ) which finally expands to:
- asm( "sbi 0x04, 5" ) which generates one atomic instruction to set
  the data direction register appropriately
The expansion of the macro _PIN_TOGGLE(LED) in loop() follows the same process:
- _PIN_TOGGLE(LED) expands to:
  .
  .
  .
- asm( "sbi 0x03, 5" ) which generates one atomic instruction to flip
  the bit of the output port register connected to the LED
That's it!! These well crafted macros result in the following:
- The look-up tables take zero chip resources.
- There is zero run time overhead.
- In most cases, the macros expand to a single atomic machine instruction.
- The result is maximum I/O speed!
- The user level macros like _PIN_CONFIG_OUT() are simple to read and can
  appear to a user as no different than a procedure call.
- The macros can co-exist with the Arduino environment functions pinMode(),
  digitalRead(), and digitalWrite() but could replace them if the Arduino
  development team chooses.
- Custom lookup macro tables for specific AVR architectures are simple to
  generate.
- The macro table technique lends itself to other uses as well -- in some
  cases, it can replace a lot of conditional directives like the following
  with macro table lookups:
#ifdef something
.
.
#else
.
.
#endif
For comparison purposes, if the 'delay(500)' call is commented out in the
example sketch then the _PIN_TOGGLE loop will compile to just 2 machine
instructions that take a total of 4 cycles to execute. The loop has to execute
twice for each full symmetrical output square wave cycle (8 cycles), so at a
16 MHz clock rate the LED pin will output a 2 MHz square wave, which is about
20X faster than what can be achieved with digitalWrite() calls!!
Feel free to copy the attached A328_PINS.h file to your library directory.
It will be interesting to discover other uses for this macro technique.
I welcome your questions and/or comments.
ENJOY!!
A328_PINS.h (4 KB)