digitalWriteFast, digitalReadFast, pinModeFast etc

I will work on Paul's ideas and try to have an improved version posted by 3/10. This project wouldn't even have begun without his insight. As I said before he did all the hard work on this.

I did update the version on the server tonight. I did not implement all of Paul's suggestions. I did surround all of the pin-to-port/timer type functions with a single #if ... #endif that I hope will be adequate to the needs of 3rd party boards. I similarly surrounded each of the macros pinModeFast, digitalWriteFast, digitalReadFast, pinModeFast2, digitalWriteFast2 and digitalReadFast2 with its own #if ... #endif.

Issue #146 relates to interference between interrupt routines that contain digitalWrite instructions (e.g. the Servo and Tone libraries) and non-interrupt code. It is clearly a real issue and will be a difficult one to sort out for most people who encounter it. On the other hand, the point of these library macros is to improve efficiency. I can't know whether interrupts will be in use, or whether digitalWrite will be used inside interrupt routines, in the code that uses these macros.
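The interference described above comes from a port update that takes a load, a modify and a store: an interrupt that fires between the load and the store has its own change overwritten. Below is a minimal host-side simulation of that lost update, not AVR code. `fakePort`, `fakeIsr`, `setBit1` and `runScenario` are hypothetical names invented for the illustration.

```cpp
#include <cstdint>

// Hypothetical stand-in for an 8-bit I/O port register (not a real AVR register).
static uint8_t fakePort = 0;

// Hypothetical interrupt handler that sets bit 0, as a Servo or Tone ISR might.
static void fakeIsr() { fakePort |= 0x01; }

// Main-line code setting bit 1 as load / modify / store. When interruptMidway
// is true, the "ISR" runs between the load and the store.
static void setBit1(bool interruptMidway) {
    uint8_t tmp = fakePort;            // load
    if (interruptMidway) fakeIsr();    // interrupt lands here
    tmp |= 0x02;                       // modify
    fakePort = tmp;                    // store: overwrites what the ISR wrote
}

// Runs one scenario from a cleared port and returns the final port value.
// In the non-interrupted case the ISR runs after the write has completed.
uint8_t runScenario(bool interruptMidway) {
    fakePort = 0;
    setBit1(interruptMidway);
    if (!interruptMidway) fakeIsr();
    return fakePort;
}
```

Here `runScenario(false)` ends with both bits set (0x03), while `runScenario(true)` ends with 0x02 because the ISR's bit is lost. On real hardware a single sbi or cbi instruction cannot be split this way, but a multi-instruction sequence can. A common remedy is to disable interrupts briefly around the sequence, at the cost of a few cycles.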

I gave real thought to trying to implement something that would be interrupt safe for one of the 2 versions. Perhaps as I gain proficiency I will return to this idea.

I bought a Teensy++ from Paul. I haven't even gotten so far as to solder in headers, but I looked at some of the software that comes with the board to tie it to the Arduino IDE. Paul doesn't advertise it, but he has not only implemented THESE ideas as part of the digitalWrite etc. commands there, he has ALSO greatly sped up the way digitalWrite etc. work when the pin number is not known at compile time. It looks to me like he's also implemented the code necessary to protect against the interrupt interference of issue 146; there is some conditionally included code that implies something like that. That would mean that his boards would work better with the Servo and Tone libraries than the branded Arduinos. The code itself combines complex macros and assembly language, so it is very tough reading.

All of which makes me think that the Teensy++ may really be the board to go with. It's not just a tiny Arduino clone; the software is actually enhanced in very important ways. I'm still trying to figure out how to tie something this small to a useful prototyping shield, though.

@jrraines: Would you mind publishing your code somewhere else? The link you provided isn't working.

I suggest removing the check "__builtin_constant_p(V)" from digitalWriteFast...

#define digitalWriteFast(P, V) \
  if (__builtin_constant_p(P) && __builtin_constant_p(V)) { \
    if (digitalPinToTimer(P)) \
      bitClear(*digitalPinToTimer(P), digitalPinToTimerBit(P)); \
    bitWrite(*digitalPinToPortReg(P), digitalPinToBit(P), (V)); \
  } else { \
    digitalWrite((P), (V)); \
  }

Relatively speaking, the call to digitalWrite generates one machine instruction (a relative call). After removing the built-in check on V and passing a variable into the macro...

For non-PWM pins, five machine instructions are generated.

For PWM pins, eight machine instructions are generated.

In my opinion, this is a small price to pay for the huge increase in speed. If someone wants to save program space, they can call digitalWrite directly.
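A rough host-side sketch of the suggestion, for anyone who wants to try the idea without a board: only the pin is tested with `__builtin_constant_p`, so the value may be a variable. The simulated ports and the `sim...` names are assumptions made for the illustration, and the PWM timer handling of the real macro is left out.

```cpp
#include <cstdint>

// Simulated hardware: two 8-bit "ports" of 8 pins each (hypothetical layout).
static uint8_t simPort[2] = {0, 0};
static int slowCalls = 0;  // counts how often the slow path was taken

// Stand-in for the ordinary run-time digitalWrite().
void simDigitalWrite(uint8_t pin, uint8_t val) {
    ++slowCalls;
    if (pin >= 16) return;
    if (val) simPort[pin / 8] |= (uint8_t)(1u << (pin % 8));
    else     simPort[pin / 8] &= (uint8_t)~(1u << (pin % 8));
}

// Only the pin has to be a compile-time constant; the value may be a variable.
#define simDigitalWriteFast(P, V)                                    \
    do {                                                             \
        if (__builtin_constant_p(P)) {                               \
            if (V) simPort[(P) / 8] |= (uint8_t)(1u << ((P) % 8));   \
            else   simPort[(P) / 8] &= (uint8_t)~(1u << ((P) % 8));  \
        } else {                                                     \
            simDigitalWrite((P), (V));                               \
        }                                                            \
    } while (0)
```

With a literal pin such as `simDigitalWriteFast(13, level)`, the fast branch is used even when `level` is a variable. With a pin the compiler cannot prove constant, the call falls back to `simDigitalWrite`.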

http://code.google.com/p/digitalwritefast/downloads/list

Mellis and Stoffregen felt that it was safer to include the PWM stuff. I thought the digitalWriteFast2, pinModeFast2 version was preferable. You can cut out a little overhead by using pinModeFast and digitalWriteFast2.

Thanks!

Mellis and Stoffregen felt that it was safer to include the PWM stuff

Um ... I didn't suggest removing the PWM stuff. My suggestion was to apply the fast version when V is a constant or a variable.

I thought the digitalWriteFast2, pinModeFast2 version was preferable. You can cut out a little overhead by using pinModeFast and digitalWriteFast2.

Looks good to me. I like it.

I agree, it's a good trade-off. In fact, that's exactly how I implemented it in the version that's inside Teensyduino.

The slow compiled code often requires almost that many instructions just to marshal the inputs into the required registers when either isn't a compile-time constant. If the surrounding code is complex but doesn't call other functions, which can often be the case with code that uses digitalWrite, the savings in register allocation are also a big win.

But that's not "exactly" how I wrote it. This "bitWrite" macro coding style really isn't my first choice. Inside Teensyduino, I implemented this using a giant chain of if-else checks, where each performs the desired write. While it's a lot longer, a LOT longer, it has the advantage of doing nothing if an illegal pin number is used. With this macro version, if an illegal pin number is used, it will write to the last pin. On all Arduino boards, that's an analog input, likely never configured as an output, so the effect would be activating the pullup resistor on that analog in... which could be pretty confusing if the poor, misguided user didn't realize some other unrelated code mistakenly wrote an illegal pin number. For that reason, I've never been very happy with this style.
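The if-else style described above can be sketched roughly as follows. This is not the Teensyduino code, only a host-side illustration of the shape. The two simulated registers, the five-pin mapping and the names `SIM_WRITE` and `simWriteChain` are assumptions.

```cpp
#include <cstdint>

// Hypothetical stand-ins for two port registers; a real core would use PORTD, PORTB, etc.
uint8_t simPortD = 0;
uint8_t simPortB = 0;

// Set or clear one bit of a simulated register.
#define SIM_WRITE(reg, bit, high)                          \
    do {                                                   \
        if (high) (reg) |= (uint8_t)(1u << (bit));         \
        else      (reg) &= (uint8_t)~(1u << (bit));        \
    } while (0)

// One line per pin, laid out like a table: pin -> register, bit.
// A pin number that matches no line falls off the end and writes nothing.
inline void simWriteChain(uint8_t pin, bool high) {
    if      (pin == 0) SIM_WRITE(simPortD, 0, high);
    else if (pin == 1) SIM_WRITE(simPortD, 1, high);
    else if (pin == 2) SIM_WRITE(simPortD, 2, high);
    else if (pin == 8) SIM_WRITE(simPortB, 0, high);
    else if (pin == 9) SIM_WRITE(simPortB, 1, high);
    // no final else: an illegal pin number does nothing
}
```

When the pin is a compile-time constant and the function is inlined, an optimizing compiler can usually discard every branch except the matching one, so the length of the chain need not show up in the generated code.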

Back in November 2009, David said he intended to include this into the official Arduino core, and he preferred this macro style (I had posted a short example of the if-else way), so I wrote it this way for contribution to the official Arduino version. Sadly, with 0018 gone by, and issue #140 not tagged for 0019 or 1.0, it seems unlikely these optimizations will ever become part of the official digitalWrite. Had I known that then, I wouldn't have bothered to write these for the official Arduino boards. I'm certainly not going to put any more effort into it now, other than pointing out this lack of checking for illegal pin number input.

Then again, erroneously writing to the last pin is a lot better than what the slow compiled digitalWrite does. It will happily use the too-large pin number as an index to an array, reading whatever happens to be in memory after those tables, and use that data as a pointer and bitmask to write to someplace in memory! Not good.
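For comparison, here is a minimal sketch of what a range check in front of a table lookup could look like. It is a host-side illustration only. The four-pin tables and the name `simCheckedWrite` are invented for the example and are not the Arduino core's actual tables or API.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical pin tables for a 4-pin "board"; real cores keep similar tables in flash.
static uint8_t simPorts[2] = {0, 0};
static const uint8_t pinToPortIndex[] = {0, 0, 1, 1};
static const uint8_t pinToBitMask[]   = {0x01, 0x02, 0x01, 0x02};
static const size_t  kNumPins = sizeof(pinToPortIndex) / sizeof(pinToPortIndex[0]);

// Returns false and touches nothing when the pin is out of range,
// instead of indexing past the end of the tables.
bool simCheckedWrite(uint8_t pin, bool high) {
    if (pin >= kNumPins) return false;
    uint8_t &port = simPorts[pinToPortIndex[pin]];
    if (high) port |= pinToBitMask[pin];
    else      port &= (uint8_t)~pinToBitMask[pin];
    return true;
}
```

The cost is one extra compare and branch on every call, which is the kind of trade-off being argued over in this thread.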

Then again, there are issues 146 and 170, which also seem unlikely to ever get fixed.

It makes me sad to see so little care and concern for code quality. I think maybe it's time to turn off my notifications for this thread....

I realized that one form of documentation I should have provided from the start was examples of what the disassembly code looks like. These examples were compiled for my Mega, so pin/port correspondence may not be the same as other boards, but it will give a better idea of what is generated:

// these are non-pwm pins whose port is below 0x100
pinModeFast(51,INPUT);
    5c4e:      22 98             cbi      0x04, 2      ; 4
digitalWriteFast(51,HIGH); 
    5c50:      2a 9a             sbi      0x05, 2      ; 5
pinModeFast(50,OUTPUT);
    5c52:      23 9a             sbi      0x04, 3      ; 4
digitalWriteFast(50,LOW);
    5c54:      2b 98             cbi      0x05, 3      ; 5

//these pins are on a port above 0x100
pinModeFast2(48,INPUT);
    5bb2:      80 91 0a 01       lds      r24, 0x010A
    5bb6:      8d 7f             andi      r24, 0xFD      ; 253
    5bb8:      80 93 0a 01       sts      0x010A, r24
digitalWriteFast2(48,LOW);
    5bbc:      80 91 0b 01       lds      r24, 0x010B
    5bc0:      8d 7f             andi      r24, 0xFD      ; 253
    5bc2:      80 93 0b 01       sts      0x010B, r24
pinModeFast2(49,OUTPUT);
    5bc6:      80 91 0a 01       lds      r24, 0x010A
    5bca:      81 60             ori      r24, 0x01      ; 1
    5bcc:      80 93 0a 01       sts      0x010A, r24
digitalWriteFast2(49,HIGH);
    5bd0:      80 91 0b 01       lds      r24, 0x010B
    5bd4:      81 60             ori      r24, 0x01      ; 1
    5bd6:      80 93 0b 01       sts      0x010B, r24


//these are pwm with a port address below 0x100:
pinModeFast(2,INPUT);
     32c:      6c 98             cbi      0x0d, 4      ; 13
digitalWriteFast(2,HIGH); 
     32e:      80 91 90 00       lds      r24, 0x0090
     332:      8f 7d             andi      r24, 0xDF      ; 223
     334:      80 93 90 00       sts      0x0090, r24
     338:      74 9a             sbi      0x0e, 4      ; 14
pinModeFast(5,OUTPUT);
     33a:      6b 9a             sbi      0x0d, 3      ; 13
digitalWriteFast(5,LOW);
     33c:      80 91 90 00       lds      r24, 0x0090
     340:      8f 77             andi      r24, 0x7F      ; 127
     342:      80 93 90 00       sts      0x0090, r24
     346:      73 98             cbi      0x0e, 3      ; 14

pinModeFast2(2,INPUT);
     908:      80 91 90 00       lds      r24, 0x0090
     90c:      8f 7d             andi      r24, 0xDF      ; 223
     90e:      80 93 90 00       sts      0x0090, r24
     912:      6c 98             cbi      0x0d, 4      ; 13
digitalWriteFast2(2,LOW);
     914:      74 98             cbi      0x0e, 4      ; 14
pinModeFast2(5,OUTPUT);
     916:      80 91 90 00       lds      r24, 0x0090
     91a:      8f 77             andi      r24, 0x7F      ; 127
     91c:      80 93 90 00       sts      0x0090, r24
     920:      6b 9a             sbi      0x0d, 3      ; 13
digitalWriteFast2(5,HIGH);
     922:      73 9a             sbi      0x0e, 3      ; 14

//some of these have an address above 0x100:
pinModeFast(12,INPUT);
    247a:      26 98             cbi      0x04, 6      ; 4
digitalWriteFast(12,HIGH); 
    247c:      80 91 80 00       lds      r24, 0x0080
    2480:      8f 7d             andi      r24, 0xDF      ; 223
    2482:      80 93 80 00       sts      0x0080, r24
    2486:      2e 9a             sbi      0x05, 6      ; 5
pinModeFast(9,OUTPUT);
    2488:      80 91 01 01       lds      r24, 0x0101
    248c:      80 64             ori      r24, 0x40      ; 64
    248e:      80 93 01 01       sts      0x0101, r24
digitalWriteFast(9,LOW);
    2492:      80 91 b0 00       lds      r24, 0x00B0
    2496:      8f 7d             andi      r24, 0xDF      ; 223
    2498:      80 93 b0 00       sts      0x00B0, r24
    249c:      80 91 02 01       lds      r24, 0x0102
    24a0:      8f 7b             andi      r24, 0xBF      ; 191
    24a2:      80 93 02 01       sts      0x0102, r24
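Reading the listing, the two families differ in where the PWM timer bit is cleared. digitalWriteFast clears it before every write to a PWM pin. With the "2" versions, pinModeFast2 clears it once and digitalWriteFast2 is left as the bare port operation. A host-side sketch of that difference follows, with hypothetical register stand-ins and function names (the bit positions are assumed, not taken from a datasheet):

```cpp
#include <cstdint>

// Hypothetical stand-ins for a timer control register, a direction register and a port.
uint8_t simTccr = 0xFF;   // pretend PWM output is enabled on every channel
uint8_t simDdr  = 0;
uint8_t simPort = 0;
static const uint8_t kPwmBit = 0x20;  // compare-output bit for our one pin (assumed)
static const uint8_t kPinBit = 0x10;  // port bit for our one pin (assumed)

// Style 1: every write first detaches the pin from the timer.
void simWriteStyle1(bool high) {
    simTccr &= (uint8_t)~kPwmBit;
    if (high) simPort |= kPinBit; else simPort &= (uint8_t)~kPinBit;
}

// Style 2: pinMode detaches the timer once ...
void simPinModeStyle2(bool output) {
    simTccr &= (uint8_t)~kPwmBit;
    if (output) simDdr |= kPinBit; else simDdr &= (uint8_t)~kPinBit;
}

// ... so each later write is only the port bit operation.
void simWriteStyle2(bool high) {
    if (high) simPort |= kPinBit; else simPort &= (uint8_t)~kPinBit;
}
```

One consequence that seems to follow: with style 2, if PWM is re-enabled on the pin after the pin mode was set, a later write will not switch it off again.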

@jrraines: Thank you. That's helpful.

@Paul Stoffregen:

Inside Teensyduino, I implemented this using a giant chain of if-else checks, where each performs the desired write. While it's a lot longer, a LOT longer, it has the advantage of doing nothing if an illegal pin number is used.

I prefer the chain for a different reason. With some clever formatting, it can be made to look like a table. Ensuring the Arduino-pin to port-pin mapping is accurate is easy.

Back in November 2009, David said he intended to include this into the official Arduino core, and he preferred this macro style (I had posted a short example of the if-else way), so I wrote it this way for contribution to the official Arduino version. Sadly, with 0018 gone by, and issue #140 not tagged for 0019 or 1.0, it seems unlikely these optimizations will ever become part of the official digitalWrite.

That's unfortunate. I REALLY like these digital*Fast functions...

Had I known that then, I wouldn't have bothered to write these for the official Arduino boards. I'm certainly not going to put any more effort into it now, other than pointing out this lack of checking for illegal pin number input.

I definitely appreciate your and jrraines's efforts. I'm trying to squeeze an application onto a 2313. I've determined that without these functions, I would have had to resort to port manipulation.

Who can't love these things! High level function calls reduced to a single machine instruction! It's the best of hand-assembly and C++.

Then again, erroneously writing to the last pin is a lot better than what the slow compiled digitalWrite does. It will happily use the too-large pin number as an index to an array, reading whatever happens to be in memory after those tables, and use that data as a pointer and bitmask to write to someplace in memory! Not good.

That is a bit unnerving.

It makes me sad to see so little care and concern for code quality. I think maybe it's time to turn off my notifications for this thread....

PLEASE stay with us (me)!

I'm only an Arduino beginner, but I'm not sure I understand why it is that revised functionality with identical behavior as before (thus no API change) yet improved performance is not getting committed?

What am I missing?

I think that is Paul's point. In many situations this will be both faster and also smaller.

I have encountered at least one situation where I had removed delays, knowing that digitalWrite was slow, and I doubted the code would still work if digitalWrite were sped up. That is the logic behind calling the functions by different names.

Thanks for your reply, jrraines!

Also, only now, after looking in more detail at DigitalWriteFast (I also hadn't seen the excellent and concise description at http://code.google.com/p/digitalwritefast/ yet), I understand how foolish my question was. DigitalWriteFast requires pin numbers to be known at compile time. Period. i.e. digitalWrite(9, HIGH) can be sped up, digitalWrite(i, HIGH) can't.

Likely this will be obvious right away to most readers of this topic, but to me, as a novice, it wasn't.

Thanks for your time! :)

Westfw somewhere remarked that consistent speed of digitalWrite may be desirable. Another consideration.

why it is that revised functionality with identical behavior as before (thus no API change) yet improved performance is not getting committed?

Part of the problem is in EXPLAINING the new functions. None of the Arduino functions currently include information about execution speed, and for most applications it isn't important. Far slower systems have existed and solved problems. Now suddenly we want to add "fast" versions, and the possibility that they will introduce confusion is rather high. "when do I need the fast function? Do I need fastSerial.Print too? fastAnalogRead? I changed blink to use the fast functions and it's still blinking once per second?"

DigitalWriteFast requires pin numbers to be known at compile time. Period. i.e. digitalWrite(9, HIGH) can be sped up, digitalWrite(i, HIGH) can't.

Those are two different statements. DigitalWriteFast is quite careful to "work" with pin numbers that are variables. It just won't be any faster. (adds more confusion, you know. "Isn't the pin number a constant in the blink example? How come sometimes it runs fast and sometimes it runs slowly?")

Fully agreed.

Your explanations are far more exact and unambiguous. That's why I'm a novice and you're an expert :)

Have you tested digitalWriteFast(i++, HIGH) (assuming i is a defined variable)?

I don't think I'd used that specific syntax. I just ran a simple example and it seems to give the correct result. Have you had a problem?

uint8_t i=18;
digitalWriteFast(i++,HIGH);
lcd(0,0)<<"19 = "<<(long)i;
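The reason this can be expected to work: in GCC and Clang, `__builtin_constant_p` does not evaluate its argument, and an argument with side effects such as `i++` is reported as not constant. The fast branch is therefore skipped and the slow call evaluates `i++` exactly once. Below is a host-side sketch with the same macro shape. `simWriteFast`, `simSlowWrite` and the counters are hypothetical names.

```cpp
#include <cstdint>

int simWrites = 0;       // how many writes reached the "hardware"
uint8_t simLastPin = 0;  // pin number seen by the most recent write

// Stand-in for the ordinary run-time digitalWrite().
void simSlowWrite(uint8_t pin, uint8_t /*val*/) { ++simWrites; simLastPin = pin; }

// Same shape as digitalWriteFast: P is named more than once in the fast branch,
// but that branch only runs when P is a compile-time constant.
#define simWriteFast(P, V)                                    \
    do {                                                      \
        if (__builtin_constant_p(P)) {                        \
            if ((P) < 64) simLastPin = (uint8_t)(P);          \
            ++simWrites;                                      \
            (void)(V);                                        \
        } else {                                              \
            simSlowWrite((P), (V));                           \
        }                                                     \
    } while (0)
```

After `uint8_t i = 18; simWriteFast(i++, 1);` the write goes to pin 18 and `i` is 19, matching the result reported above. A macro that evaluated a non-constant P in more than one executed expression would not be safe to call this way.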

Part of the problem is in EXPLAINING the new functions.

By that logic, no improvements would ever be added.

David Mellis specifically requested this code, and when I wrote it for normal Arduino boards, he specifically requested it implemented for the Arduino Mega, which I also did within a matter of a couple days. Difficulty of documentation was never a concern when he (and others on the developer list) wanted this, back in November 2009.

Since then, it's sat in the issue tracker for about half a year. However, he did recently flag this for the 1.0 milestone, so maybe it'll actually make it into the official Arduino core within the next 6 months?

Still not known is whether this code will simply be used as digitalWrite(), or if a new name like digitalWriteFast() will be used, or if David will end up implementing it some other way. However David ends up using this, assuming he ever does, I'm sure once it's actually committed to svn and due to be released, somehow explaining/documenting it really won't be a big deal, and if it doesn't introduce a new name, perhaps no documentation changes will be needed at all?

I've heard this "but we'd have to document it and support it" line many, many times before, usually in the corporate world from mid-level managers who just don't want to do anything innovative unless the directive comes from those above them. There is some point to it for dramatically new products, but really, in cases like this where the feature is just a performance improvement that carries virtually no risk, virtually no backwards compatibility problems, and is pretty much invisible, I just don't buy that line about how difficult documentation is. It'd probably take less time than we've spent writing all these messages in this thread!

I will take some of the responsibility for all this 'difficulty explaining' discussion; I think my writeup was not well done. That was partly because I'd understood the remarks from others who wondered 'why bother when using PORT etc directly could give better efficiency at run time'.

Trying to acknowledge the truth of that while pointing out the simplicity and ease of use of these macros led to something even more circuitous than this post.

To slightly change the subject, the other thing I'd have thought would have warranted prompt adoption into the core is Streaming.h--I'd have thought it should just have been added to print.h. But both David Mellis and Mikal Hart seemed in agreement that it made sense to leave it out. And I will say that leaving it out led to further improvements--Mikal Hart changing the crux of it from 7 lines to one amazingly functional line of code and Michael Margolis and others adding support for HEX etc. That development might have been impeded if it were in the core.

So there are concerns about who maintains, extends and can commit code. With digitalWriteFast there is the concern about who will revise the macros when a board more complex than the Mega comes out. When that happens I will certainly work on it; I would expect it would stretch my ability to write macros quite considerably. As I have acknowledged before I would not have gotten my small contribution to this done without building on Paul's work and without Westfw's patient guidance.

Speaking of my limited abilities with macros, Paul pointed out the issue with pin numbers that are too high a while back. There is a #error "Your error message here." directive. I can make it work with #ifdef, but I spent an hour or two playing with how to make it work for a numeric issue and had no success. Any tips would be appreciated.
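One possible direction, offered as a sketch rather than a drop-in fix: #error is handled by the preprocessor, and a function-like macro cannot emit a directive based on its argument, so the numeric check has to be done by the compiler instead. If a C++11 compiler is available (an assumption; older toolchains lack `static_assert`), the pin can be passed as a template argument and checked at compile time. The names `simWriteChecked`, `kSimNumPins` and `simPort` are hypothetical, and the ports are simulated on the host.

```cpp
#include <cstdint>

static const uint8_t kSimNumPins = 16;   // assumed pin count for the sketch
uint8_t simPort[2] = {0, 0};             // hypothetical stand-in for two ports

// The pin is a template argument, so it is always a compile-time constant
// and static_assert can reject an out-of-range value while compiling.
template <uint8_t Pin>
void simWriteChecked(bool high) {
    static_assert(Pin < kSimNumPins, "illegal pin number");
    const uint8_t mask = (uint8_t)(1u << (Pin % 8));
    if (high) simPort[Pin / 8] |= mask;
    else      simPort[Pin / 8] &= (uint8_t)~mask;
}

// simWriteChecked<13>(true);   // compiles
// simWriteChecked<99>(true);   // would stop the build with "illegal pin number"
```

The price is a different calling style from digitalWriteFast(pin, value), and it gives no fallback for pins that are only known at run time.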