Arduino Forum

Forum 2005-2010 (read only) => Software => Bugs & Suggestions => Topic started by: jrraines on Mar 02, 2010, 07:16 pm

Title: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Mar 02, 2010, 07:16 pm
digitalWriteFast, digitalReadFast, pinModeFast etc.

This is an addition to the library that greatly speeds up reading and writing to the digital pins using the familiar pin number syntax.

One of the strengths of the Arduino is the low barrier to getting started for beginners. the simple syntax of digitalWrite, pinMode, and digitalRead is a big contributor to the simplicity. As beginners get more experience they move toward the more efficient port manipulation commands (see http://www.arduino.cc/en/Reference/PortManipulation, BUT some of the port information there is actually incorrect for the Mega, see
http://spreadsheets.google.com/pub?key=rtHw_R6eVL140KS9_G8GPkA&gid=0 ). The port manipulation commands control the same pins but refer to them in a completely different syntax and depend on specific details of the pin being known at the time the program is written. It is difficult using port manipulation to mimic the simplicity of
for (int i =2; i<=13; i++) digitalWrite(i,HIGH);

A few months ago, Paul Stoffregen proposed and worked out the important details of a somewhat intermediate version digitalWriteFast which he implemented completely as a macro. Like the port manipulation commands it is much faster than digitalWrite (over ten times faster) and the extra speed depends on knowing the pin numbers at compile time--it won't speed things up if its inside a subroutine or loop where the pin number is going to change. If the pin number is not known at compile time, it defaults to use the slower digitalWrite command. It uses the simple syntax of the digitalWrite type commands, which makes it attractive to beginners, and perhaps less error prone even to programmers who are a little more experienced.

I looked at what he had done and thought it needed just a little more work to make it a valuable library 'routine'--I put 'routine' in quotes because there's no .c file; everything is in macros in a .h file.

I extended it to include pinModeFast and (with a huge amount of assistance from Westfw) digitalReadFast. I've tested it fairly thoroughly on my Arduino Mega. It would be wise if people with other Arduinos would test those boards. Paul Stoffregen's work on defining the port and bit to pin conversion has been flawless but testing seems prudent. If you do test, I think a post here would be appropriate.

PWM, analogWrite digitalWriteFast2, etc.
As you know analogWrite works on some of the digital pins (the PWM pins) by setting a timer and cycling from +5V to ground with a duty cycle that makes the average voltage on the pin proportional to what you specify. The standard digitalRead and digitalWrite commands turn off the cycling of the timer every time they are used. pinMode does nothing to the cycling of the timer.

digitalWriteFast and digitalReadFast turn off the cycling of the timer every time they are used. pinModeFast does nothing to the cycling of the timer. This is the mode for maximum compatibility.

However there is a comment in the code for the standard commands that suggests it would be more efficient to turn off the timer in pinMode and not in the other 2 commands. This makes enormous sense to me; in many instances pinMode might be used just in setup() and so the extra overhead could be dispensed with.

I was reluctant to completely split with the maximum compatibility that more experienced developers had opted for. So I did it both ways. digitalWriteFast2 and digitalReadFast2 don't turn off the timer. pinModeFast2 does turn off the timer.

Download   http://healthriskappraisal.org/dwf.zip
The folder called digitalWriteFast, containing digitalWriteFast.h and a keyword file goes in your library. A program called digitalWriteFastTest.pde which is what I used to test it on my mega. If you want to test it for a smaller board, you should be able to delete a huge number of test cases. There is also a program called progprog.py; because these new commands deliver their speedup when the pin numbers are known at compile time I needed to test the commands with source code that specified the pin numbers directly. With 50-some pins to test I needed to generate most of the test program source code automatically. progprog.py generates that test code. You would need something like it if you want to test on a seeduino Mega, for example.

To use it just put the digitalWriteFast folder into your library alongside the EEPROM, Ethernet, Firmata, etc folders. In your source code
#include <digitalWriteFast.h>
then you can use digitalWriteFast(pin,HIGH/LOW), digitalReadFast(pin), pinMode(pin,INPUT/OUTPUT), digitalWriteFast2(pin,HIGH/LOW), digitalReadFast2(pin), pinMode2(pin,INPUT/OUTPUT) very much as you have used the built-in commands. The object code will not only be faster, but actually smaller as well.

The downside
Without the huge performance advantage motivating you to learn to use the PORT, DDR and PIN registers directly, you may not learn to take advantage of them. If you need to manipulate several adjacent pins at once you may be able to do it with one command and get another performance increase; learning to use those commands may also move you closer to learning the commands to get better control over the PWM pins for finer control of those.

Lastly, there is the likelihood that you will use one of these 'Fast' commands in a way that means the pin number is not known at compile time. This could mean that you think you're getting high performance when you are really not. It might be hard to find that issue.

Thanks
Paul Stoffregen did the heavy lifting on this and deserves most of the kudos. I wouldn't have made my small contribution without Westfw's patient and insightful coaching. Any problems are due to my shortcomings, not theirs.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: Fandu on Mar 03, 2010, 08:03 pm
Nicely done!  I will have to give this library a try and see just what kind of performance gains can be had with it.

You've provided a nice intermediate level library for those of us that haven't got around to working directly with the hardware yet.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: pjrc on Mar 04, 2010, 01:49 pm
Nice to see you're using these macros I wrote months ago.  At the time, David seemed interested to include them in Arduino.  At least that's what he said on the developer mail list.  I would not have done all that work had I known it would only sit in the issue tracker, rather than be included in 0018.  (likewise, many other optimizations I've explored won't been ported to Arduino and written up nicely, because if issue #140 can't be included, certainly other more difficult optimizations for non-const cases won't be).

Still, a library implementation is better than nothing.

Could I convince you to surround each definition with a check to avoid redefining if it already exists?  For example:

#if !defined(digitalPinToPortReg)
#define digitalPinToPortReg(P) \
(((P) >= 0 && (P) <= 7) ? &PORTD : (((P) >= 8 && (P) <= 13) ? &PORTB : &PORTC))
#endif

I realize this adds a lot of extra #if and #endif lines.  However, if these definitions ever become included in the Arduino core (where they rightly belong), your code will continue to work.

On 3rd party boards, this will allow those boards to define these macros for their different pin assignments, and your code will automatically use them.


Also, on Arduino Mega, issue #146 will apply to the pins which use registers beyond the range usable with the CBI and SBI instructions.  Then again, issue #146 applies to ALL usage of the normal digitalWrite() and pinMode() functions, regardless of register addresses.

If you care about issue #146, you could add a check if the pointer (cast to an integer) is greater than 32, and if so, surround the bitWrite with code to save interrupt context, disable interrupts, and then restore.  Because this is within the check for __builtin_constant_p(P), the compiler will only include that code for the appropriate pins.

So far, nobody seems to care much about issue #146.  Someday I'll get around to writing some test cases to demonstrate the problem.  Trust me, it is real.  The "nobody has ever complained in 5 years" is only because recently have widely used libraries and functions called digitalWrite() from interrupts, and because the problems are so very mysterious and difficult to debug.  Such is the way of race conditions.

Still, nice work on library-izing this.  Hope it gets some use.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Mar 06, 2010, 12:26 am
I will work on Paul's ideas and try to have an improved version posted by 3/10. This project wouldn't even have begun without his insight. As I said before he did all the hard work on this.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Mar 09, 2010, 02:31 am
I did update the version on the server tonight. I did not implement all of Paul's suggestions. I did surround all of the pin to port/timer type functions with a single #if, #endif that I hope will be adequate to the needs of 3rd party boards. I similarly surrounded each of the macros for pinModeFast,digitalWriteFast,digitalReadFast,pinModeFast2,digitalWriteFast2, digitalReadFast2 with #if #endif statements.

Issue #146 relates to interference between interrupt routines that contain digitalWrite instructions(eg Servo and Tone Libraries) and non-interrupt code. It is clearly a real issue and will be a difficult one to sort out for most people who encounter it. On the other hand, the point of these library macros is to improve efficiency.  I don't know that interrupts will be in use and don't know if digitalWrite will be used inside the interrupt routines in the code that uses these macros.

I gave real thought to trying to implement something that would be interrupt safe for one of the 2 versions. Perhaps as I gain proficiency I will return to this idea.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Mar 11, 2010, 12:44 pm
I bought a teensy++ from Paul. I haven't even gotten so far as to solder in headers, but I looked at some of the software that comes with the board to tie it to the Arduino IDE. Paul doesn't advertise that he's not only implemented THESE ideas as part of the digitalWrite etc commands there but he's ALSO greatly speeded up the way digitalWrite etc work when the pin number is not known at compile time. It looks to me like he's also implemented the code necessary to protect against the interrupt interference of issue 146; there is some conditionally included code that implies something like that. That would mean that his boards would work better with the servo and tone libraries than the branded Arduinos. The code itself combines complex macros and assembly language so that it is very tough reading.

All of which makes me think that the teensy++ may really be the board to go with. Its not just a tiny arduino clone but the software is actually enhanced in very important ways. I'm still trying to figure out how to tie something this small to a useful prototyping shield though.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: Coding Badly on Apr 18, 2010, 10:06 am

@jrraines: Would you mind publishing your code somewhere else?  The link you provided isn't working.


I suggest removing the check "__builtin_constant_p(V)" from digitalWriteFast...

#define digitalWriteFast(P, V) \
 if (__builtin_constant_p(P) [glow]&& __builtin_constant_p(V)[/glow]) { \
   if (digitalPinToTimer(P)) \
     bitClear(*digitalPinToTimer(P), digitalPinToTimerBit(P)); \
   bitWrite(*digitalPinToPortReg(P), digitalPinToBit(P), (V)); \
 } else { \
   digitalWrite((P), (V)); \
 }


Relatively speaking, the call to digitalWrite generates one machine instruction (a relative call).  After removing the built-in check on V and passing a variable into the macro...

For non-PWM pins, five machine instructions are generated.

For PWM pins, eight machine instructions are generated.

In my opinion, this is a small price to pay for the huge increase in speed.  If someone wants to save program space, they can call digitalWrite directly.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Apr 18, 2010, 02:46 pm
http://code.google.com/p/digitalwritefast/downloads/list

Mellis and Stoffregen felt that it was safer to include the PWM stuff. I thought the digitalWriteFast2, pinModeFast2 version was preferable. you can cut out a little overhead by using pinModeFast and digitalWriteFast2.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: Coding Badly on Apr 19, 2010, 04:23 am

Thanks!

Quote
Mellis and Stoffregen felt that it was safer to include the PWM stuff

Um ... I didn't suggest removing the PWM stuff.  My suggestion was to apply the fast version when V is a constant or a variable.

Quote
I thought the digitalWriteFast2, pinModeFast2 version was preferable. you can cut out a little overhead by using pinModeFast and digitalWriteFast2.

Looks good to me.  I like it.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: pjrc on Apr 19, 2010, 05:04 am
I agree, it's a good trade-off. In fact, that's exactly how I implemented it in the version that's inside Teensyduino.

The slow compiled code often requires almost that many instructions just to marshal the inputs into the required registers, when either isn't a compile time const.  If the surrounding code is complex, but doesn't call other functions, which can often be the case with digitalWrite, the saving in register allocation are also a big win.

But that's not "exactly" how I wrote it.  This "bitWrite" macro coding style really isn't my first choice.  Inside Teensyduino, I implemented this using a giant chain of if-else checks, where each performs the desired write.  While it's a lot longer, a LOT longer, it has the advantage of doing nothing if an illegal pin number is used.  With this macro version, if an illegal pin number is used, it will write to the last pin.  On all Arduino boards, that's an analog input, likely never configured as an output, so the effect would be activating the pullup resistor on that analog in... which could be pretty confusing if the poor, misguided user didn't realize some other unrelated code mistakenly wrote an illegal pin number.  For that reason, I've never been very happy with this style.

Back in November 2009, David said he intended to include this into the official Arduino core, and he preferred this macro style (I had posted a short example of the if-else way), so I wrote it this way for contribution to the official Arduino version.  Sadly, with 0018 gone by, and issue #140 not tagged for 0019 or 1.0, it seems unlikely these optimizations will ever become part of the official digitalWrite.  Had I known that then, I wouldn't have bothered to write these for the official Arduino boards.  I'm certainly not going to put any more effort into it now, other than pointing out this lack of checking for illegal pin number input.

Then again, erroneously writing to the last pin is a lot better than what the slow compiled digitalWrite does.  It will happily use the too-large pin number as an index to an array, reading whatever happens to be in memory after those tables, and use that data as a pointer and bitmask to write to someplace in memory!  Not good.

Then again, there's issue 146 & 170, which also seems unlikely to ever get fixed.

It makes me sad to see so little care and concern for code quality.  I think maybe it's time to turn off my notifications for this thread....
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Apr 24, 2010, 03:17 pm
I realized that one form of documentation I should have provided from the start was examples of what the disassembly code looks like. These examples were compiled for my Mega, so pin/port correspondence may not be the same as other boards, but it will give a better idea of what is generated:
Code: [Select]
// these are non-pwm pins whose port is below 0x100
pinModeFast(51,INPUT);
   5c4e:      22 98             cbi      0x04, 2      ; 4
digitalWriteFast(51,HIGH);
   5c50:      2a 9a             sbi      0x05, 2      ; 5
pinModeFast(50,OUTPUT);
   5c52:      23 9a             sbi      0x04, 3      ; 4
digitalWriteFast(50,LOW);
   5c54:      2b 98             cbi      0x05, 3      ; 5

//these pins are on a port above 0x100
pinModeFast2(48,INPUT);
   5bb2:      80 91 0a 01       lds      r24, 0x010A
   5bb6:      8d 7f             andi      r24, 0xFD      ; 253
   5bb8:      80 93 0a 01       sts      0x010A, r24
digitalWriteFast2(48,LOW);
   5bbc:      80 91 0b 01       lds      r24, 0x010B
   5bc0:      8d 7f             andi      r24, 0xFD      ; 253
   5bc2:      80 93 0b 01       sts      0x010B, r24
pinModeFast2(49,OUTPUT);
   5bc6:      80 91 0a 01       lds      r24, 0x010A
   5bca:      81 60             ori      r24, 0x01      ; 1
   5bcc:      80 93 0a 01       sts      0x010A, r24
digitalWriteFast2(49,HIGH);
   5bd0:      80 91 0b 01       lds      r24, 0x010B
   5bd4:      81 60             ori      r24, 0x01      ; 1
   5bd6:      80 93 0b 01       sts      0x010B, r24


//these are pwm with a port address below 0x100:
pinModeFast(2,INPUT);
    32c:      6c 98             cbi      0x0d, 4      ; 13
digitalWriteFast(2,HIGH);
    32e:      80 91 90 00       lds      r24, 0x0090
    332:      8f 7d             andi      r24, 0xDF      ; 223
    334:      80 93 90 00       sts      0x0090, r24
    338:      74 9a             sbi      0x0e, 4      ; 14
pinModeFast(5,OUTPUT);
    33a:      6b 9a             sbi      0x0d, 3      ; 13
digitalWriteFast(5,LOW);
    33c:      80 91 90 00       lds      r24, 0x0090
    340:      8f 77             andi      r24, 0x7F      ; 127
    342:      80 93 90 00       sts      0x0090, r24
    346:      73 98             cbi      0x0e, 3      ; 14

pinModeFast2(2,INPUT);
    908:      80 91 90 00       lds      r24, 0x0090
    90c:      8f 7d             andi      r24, 0xDF      ; 223
    90e:      80 93 90 00       sts      0x0090, r24
    912:      6c 98             cbi      0x0d, 4      ; 13
digitalWriteFast2(2,LOW);
    914:      74 98             cbi      0x0e, 4      ; 14
pinModeFast2(5,OUTPUT);
    916:      80 91 90 00       lds      r24, 0x0090
    91a:      8f 77             andi      r24, 0x7F      ; 127
    91c:      80 93 90 00       sts      0x0090, r24
    920:      6b 9a             sbi      0x0d, 3      ; 13
digitalWriteFast2(5,HIGH);
    922:      73 9a             sbi      0x0e, 3      ; 14

//some of these have an address above 0x100:
pinModeFast(12,INPUT);
   247a:      26 98             cbi      0x04, 6      ; 4
digitalWriteFast(12,HIGH);
   247c:      80 91 80 00       lds      r24, 0x0080
   2480:      8f 7d             andi      r24, 0xDF      ; 223
   2482:      80 93 80 00       sts      0x0080, r24
   2486:      2e 9a             sbi      0x05, 6      ; 5
pinModeFast(9,OUTPUT);
   2488:      80 91 01 01       lds      r24, 0x0101
   248c:      80 64             ori      r24, 0x40      ; 64
   248e:      80 93 01 01       sts      0x0101, r24
digitalWriteFast(9,LOW);
   2492:      80 91 b0 00       lds      r24, 0x00B0
   2496:      8f 7d             andi      r24, 0xDF      ; 223
   2498:      80 93 b0 00       sts      0x00B0, r24
   249c:      80 91 02 01       lds      r24, 0x0102
   24a0:      8f 7b             andi      r24, 0xBF      ; 191
   24a2:      80 93 02 01       sts      0x0102, r24

Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: Coding Badly on Apr 26, 2010, 07:58 pm
@jrraines: Thank you.  That's helpful.

@Paul Stoffregen:

Quote
Inside Teensyduino, I implemented this using a giant chain of if-else checks, where each performs the desired write.  While it's a lot longer, a LOT longer, it has the advantage of doing nothing if an illegal pin number is used.

I prefer the chain for a different reason.  With some clever formatting, it can be made to look like a table.  Ensuring the Arduino-pin to port-pin mapping is accurate is easy.

Quote
Back in November 2009, David said he intended to include this into the official Arduino core, and he preferred this macro style (I had posted a short example of the if-else way), so I wrote it this way for contribution to the official Arduino version.  Sadly, with 0018 gone by, and issue #140 not tagged for 0019 or 1.0, it seems unlikely these optimizations will ever become part of the official digitalWrite.  

That's unfortunate.  I REALLY like these digital*Fast functions...

Quote
Had I known that then, I wouldn't have bothered to write these for the official Arduino boards.  I'm certainly not going to put any more effort into it now, other than pointing out this lack of checking for illegal pin number input.

I definately appreciate your and jrraines effort.  I'm trying to squeeze an application onto a 2313.  I've determined that without these functions, I would have had to resort to port manipulation.  

Who can't love these things!  High level function calls reduced to a single machine instruction!  It's the best of hand-assembly and C++.

Quote
Then again, erroneously writing to the last pin is a lot better than what the slow compiled digitalWrite does.  It will happily use the too-large pin number as an index to an array, reading whatever happens to be in memory after those tables, and use that data as a pointer and bitmask to write to someplace in memory!  Not good.

That is a bit unnerving.

Quote
It makes me sad to see so little care and concern for code quality.  I think maybe it's time to turn off my notifications for this thread....

PLEASE stay with us (me)!
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: wimleers on May 13, 2010, 09:12 pm
I'm only an Arduino beginner, but I'm not sure I understand why it is that revised functionality with identical behavior as before (thus no API change) yet improved performance are not getting committed?

What am I missing?
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on May 14, 2010, 12:00 am
I think that is Paul's point. In many situations this will be both faster and also smaller.

I have encountered at least one situation where I removed delays knowing that digitalWrite was slow and doubted the code would work if digitalWrite was speeded up. That is the logic behind calling the functions by different names.



Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: wimleers on May 14, 2010, 01:33 am
Thanks for your reply, jrraines!

Also, only now, after looking in more detail at DigitalWriteFast (I also hadn't seen the excellent and concise description at http://code.google.com/p/digitalwritefast/ yet), I understand how foolish my question was. DigitalWriteFast requires pin numbers to be known at compile time. Period.
i.e. digitalWrite(9, HIGH) can be sped up, digitalWrite(i, HIGH) can't.

Likely this will be obvious right away to most readers of this topic, but to me, as a novice, it wasn't.

Thanks for your time! :)
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on May 14, 2010, 02:18 am
Westfw somewhere remarked that consistent speed of digitalWrite may be desirable. Another consideration.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: westfw on May 14, 2010, 04:50 pm
Quote
why it is that revised functionality with identical behavior as before (thus no API change) yet improved performance are not getting committed?

Part of the problem is in EXPLAINING the new functions.  None of the Arduino functions currently include information about execution speed, and for most applications it isn't important.  Far slower systems  have existed and solved problems.  Now suddenly we want to add "fast" versions, and the possibility that they will introduce confusion is rather high.  "when do I need the fast function?  Do I need fastSerial.Print too?  fastAnalogRead?  I changed blink to use the fast functions and it's still blinking once per second?"

Quote
DigitalWriteFast requires pin numbers to be known at compile time. Period.
i.e. digitalWrite(9, HIGH) can be sped up, digitalWrite(i, HIGH) can't.

Those are two different statements.  DigitalWriteFast is quite careful to "work" with pin numbers that are variables.  It just won't be any faster.  (adds more confusion, you know.
"Isn't the pin number a constant in the blink example?  How come sometimes it runs fast and sometimes it runs slowly?")
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: wimleers on May 14, 2010, 04:52 pm
Fully agreed.

Your explanations are far more exact and unambiguous. That's why I'm a novice and you're an expert :)
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: AlphaBeta on May 14, 2010, 09:13 pm
Have you tested digitalWriteFast(i++); (assuming i is a defined variable) ?
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on May 15, 2010, 04:21 am
I don't think I'd used that specific syntax. I just ran a simple example and it seems to give the correct result. Have you had a problem?
Code: [Select]
uint8_t i=18;
digitalWriteFast(i++,HIGH);
lcd(0,0)<<"19 = "<<(long)i;
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: pjrc on May 15, 2010, 05:02 am
Quote
Part of the problem is in EXPLAINING the new functions.


By that logic, no improvements would ever be added.

David Mellis specifically requested this code, and when I wrote it for normal Arduino boards, he specifically requested it implemented for the Arduino Mega, which I also did within a matter of a couple days.  Difficulty of documentation was never a concern when he (and others on the developer list) wanted this, back in November 2009.

Since then, it's sat in the issue tracker for about half a year.  However, he did recently flag this for the 1.0 milestone, so maybe it'll actually make it into the official Arduino core within the next 6 months?

Still not known is if this code will simply be used as digitalWrite(), or if a new name like digitalWriteFast() will be used, or if David will end up implementing it some other way.  However it David ends up using this, assuming he ever does, I'm sure once it's actually committed to svn and due to be released, somehow explaining/documenting it really won't be a big deal, and if it doesn't introduce a new name, perhaps no documentation changes will be needed at all?

I've heard this "but we'd have to document it and support it" line many, many times before, usually in the corporate world by mid level managers who just don't want to do anything innovative, unless the directive comes from those above them.  There is some point to it for dramatically new products, but really, in cases like this where the feature is just a performance improvement that carries virtually no risk, virtually no backwards compatibility, and is pretty much just invisible, I just don't buy that line about how difficult documentation is.  It'd probably take less time than we've sent writing all these message in this thread!
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on May 15, 2010, 12:24 pm
I will take some of the responsibility for all this 'difficulty explaining' discussion; I think my writeup was not well done. That was partly because I'd understood the remarks from others who wondered 'why bother when using PORT etc directly could give better efficiency at run time'.

Trying to acknowledge the truth of that but point out the simplicity and ease of use of using this led to something even more circuitous than this post.

To slightly change the subject, the other thing I'd have thought would warranted prompt adoption into the core is Streaming.h--I'd have thought it should just have been added to print.h. But both David Mellis and Mikal Hart seemed in agreement that it made sense to leave it out. And I will say that leaving it out led to further improvements--Mikal Hart changing the crux of it from 7 lines to one amazingly functional line of code and Michael Margolis and others adding support for HEX etc. That development might have been impeded if it were in the core.

So there are concerns about who maintains, extends and can commit code. With digitalWriteFast there is the concern about who will revise the macros when a board more complex than the Mega comes out. When that happens I will certainly work on it; I would expect it would stretch my ability to write macros quite considerably. As I have acknowledged before I would not have gotten my small contribution to this done without building on Paul's work and without Westfw's patient guidance.

Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on May 15, 2010, 01:28 pm
Speaking of my limited abilities with macros, Paul pointed outthe issue with pin numbers that are too high a while back. There is a #error "Your error message here." macro. I can make it work with #ifdef, but I spent an hour or two playing with how to make it work for a numeric issue and had no success. Any tips would be appreciated.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: mellis on May 16, 2010, 02:41 pm
The reason I've been holding off on including this is the need for deciding whether it's an optimization to the existing digitalWrite() or an additional function.  This seemed like something that made sense to target for the 1.0 release, but probably could have been (could be) resolved sooner.  

In particular, I think we should simply remove the checking for and disabling of PWM output from the digitalWrite() function and use this optimized version.  Then we will have, if I understand it, a one-instruction digitalWrite() for cases where the pin number and value are a compile-time constant.  Anything slower than that seems like it might suggest an alternative (faster) option, which I think would be an unnecessary complication.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: mem on May 16, 2010, 04:08 pm
Mellis,  I don't think removing the PWM check  from the current digitalWrite will provide anything close to the performance of Paul's one-instruction write macro.

Removing the code to check and disable PWM would speed things up by around 40% on a PWM pin and 20% on a pin that does not use PWM.

Here are timings for a pulse (digitalWrite HIGH followed by digitalWrite LOW) measured using a logic analyzer on a 16MHz Arduino:

digitalWrite on pin 3 (PWM pin) - 4.8 us
digitalWrite on pin 4 (not PWM) - 3.5 us
modified digitalWrite  with no PWM check on pin 3 (PWM pin)  is 2.9 us

But using Pauls code (without a PWM check) would run around 40 times faster - 125 nanoseconds assuming that the pin was a constant.

I my opinion, most applications are fast enough with the existing digitalWrite, but applications that do need to minimise delays are best off with Paul's on-instruction macros. My vote is to give the fast write macros a different name so legacy code that relied on the performance of the current digitalWrite would not change significantly.  And the go-faster macros could be chosen for those applications that really needed the high performance.

I do hope Paul's macros make it into 1.0 as this will be a significant benefit to users that need high performance I/O.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: pjrc on May 16, 2010, 07:11 pm
When this was discussed on the developer list, as I recall, the decision was to decouple the API change (whether to disable PWM or not) from the optimization.  I was specifically asked to incorporate the PWM disable into the macro, which I did within a day.

The plan was to quickly include the optimization with PWM disable, because it would have little impact other than simply making sketches run faster.  The API change, which turned out to be controversial with Tom and others, was to be considered separately.  If PWM disable was to be removed at some future date, it would be easy to simply delete it from both the optimized macro and original compiled code.

Disabling PWM takes 3 more instructions with the macro, which execute in 312 ns on a 16 MHz chip.  Even with the PWM disable, the const case optimization executes in 437 ns, which is still about 10X faster.

Also, I'd like to point out this macro approach is not my preferred coding style for this optimization.  I did it this way because David said he didn't like the verbose inline function approach I originally posted.  Indeed that way is much larger, but it has the advantage of not touching any pin when an illegal pin number is given as the input.  This macro, as currently implemented, will always write to some pin even if the pin number is too large.  Then again, that's a lot better than the terrible behavior of the compiled code with a too-large pin number!


edit:  on Arduino Mega, this macro will suffer the issue #146 bug on the pins which aren't memory mapped in the lower 16 I/O space.  Perhaps the bitWrite macro could be modified to disable interrupts in those cases, but really this sort of thing is much easier to deal with in an expanded static inline function.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on May 17, 2010, 12:26 pm
I'm not sure i belong in this part of the discussion at all, since this is an area I feel very much a novice in, but maybe that perspective is valuable.

1. Sometimes figuring out how you would explain something lets you choose between interface alternatives. Particularly with different boards having different pin-port correspondence this is not going to be trivial to explain.

2. I realize that analogWrite and digitalWrite can both be used to control an LED, but I have no idea with what interfaced device one would really CARE about turning off the timer; that is: What would you plug into a pin and sometimes need the analogWrite fcn and later NEED the timer completely off? Would a novice ever stumble into using a device in that class? After all, the bootloader leaves the timers off, and (i think) the reset/power up sequence leaves the timers off.

3. I supplied an alternative syntax that turned the timer off with pinModeFast2. One other simple syntax to consider would be to turn the timer off with analogWrite(0) and analogWrite(255) (or whatever the max is).

4. Users who are sophisticated enough to control the timer using the port commands don't really need separate support here, but changing the interface might break their code, and prevent novices from building on posted code. As a not-quite example, frequency2Timer (I may not have the name precisely right) doesn't work with the Mega; sketches that built on that require more learning/research to implement than a novice may want/be able to do.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: pjrc on May 17, 2010, 06:19 pm
This talk about "PWM disable" in digitalWrite isn't about actually disabling the timers.  It is merely about clearing a bit which causes the timer's PWM generation to essentially "take over" a pin, if it's configured as an output.

Obviously analogWrite() sets that bit, and it also make the pin an output.  Until that bit is cleared, it doesn't matter if the pin is configured as high or low the normal way (writing to the PORTx register), because it's being controlled by the PWM waveform generator in the timer.

Currently, digitalWrite() clears that bit.  This is nice, because if a pin were previously "written" using analogWrite(), without clearing this bit, digitalWrite() would have no effect.  That might be pretty confusing.  Of course, the clearing of that bit is pretty expensive in terms of speed, as it takes extra time, and it even burdens the non-PWM-capable pins with a check to know if the pin is PWM capable.

If digitalWrite() didn't clear that bit, then some other way would be needed, and every discussion I've heard on this issue seems to suggest using pinMode().  That ends up raising a lot more questions, like if a separate mode for PWM would be defined, or if the user would just configure it to OUTPUT mode again, even though they already had previously configured it to OUTPUT mode?

Ultimately, there are at least 2 important issues:

1: Is the digitalWrite, analogWrite, pinMode API a "thin" layer that is designed for performance, yet exposes hardware details?  Or is it more "opaque", providing novices with conceptual simplicity and fully insulating them from painful hardware quirks?  It's a classic performance vs usability trade-off.

2: Either way, what should the API really be?  If that's anything other than what it currently is, would short term pain of breaking compatibility be worthwhile for long term performance or usability gains?

Today, regarding #1, I believe the API somewhere in between, achieving neither great performance nor excellent usability.  We've all heard about the performance issues, and removing the PWM disable is intended to improve speed.

Usability-wise, even though digitalWrite handles this PWM disable, it fails to insulate users from other hardware quirks, mainly the pullup resistors (which I believe is pretty horrible for novices, since they get tricked into thinking they have a "hardware problem" when their code seems to be working, but the pin isn't able to do much more than drive a multimeter, where correct voltage checks further reinforce the notion their hardware is damaged, all because they forgot pinMode OUTPUT).  digitalWrite() also doesn't do a bounds check on the pin number, so novices aren't protected from mistakes like a loop clearing all pins being "off by one" or "way off".  You might also wonder why analogWrite switches the pin to OUTPUT mode automatically, but digitalWrite doesn't?  There's also bug #146, which is rare, but extremely difficult when it does strike.  digitalWrite() seems simple and easy to use, but it leaves plenty of "gotchas" for unsuspecting users.

People who teach classes and workshops have a great interest in the simplicity of the API, and having a coherent model free of low-level hardware quirks.  People who built robots and other more complex projects care deeply about performance.

So this all gets into a lot of talk about the conceptual model this API is supposed to provide for users.  Unfortunately, it's probably far too late to really throughly consider that model, because many thousands of people have written mountains of code that depends on the current behavior.

But anyway, that's what's meant by this "PWM disable" stuff.  It's not about actually reconfiguring the timer (and each timer generates PWM for 2 or 3 pins, so you wouldn't do that anyway for a single pin).  These issues about the API, conceptual/teaching models, performance and backwards compatibility are quite difficult.  So far, the only easy answer has been to put off making any decisions.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: retrolefty on May 17, 2010, 07:33 pm
Quote
So far, the only easy answer has been to put off making any decisions.


That sound just like the U.S. Congress model.  ;)

Lefty
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: mellis on May 17, 2010, 09:20 pm
Paul's summary is right on and I apologize for putting off so many of the decisions.

If we decide to remove the PWM checks from digitalWrite(), I would suggest adding a noAnalogWrite(pin) function which stop the PWM generation on the pin.  

The thought was that if this change were to happen, it would be best to do in Arduino 1.0.  In which case, it seems annoying to change the performance of digitalWrite() twice in quick succession: once for 0019 (optimizing it as Paul has implemented but keeping the PWM check) and once for 1.0 (removing the PWM check).  But maybe the performance benefits outweigh the pain of having to tweak your code twice?
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: Coding Badly on May 17, 2010, 09:55 pm
Quote
I would suggest adding a noAnalogWrite(pin) function which stop the PWM generation on the pin

"noAnalogWrite" implies that analog writes will never be performed on that pin or will be permanently turned off.  I suggest "stopAnalogWrite".
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: pjrc on May 17, 2010, 10:50 pm
These really are difficult decisions.

While I've put a LOT of work into improving performance, really I do believe the value of this API is simplicity for novice users.  Obviously Arduino has been very successful at presenting an I/O model novices can understand.

I don't do a lot of teaching, but I do get called upon when users have difficult problems, especially strange ones nobody else has managed to solve (I do have about 20 years experience designing and debugging embedded designs in assembly language).  I've seen first hand where users are empowered by the simplicity these functions offer, but then get terribly frustrated when they fail to deliver that simplicity with those hardware quirks I mentioned above.  I really think solving those problems, even if it means sacrificing performance or breaking backwards compatibility, will in the long run really benefit everyone.

Of course, I love my code to run fast, and when I truly care about performance, I just write in assembly.  But I don't advocate teaching novices assembly, or even quirks of the hardware's register-level access.  Hopefully those quirks that are currently exposed by the API can be addressed.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: bperrybap on May 22, 2010, 12:53 am
Adding all these parallel API functions seems a bit silly to me.

As has been mentioned before in these types of discussions, why can't the API be layered?
Rather than go off and invent an entire new slew of functions?

In other words for something like a digitalWrite().
digitalWrite() is the top layer and works just like it does today with all the handholding which does cost performance.
It does the timer code stuff and then calls _digitalWrite()

_digitalWrite() drops the next level and eliminates the checks and gets a bit faster and is really fast (single instruction) for constant arguments. If arguments are not constants it calls __digitalWrite()

__digitalWrite() is the bottom layer which gets called when the arguments are not constants.

That way existing code works just as it does today except much faster when constants are used. Users that are more knowledgeable and don't need any handholding can call _digitalWrite() to pick up additional performance especially if the arguments are constants to get single instruction bit sets/clears.


With a combination of macros and/or inline functions you can get everything with no additional overhead. And it is very consistent, backward compatible and easy to document since each layer takes the same arguments.

So my question is why go off and create all these parallel functions, when a simple more traditional layered API gets you there as well?

--- bill





Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on May 22, 2010, 04:23 am
I think that's more or less what is planned in Arduino 1.0. I agree that is a better approach. At the time I turned Paul's ideas into a header file, it was not clear when, if, or how it would be implemented in that way and this seemed a way to get all of that functionality and similar syntax right away.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: fat16lib on Jul 13, 2010, 05:05 pm
I have written several libraries with optimized I/O for software SPI and I2C.  Here is what I would like in fast digital read/write for this type of application.

The behavior of digitalRead/digitalWrite should not be modified to provide fast read/write.  Fast read/write should be a separate facility.

Fast digital read/write should not clear PWM since this increases the execution time by five cycles on a PWM pin.  Fast write will then executes in two cycles (more on some Mega pins) when both arguments are constant.  A stopAnalogWrite() function should be added.

Fast digital read/write should fail at compile time if the pin number is not a constant or is too large.  This will prevent unexpected behavior at execute time.

Here is an example include file for the 168/328 that shows how this can be implemented with static inline functions and a static const array.  The array is always eliminated by compiler optimization.

It is easy to extend this idea to the Mega and other Arduino-like boards.
Code: [Select]
#ifndef DigitalFast_h
#define DigitalFast_h
#include <avr/io.h>

struct pin_map_t {
 volatile uint8_t* ddr;
 volatile uint8_t* pin;
 volatile uint8_t* port;
 uint8_t bit;
};

static const pin_map_t pinMap[] = {
 {&DDRD, &PIND, &PORTD, 0},  // D0  0
 {&DDRD, &PIND, &PORTD, 1},  // D1  1
 {&DDRD, &PIND, &PORTD, 2},  // D2  2
 {&DDRD, &PIND, &PORTD, 3},  // D3  3
 {&DDRD, &PIND, &PORTD, 4},  // D4  4
 {&DDRD, &PIND, &PORTD, 5},  // D5  5
 {&DDRD, &PIND, &PORTD, 6},  // D6  6
 {&DDRD, &PIND, &PORTD, 7},  // D7  7
 {&DDRB, &PINB, &PORTB, 0},  // B0  8
 {&DDRB, &PINB, &PORTB, 1},  // B1  9
 {&DDRB, &PINB, &PORTB, 2},  // B2 10
 {&DDRB, &PINB, &PORTB, 3},  // B3 11
 {&DDRB, &PINB, &PORTB, 4},  // B4 12
 {&DDRB, &PINB, &PORTB, 5},  // B5 13
 {&DDRC, &PINC, &PORTC, 0},  // C0 14
 {&DDRC, &PINC, &PORTC, 1},  // C1 15
 {&DDRC, &PINC, &PORTC, 2},  // C2 16
 {&DDRC, &PINC, &PORTC, 3},  // C3 17
 {&DDRC, &PINC, &PORTC, 4},  // C4 18
 {&DDRC, &PINC, &PORTC, 5}   // C5 19
};
static const uint8_t pinCount = sizeof(pinMap)/sizeof(pin_map_t);

static inline uint8_t badPinNumber(void)
__attribute__((error("Pin number is too large or not a constant")));

static inline __attribute__((always_inline))
 uint8_t digitalReadFast(uint8_t pin) {
 if (__builtin_constant_p(pin) && pin < pinCount) {
   return (*pinMap[pin].pin >> pinMap[pin].bit) & 1;
 } else {
   return badPinNumber();
 }
}

static inline __attribute__((always_inline))
 void digitalWriteFast(uint8_t pin, uint8_t value) {
 if (__builtin_constant_p(pin) && pin < pinCount) {
   if (value) {
     *pinMap[pin].port |= 1 << pinMap[pin].bit;
   } else {
     *pinMap[pin].port &= ~(1 << pinMap[pin].bit);
   }
 } else {
   badPinNumber();
 }
}
#endif  // DigitalFast_h
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: bperrybap on Jul 21, 2010, 03:57 am
Quote
Fast read/write should be a separate facility.


Why? If done "correctly", it should "just work" without the user having to do anything special.

There should be no reason that the user should have specify a separate "fast" api for use with constants to get faster/better code.
Redoing the existing API code implementation rather than creating a new API also allows all the existing code that uses constants to enjoy better performance by simply being recompiled with the new library code.

A separate "fast" API is merely pushing work onto the programmer that can easily be done by the compiler at compile time and creates future maintenance headaches.

It is not that difficult to wrap smart macros and inline functions around the existing API (not necessarily existing code) so that in the end the user gets better/faster code automagically if he uses constants and falls back to a slower implementation for non constants.
In fact there are ways to handle non constants that are  faster and generate less code than the current code in the standard arduino library that are fully backward compatible with the current API.



On a side note, be very carful about ensuring atomic register access
because even when using constants, you don't always
get CBI/SBI instructions.
For example:

Code: [Select]
*regptr |= bitval;
*regptr &= ~bitval;


Does not always generate SBI/CBI single instructions.
Which is very important to ensure ATOMIC access.
Some AVRs have registers that are beyond the SBI/CBI range.
For those it is several instructions to set or clear a bit and nothing
can be done to reduce it.
But this too can be delt with to ensure ATOMICITY in smart macros & inline code such that
for those registers that are beyond the CBI/SBI range, additional code is inserted so that interrupts are masked and restored around the bit set/clear operation.


--- bill
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: mem on Jul 21, 2010, 07:52 am
Quote
There should be no reason that the user should have specify a separate "fast" api for use with constants to get faster/better code.


There are cases where changing the I/O performance or behavior (for example, not turning off PWM) could break legacy sketches.

I have seen sketches and libraries that rely on the relatively slow performance of the current digitalWrite implementation for providing  timing delays in interface code. It's not good practice, but not point in penalizing users of that code.

Having a separate call for the faster functions would ensure that legacy code behavior would be unchanged. Users that needed faster performance with existing code could use the IDE to find and replace digitalWrite with the faster version.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: bperrybap on Jul 22, 2010, 09:17 am
Quote

There are cases where changing the I/O performance or behavior (for example, not turning off PWM) could break legacy sketches.

I have seen sketches and libraries that rely on the relatively slow performance of the current digitalWrite implementation for providing  timing delays in interface code. It's not good practice, but not point in penalizing users of that code.[There are cases where changing the I/O performance or behaviour (for example, not turning off PWM) could break legacy sketches.

I have seen sketches and libraries that rely on the relatively slow performance of the current digitalWrite implementation for providing  timing delays in interface code. It's not good practice, but not point in penalizing users of that code.


Yes I've also seen fragile/broken code that sometimes unknowingly takes advantage of certain code timings.
I'm all about attempting to preserve backward compatibility. But
developers that expect or need this are eventually going to get burned as eventually this simply can't always be done across all releases.

Trying to preserve that level of stuff is very difficult and essentually brings a toolset to a halt as any change will alter behaviors especially timing.

But even if this type of behaviour support is a requirement, then I think there are better ways to handle it than to pollute the API space with a flood of ever increasing new function calls.

A better way to handle this is to require those users that need to have this old behaviour preserved do something to request it, rather than require every one else to have to do something to request the better/new behaviour because as time goes by, those that write proper code are constantly being punished by having to update their code to use an ever changing set of API names.

So my strong suggestion would be not expand the APIs with new names for the same functions but simply write the new code such that it requires users that need to preserve an old behaviour set a #define at the top of their sketch or perhaps include some sort of deprecated or backward compatibility header file in order to get the previous existing behaviour that allows their broken code to continue to function in their specific environment.
That way, the existing API is preserved for everyone and only the users that have issues with the new improved version of the code have to do anything. All those that write proper code get the new enhancements "for free".

But also, as in my previous posts, I distinguish between timing and behaviour. So for different leaned out functionality I'd rather see a layered set of API calls, so that those that want the lean and mean versions of the functionality with no hand holding (which may be faster but is different from the current behaviour) could call the _ (underbar) version of the functions of the same functions.


--- bill
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: mem on Jul 22, 2010, 09:43 am
Quote
But developers that expect or need this ?

If Arduino was targeted to developers then I would agree with your points. However, most Arduino users are not software engineers.  

Code that expects PWM to be turned off when digitalWrite is executed is not broken if that's the way it was intended to work.

I wouldn't have designed it that way, but that's the way it is.
IMO, changing the behavior, even if it's documented is likely to create problems that can be avoided by requiring the user to do something explicit to get the new improved behavior.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: fat16lib on Jul 22, 2010, 04:46 pm
Changing digitalWrite will break a lot of code.  The fastest form of this
Code: [Select]

 digitalWrite(13, HIGH);
 digitalWrite(13, LOW);

results in a 125 ns pulse on a 328.  

The pulse is about 30 times longer with digitalWrite in Arduino 018.

The value of Arduino hardware/software is ease of use for the novice.  Much of that is due to existing examples, articles, and books.  It is not worth breaking these for a few developers.

I suspect many developers wouldn't use the fast version.  

For example I wouldn't use it in my library for playing audio on the Adafruit Wave Shield.  The hardware SPI is used to read from an SD card so it uses bit-bang SPI in a timer ISR to drive the DAC.  It must send 88.4 KB/sec to the DAC for 44.2 ksps Wave files.  I just want total control over optimizing  this small but critical part of the library so I would not use a new fast digitalWrite.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: bperrybap on Jul 22, 2010, 10:06 pm
I think you guys are missing my points.
And by the way when I say "developer" I mean anybody writing
Arduino code not necessarily what someone might call a true/real programmer.

The main point I'm saying is don't pollute the API name space with an entire set of new API functions.
Simply layer the API in 3 functionoal layers (which is what my original suggestion was) and tack on under bars "_" to get the names for the additional layers rather than dream up cutsie new names for functions.

Functionality  of a given API call is independent of  its timing. Often additional functionality impacts timing but this is not always the case.

Those that want to delve into higher performance alternatives to the existing functions which do not offer all the handholding should have a way to do that.
Those that want want the additional hand holding  functionality of the currently defined API can still have it.
And those what want to ride any performance increases due to new and better library code being written without them having to modify any of their code should also be able to that as well.

Again, functionality of a library API call is independent of timing.
(well except for time delay API functions)
In some cases it is possible to preserve all the functionality yet make it much faster but in some cases it is not.

Consider the very recent Arduino library fix for atomic bit access. This is now in the mainline code and will be in the next release. This will slow down all digital pin operations.
What about that type of change? Is it also unacceptable to slow down an API call?

API functions should only offer to maintain the functionality defined, and if timing is not part of the API specification, then timing should not be a limiting factor in the implementation.


I also think that since it is possible to offer a  higher performance interface with the very same API inteface and functionality by simply re-writing the underlying digital pin code, that the API library code should be allowed to be updated to take advantage of this rather than being bloated up with new API functions that are exactly same only with different timing.

And that users of Arduino need to understand that they cannot depend on certain timing behaviours being maintained across all releases.  Those that need such exact timing behaviours of what they have in a particular release can simply stay with that given release and not upgrade.

This is no different than what currently happens in the commercial world on real products. Once you have a working product in the field you sometimes have to lock down toolsets. It is simply not always possible to upgrade to the latest greatest tools release even if you want to.


Quote

I suspect many developers wouldn't use the fast version.  

For example I wouldn't use it in my library for playing audio on the Adafruit Wave Shield.  The hardware SPI is used to read from an SD card so it uses bit-bang SPI in a timer ISR to drive the DAC.  It must send 88.4 KB/sec to the DAC for 44.2 ksps Wave files.  I just want total control over optimizing  this small but critical part of the library so I would not use a new fast digitalWrite.


Now this depends on how it is presented.
If it were faster by default (preserving the same behaviour), then most people would use the "faster" i/o (maybe not the fastest as that might take using an _ (underbar) function with less functionality)
and those that couldn't handle "faster" would go in and update their
code (which might be as simple as a define or include) to revert back to the existing bloated code and slower behaviour - assuming there way to re-enable that code.

The point being nobody really knows what percentage of users actually need or depend on the current slow Arduino digitial pin code implementation. My belief is that this is a very small minority and the vast majority of the users would actually see little or no difference and a small fraction would be happy with the increased speed.

It seems so odd to be defending a poor code implementation that is slow and bloated.
Normally people are happy when new versions of code
could get faster or smaller.


--- bill

Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: fat16lib on Jul 23, 2010, 02:56 am
It seems silly to claim that _(under bar) makes it ok to triple the number of names when they imply different functionality/performance.  

Does the under bar rule allow changes in functionality?  If not, you are stuck with the PWM problem and a lot of new under bar names.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: pjrc on Jul 23, 2010, 03:15 am
Quote
The point being nobody really knows what percentage of users actually need or depend on the current slow Arduino digitial pin code implementation.


I don't know this exact percentage, but I can tell you Teensy has used these optimizations for several months, with not a single case reported where faster digitalWrite() broke anything.  (virtually all problems are hard-coded assumptions about the timers, usually which pins correspond to them)  Of course, there are far fewer people using Teensy than Arduino.  But many of the major Arduino libraries and lots of random code has been tested on Teensy over the last year.  Teensy's far faster I/O simply has not been a compatibility issue.

The optimizations used in Teensy preserve the PWM disable functionality and aim to be exactly the same functionality as regular Arduino, only faster.  No new naming was added.

Regarding API naming conventions, I want to remain neutral.  My only comment here is that many people have now used these optimizations on Teensy, with no reported ill effects.  Indeed some people have noticed and appreciated the faster speed, even though it's not really advertised or documented.  Maybe I should do that??

Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: fat16lib on Jul 23, 2010, 04:47 am
I suspect Paul is right, few people will have problems with faster performance.

I did have a problem with a TC74 sensor when I optimized I2C bit-bang.  This sensor is limited to 100 kHz I2C.

Still it is hard to believe that a factor of thirty in execution speed of basic I/O functions will not cause problems.  That is why I designed my optimized functions to get a compile time error rather than revert to the slow functions.

I guess my experience in embedded control systems where predictable api execution time is important doesn't apply to the Arduino.

If Paul is correct then there should be only one digitalWrite and not several under bar variations.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Jul 23, 2010, 12:36 pm
Thinking about the ongoing discussion overnight:

the problem of testing against all the lilypad and arduino variants seems large. testing against all the libraries seems huge. testing against even a fraction of sketches seems insurmountable.

In terms of testing and reliability, it is one thing to write a library with a different name but quite another to rewrite wiring_digital and reuse the digitalWrite name.

I think that it is certainly not too soon for someone (bperrybap, fat16lib, your name in lights here,?) to write and post a variant of wiring_digital so that people can drop it in and we can get some community testing started. That is certainly a way to advance implementation of your ideas.

The plans for big changes in Arduino 1.0 have me a little concerned. it seems like various (worthwhile) API changes that will inevitably break existing sketches are all planned for that release. It would be good to have a long lead time on something as fundamental as a candidate for digitalWrite so that we know the new code's issues.

It might be good to get more specific direction from David Mellis before going too far, but I think it is not too soon to get a candidate posted.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: fat16lib on Jul 24, 2010, 06:44 pm
I agree with jrraines, guidance from David Mellis is key.  A vision/philosophy for the future of Arduino is needed.  

There needs to be a way to add new/improved functionality but keep the core simple and stable.

For example, UNIX added asynchronous read with a new API, aio_read(), and didn't change read().  Some bad ideas have been removed from UNIX/Linux and some bad ideas live on but a better version of the API has been added.

I suspect the following would not be acceptable to the core Arduino developers but it illustrates why guidance is needed.

If I were allowed add a new API I would use an class with inline functions high() and low() that run 5-6 times faster than digitalWrite().  Here is a prototype sketch to illustrate the idea.
Code: [Select]
#include "pins_arduino.h"

class PinWriter {
 volatile uint8_t* _reg;
 uint8_t _bit;
public:
 void high(void) {*_reg |= _bit;}
 void low(void) {*_reg &= ~_bit;}
 void init(uint8_t pin) {
   // clear PWM timer bit here in real version
   pinMode(pin, OUTPUT);
   _bit = digitalPinToBitMask(pin);
   uint8_t port = digitalPinToPort(pin);
   _reg = portOutputRegister(port);
 }
};

PinWriter pin;

uint8_t pinNumber = 13;

void setup(void) {
 pin.init(pinNumber);
}

void loop(void) {
 // make four pulses for scope
 pin.high();
 pin.low();
 pin.high();
 pin.low();
 pin.high();
 pin.low();  
 pin.high();
 pin.low();
 delay(1);
}

This class has a fairly constant execution time for examples I have tried.  About 12 cycles (750 ns) for high() and 13 cycles (810 ns) for low().  It does use a bit of RAM.

An optimized digitalWrite() facility for constant pin number would be available for advanced users the way aio_read() is available in UNIX.

To repeat, there are many possible choices for digitalWrite() so I think the principle Arduino developers need to provide guidance.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: pjrc on Jul 24, 2010, 07:01 pm
Please make sure any writing to pins is atomic.  Interrupts must be disabled during read-modify-write operations.  Please don't resurrect issue #146.

http://code.google.com/p/arduino/issues/detail?id=146
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: fat16lib on Jul 24, 2010, 07:54 pm
Paul I agree, the actual implementation should be atomic.  I just lifted code from 0018 to illustrate that there is a wide range of choices for improving/replacing digitalWrite().

Even with the extra instructions, an atomic version of this class is over four times faster than an atomic version of digitalWrite().

I believe there needs to be a clear statement of what the goal/requirement is for an improved digitalWrite()/digitalRead().  This would provide some constrains on proposed implementations.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: pjrc on Jul 24, 2010, 08:30 pm
I believe you have a very good point.

However, as a practical matter, every API discussion I've been involved in or watched as it developed ultimately came down to an arbitrary decision by David Mellis, or the discussion faded away and nothing happened.

I really do not want to get involved in more API discussions.

David certainly wants to optimize digitalWrite, since he asked me to write this, which I did (Ben Combee deserves credit for originally suggesting using __built_constant_p), and David flagged the code in issue #140 as a milestone for Arduino 1.0.

i have opinions about the API, but ultimately they don't matter.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: bperrybap on Jul 28, 2010, 01:49 am
Quote

I did have a problem with a TC74 sensor when I optimized I2C bit-bang.  This sensor is limited to 100 kHz I2C.

Still it is hard to believe that a factor of thirty in execution speed of basic I/O functions will not cause problems.  That is why I designed my optimized functions to get a compile time error rather than revert to the slow functions.


It isn't that faster library code can't potentially cause problems. It is that it only causes problems for code that is taking advantage of and depending on certain timing that is not guaranteed to be anything and can change from release to release since the API does not claim to offer any timing maximums or minimums.

In some situations that may acceptable or even required, but in those cases, the code may not be able to properly function on a newer release of the tools.

A better way to write portable library code that "just works". is to write any timing critical sections in assembly to avoid any potential timing issues. In the absence of that, then write C code that is known to generate a fixed sequence of known instructions with guaranteed timing.
(which is what happens when you can get CBI/SBI instructions).
Then,  for cases where specific timing minimums and hardware setup times must be ensured, use cycle delays to ensure that those hardware setup times are met.
Like in your case, you need to ensure that the signals don't toggle any faster than 100khz. There are portable ways to do this in C code that will work on any AVR at any clock rate. The latest GLCD library code that I wrote does exactly this.

This allows the library code to not only run on any AVR at any clock rate, but also get faster should the compiler or tools generate better code.

In the case of the GLCD library code, it steps WAY outside the arduino digital pin API routines to get the speed it needs. Now had the arduino libraries been written differently (same API just a different way), the GLCD library would have been able to use arduino supplied API functions rather than having to jump through hoops to avoid the arduino digital pin i/o library routines to get the needed i/o speed.


Quote

I guess my experience in embedded control systems where predictable api execution time is important doesn't apply to the Arduino.


Sure there are many times when predictable api execution times are mandatory for something to function properly. In those cases the API functions should offer, define, and honor such timing. In the absence of that, it is very foolish for the developer to write his code assuming any timing for a particular API function will remain a constant as tools and libraries are updated across releases.

In the larger picture, I don't believe that Arduino is any different when it comes to real-time timing needs from any other embedded environment.
i.e. eventually you may want to talk to hardware and that hardware may have critical signal timing that has to be met or not violated.

The difference is in Arduino there really are 2 types of needs for API capabilities:
- 1 for the Arduino application writers who are less technical
- 1 for the Arduino library writers that have real-time requirements
 and write a library to be called by the less technical users.

The two needs are quite different.

Quote

If Paul is correct then there should be only one digitalWrite and not several under bar variations.


For the most part, and in general, I agree with this.
However, there are times in a library where you have to have specific timing and even the newer much faster high level digital i/o functions like what paul is now using in his teensy libraries that replace the arduino digital pin i/o routines, may still not be quite good enough.
And that is where the additional (underbar) functions can really make the difference.

Also, keep in mind that to implement what paul is talking about,
(which is the same thing that I've been talking about - and paul and I have collaborated at times), function layering is required.
You have to have different layers of function calls (which have to have different names) that can be called depending on the type of argument (constant vs variable and direct i/o vs the additional handholding).
The function layer to eventually call is determined at compile time.

So whether you openly advertise the names of the sub functions that get called under the hood for doing the high level functions of handling timers, pwm, and direction bits, or the low level CBI/SBI instructions, you still have to have these subfunctions.

What I've been saying all along is that since you have to have these layered function calls anyway, go ahead and advertise them and each layers behavior.
Then use a common naming convention like tacking on additional (underbar) so that those that want or need the absolute fastest bit setting/clearing capability can use them and avoid any additional overhead.

This would allow those that want the existing API with all its handholding to get it, to get a  performance boost for using constants over the klunky existing ditital i/o routines.
And those that need the absolute fastest single instruction SBI/CBI code to set/clear bits have a known and documented way to do that for their real-time library needs.

--- bill
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Sep 06, 2010, 04:43 pm
There has been a little discussion on the developer's mailing list in the last few days about code size in the Arduino-19 release candidate as opposed to Arduino-18. I also was playing around with ODBuino recently and used this in the LCD code there to save space. It is interesting to compare this version of Blink
Code: [Select]
/* Blinking LED
* ------------
*
* turns on and off a light emitting diode(LED) connected to a digital  
* pin, in intervals of 2 seconds. Ideally we use pin 13 on the Arduino
* board because it has a resistor attached to it, needing only an LED

*
* Created 1 June 2005
* copyleft 2005 DojoDave <http://www.0j0.org>
* http://arduino.berlios.de
*
* based on an orginal by H. Barragan for the Wiring i/o board
*/

#define ledPin  13                 // LED connected to digital pin 13

void setup()
{
 pinMode(ledPin, OUTPUT);      // sets the digital pin as output
}

void loop()
{
 digitalWrite(ledPin, HIGH);   // sets the LED on
 delay(1000);                  // waits for a second
 digitalWrite(ledPin, LOW);    // sets the LED off
 delay(1000);                  // waits for a second
}

which takes 1392 bytes on Arduino-18 with this one:
Code: [Select]
#include <digitalWriteFast.h>
/* Blinking LED
* ------------
*
* turns on and off a light emitting diode(LED) connected to a digital  
* pin, in intervals of 2 seconds. Ideally we use pin 13 on the Arduino
* board because it has a resistor attached to it, needing only an LED

*
* Created 1 June 2005
* copyleft 2005 DojoDave <http://www.0j0.org>
* http://arduino.berlios.de
*
* based on an orginal by H. Barragan for the Wiring i/o board
*/

#define ledPin  13                 // LED connected to digital pin 13

void setup()
{
 pinModeFast(ledPin, OUTPUT);      // sets the digital pin as output
}

void loop()
{
 digitalWriteFast(ledPin, HIGH);   // sets the LED on
 delay(1000);                  // waits for a second
 digitalWriteFast(ledPin, LOW);    // sets the LED off
 delay(1000);                  // waits for a second
}

which takes 764 bytes.

Note that to get the savings I had to change ledPin from an int to a defined constant.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: westfw on Sep 07, 2010, 06:28 am
That would be a mega1280 based OBDuino, and a version of FastDigitalWrite() that DOESN'T turn off PWM, right?  The 1280 has a lot of PWM pins, and a lot of code in the digitalWrite() function to turn off PWM if it was on...
(on a 328, blink compiles to "only" 950 bytes with 19, 890 bytes with 18...)
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Sep 07, 2010, 12:06 pm
yes, I was using a Mega
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: e@ellenwang.com on Sep 08, 2010, 10:20 am
How about wrapping the macros in the standard do { ... } while (0) so the expansion will absorb a trailing semicolon?  Or better, use gcc inline functions.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Sep 08, 2010, 12:30 pm
earlier on this forum item, on 5/16, Paul said
Quote
I'd like to point out this macro approach is not my preferred coding style for this optimization.  I did it this way because David said he didn't like the verbose inline function approach I originally posted.  Indeed that way is much larger, but it has the advantage of not touching any pin when an illegal pin number is given as the input.  This macro, as currently implemented, will always write to some pin even if the pin number is too large.


This is only your second post and you seem to know a trick for getting rid of ';' in macros that I do not. I still feel very much like a newcomer to a lot of this stuff.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: e@ellenwang.com on Sep 08, 2010, 08:22 pm
jrraines wrote
Quote
earlier on this forum item, on 5/16, Paul said Quote:
Quote
I'd like to point out this macro approach is not my preferred coding style for this optimization.  I did it this way because David said he didn't like the verbose inline function approach I originally posted.  Indeed that way is much larger, but it has the advantage of not touching any pin when an illegal pin number is given as the input.  This macro, as currently implemented, will always write to some pin even if the pin number is too large.

This is only your second post and you seem to know a trick for getting rid of ';' in macros that I do not. I still feel very much like a newcomer to a lot of this stuff.

Well.  I've been writing C code for a long time, but only playing with the Arduino for a little bit.  And I cop to joining the thread late and not having read all the messages.

It surprises me that the inline function "is much larger".  I assume he means the generated code is bigger.  My experience with gcc is that inline functions are quite efficient, and the little test code I just tried confirmed that avr-gcc is not different.  Of course, digitalWriteFast.h is much more complex than my test.  Maybe I'll try to convert it to inline functions later.

Also, I think the macros that use if statements can also use the ?: conditional expression.  Is that also for better generated code?

Again, sorry if this has all been gone through before.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: Coding Badly on Sep 08, 2010, 09:50 pm
Quote
I assume he means the generated code is bigger.

Naw.  The source code is where the difference lies.  They both reduce to a single (or a few depending on the circumstances) machine instruction.  It's a matter of taste, maintenance, and functionality.  My preference is for inline-functions for one reason: I HATE debugging macro related bugs.

Quote
Of course, digitalWriteFast.h is much more complex than my test.  Maybe I'll try to convert it to inline functions later.

I suggest starting with Paul Stoffregen's work...
http://www.pjrc.com/teensy/teensyduino.html

Quote
Also, I think the macros that use if statements can also use the ?: conditional expression.

That's very likely true.  AlphaBeta uses conditional expressions frequently and he hasn't complained of any problems.

Quote
Is that also for better generated code?

I don't think there is any difference in the generated code.  "What's in an "if"? That which we call a condition.  By any other name would branch just the same."
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: westfw on Sep 09, 2010, 02:12 am
I didn't think you could meaningfully use __builtin_constant_p in inline functions?
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Sep 09, 2010, 02:15 am
the only place I used the ? trigrams was in digitalReadFast. I need it there because it yields a value I can assign to a variable. Bill Westfield had to explain that to me. see http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1266983837/1
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: e@ellenwang.com on Sep 09, 2010, 03:32 am
westfw wrote:
Quote
I didn't think you could meaningfully use __builtin_constant_p in inline functions?

That's a good point, but it turns out that it does work.  I was too lazy to look up the documentation to see whether it's defined behavior though.  I just tried this:

Code: [Select]
static int
g(int n)
{
  return n * 10;
}

static inline int
f(int n)
{
  if (__builtin_constant_p(n))
     return n + 10;
  return g(n);
}

int
main(int argc, char **argv)
{
  return f(10);
}


and got this (with avr-gcc -O -S):

Code: [Select]
...
main:
/* prologue: frame size=0 */
       ldi r28,lo8(__stack - 0)
       ldi r29,hi8(__stack - 0)
       out __SP_H__,r29
       out __SP_L__,r28
/* prologue end (size=4) */
       ldi r24,lo8(20)
       ldi r25,hi8(20)
/* epilogue: frame size=0 */
       rjmp exit
...


Not the most elegant test program but I was in a bit of a hurry.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: e@ellenwang.com on Sep 09, 2010, 09:28 am
jrraines wrote:
Quote
the only place I used the ? trigrams was in digitalReadFast. I need it there because it yields a value I can assign to a variable. Bill Westfield had to explain that to me. see http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1266983837/1

I see.

The advantage of using ?: in a macro instead of an if statement is that it keeps the expansion an expression.  Since the macro call looks like a function call, which is syntactically an expression, it's better when the expansion is also an expression.  This also means any trailing semicolon behaves correctly.

Sometimes, you need to cast one of the branches into a void, when the other branch is a function that returns void.  For example, in (c) ? f() : g(), if f() returns void but g() doesn't, then write it as (c) ? f() : (void) g().  (Yes, then the whole expression returns void and can only be used in a statement context, for side effects only, which is also the correct behavior.)

So, basically, ?: is better than an if statement in a macro, if it's possible.  Now, gcc also has the ({...}) syntax,, so pretty much any macro can be written with ?:.  On the other hand, why not just use an inline function, which doesn't evaluate arguments more than once?
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: westfw on Sep 09, 2010, 10:20 am
That's certainly MUCH prettier (and a lot easier to read) than the macro version!
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Sep 09, 2010, 12:09 pm
Several times I have thought I should put digitalWriteFast onto the Playground section of the site. When I have looked at doing this, I can't figure out what category it would go in. This is, of course, a reflection of Paul Stoffregen's point, that it belongs in the core, not in a library.

Nonetheless, it seems to me it should be on the Playground. I am open to suggestions as to what the category should be.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: westfw on Sep 10, 2010, 05:28 am
New category: "Advanced Optimizations, including alternative Core functionality."

There has been discussion of a lot of stuff in the forums that could go there, including (for example) hints on just how to go about DOING optimization...
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: westfw on Sep 11, 2010, 02:41 am
I did some experiments with the msp430 implementation the Arduino core I've been working on (with a digitalWrite() macro that uses __builtin_constant_p), and I found that using an inline function caused the results to be "overly and mysteriously dependent on the optimization flags given to the compiler."  I think I'll stick with the macros...
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Oct 21, 2010, 03:53 am
I made a very small change to support the Mega2560. I've tested it on the Mega1280 and the UNO. A 2560 is in the mail. I also included in the download versions of the test code for both the Uno and for the Mega. The version for the Mega actually runs through the tests 3 times in sequence for one time through loop(). It is special in that it demonstrates that the arduino gcc actually can handle program code that gets to a size that big.

i.e. Binary sketch size: 87296 bytes (of a 126976 byte maximum)

(Sketches that contain a huge amount of PROGMEM data and then code that gets stored higher up don't work at this time see: http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1274821710 )

The download is at http://code.google.com/p/digitalwritefast/downloads/list
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: mellis on Oct 21, 2010, 02:29 pm
Thanks for maintaining this, jrraines.  I'd like to include it in the Arduino software, soon, as a replacement for the current digitalRead() (at least and probably pinMode() and digitalRead(), too).  Any reason not to?  Or any reason why it should stay a separate function?
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Oct 21, 2010, 08:06 pm
I'm certainly in favor of adding it to the core.

There has been much discussion of pros and cons by people who are wiser than I am.

This just handles the case where the pin numbers are defined at compile time, of course. So you can't really get rid of the existing versions to handle cases like subroutines that get pin numbers passed in or loops with different pin numbers on successive passes through the code. But this is just a big macro so it will never make code longer or slow down runtime speed.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Oct 26, 2010, 03:09 am
I made a lot of progress Friday night and early Saturday morning and have had a confusing and frustrating time since.

I actually have/had something working to move this stuff into the core in wiring.h and wiring_digital.c  I emailed it to Paul Stoffregen hoping to get him to review it and haven't heard back.

I struggled all day Saturday trying to get a version of digitalWriteFast that would cooperate with both the current versions of wiring.h/wiring_digital.c and my modified ones.

My Mega2560 came and it does not work with what I posted a few days ago. It seems to me that I should just be able to use the same version of the test code as I use for the Mega1280, select Mega2560 from the boards menu and it should have just worked. But there are at least some errors in the test program on almost every pin pair. AND THE CODE SIZE FOR THE 2560 IS 20000 bytes smaller! I was very puzzled.

Here is the 1280 disassembly (which works):
Code: [Select]
analogWrite(2,254);
    2dc:      82 e0             ldi      r24, 0x02      ; 2
    2de:      6e ef             ldi      r22, 0xFE      ; 254
    2e0:      70 e0             ldi      r23, 0x00      ; 0
    2e2:      0e 94 14 a3       call      0x14628      ; 0x14628 <analogWrite>
pinModeFast(2,INPUT);
    2e6:      6c 98             cbi      0x0d, 4      ; 13
digitalWriteFast(2,HIGH);
    2e8:      80 91 90 00       lds      r24, 0x0090
    2ec:      8f 7d             andi      r24, 0xDF      ; 223
    2ee:      80 93 90 00       sts      0x0090, r24
    2f2:      74 9a             sbi      0x0e, 4      ; 14
pinModeFast(5,OUTPUT);
    2f4:      6b 9a             sbi      0x0d, 3      ; 13
digitalWriteFast(5,LOW);
    2f6:      80 91 90 00       lds      r24, 0x0090
    2fa:      8f 77             andi      r24, 0x7F      ; 127
    2fc:      80 93 90 00       sts      0x0090, r24
    300:      73 98             cbi      0x0e, 3      ; 14
delay(1);
    302:      61 e0             ldi      r22, 0x01      ; 1
    304:      70 e0             ldi      r23, 0x00      ; 0
    306:      80 e0             ldi      r24, 0x00      ; 0
    308:      90 e0             ldi      r25, 0x00      ; 0
    30a:      0e 94 59 a2       call      0x144b2      ; 0x144b2 <delay>
if((int) digitalReadFast(2) != LOW) error(2,5,1);
    30e:      80 91 90 00       lds      r24, 0x0090
    312:      8f 7d             andi      r24, 0xDF      ; 223
    314:      80 93 90 00       sts      0x0090, r24
    318:      64 9b             sbis      0x0c, 4      ; 12
    31a:      07 c0             rjmp      .+14           ; 0x32a <loop+0x5e>
    31c:      82 e0             ldi      r24, 0x02      ; 2
    31e:      90 e0             ldi      r25, 0x00      ; 0
    320:      65 e0             ldi      r22, 0x05      ; 5
    322:      70 e0             ldi      r23, 0x00      ; 0
    324:      41 e0             ldi      r20, 0x01      ; 1
    326:      50 e0             ldi      r21, 0x00      ; 0
    328:      9a df             rcall      .-204          ; 0x25e <_Z5erroriii>

analogWrite(5,254);
    32a:      85 e0             ldi      r24, 0x05      ; 5
    32c:      6e ef             ldi      r22, 0xFE      ; 254
    32e:      70 e0             ldi      r23, 0x00      ; 0
    330:      0e 94 14 a3       call      0x14628      ; 0x14628 <analogWrite>
pinModeFast(2,INPUT);
    334:      6c 98             cbi      0x0d, 4      ; 13
digitalWriteFast(2,HIGH);
    336:      80 91 90 00       lds      r24, 0x0090
    33a:      8f 7d             andi      r24, 0xDF      ; 223
    33c:      80 93 90 00       sts      0x0090, r24
    340:      74 9a             sbi      0x0e, 4      ; 14
pinModeFast(5,OUTPUT);
    342:      6b 9a             sbi      0x0d, 3      ; 13
digitalWriteFast(5,LOW);
    344:      80 91 90 00       lds      r24, 0x0090
    348:      8f 77             andi      r24, 0x7F      ; 127
    34a:      80 93 90 00       sts      0x0090, r24
    34e:      73 98             cbi      0x0e, 3      ; 14
delay(1);
    350:      61 e0             ldi      r22, 0x01      ; 1
    352:      70 e0             ldi      r23, 0x00      ; 0
    354:      80 e0             ldi      r24, 0x00      ; 0
    356:      90 e0             ldi      r25, 0x00      ; 0
    358:      0e 94 59 a2       call      0x144b2      ; 0x144b2 <delay>
if((int) digitalReadFast(2) != LOW) error(2,5,1);
    35c:      80 91 90 00       lds      r24, 0x0090
    360:      8f 7d             andi      r24, 0xDF      ; 223
    362:      80 93 90 00       sts      0x0090, r24
    366:      64 9b             sbis      0x0c, 4      ; 12
    368:      07 c0             rjmp      .+14           ; 0x378 <loop+0xac>
    36a:      82 e0             ldi      r24, 0x02      ; 2
    36c:      90 e0             ldi      r25, 0x00      ; 0
    36e:      65 e0             ldi      r22, 0x05      ; 5
    370:      70 e0             ldi      r23, 0x00      ; 0
    372:      41 e0             ldi      r20, 0x01      ; 1
    374:      50 e0             ldi      r21, 0x00      ; 0
    376:      73 df             rcall      .-282          ; 0x25e <_Z5erroriii>


here is corresponding 2560 disassembly from 2 of the cases that fail:
Code: [Select]
analogWrite(2,254);
    2e0:      82 e0             ldi      r24, 0x02      ; 2
    2e2:      6e ef             ldi      r22, 0xFE      ; 254
    2e4:      70 e0             ldi      r23, 0x00      ; 0
    2e6:      0e 94 c1 7b       call      0xf782      ; 0xf782 <analogWrite>
pinModeFast(2,INPUT);
    2ea:      52 98             cbi      0x0a, 2      ; 10
digitalWriteFast(2,HIGH);
    2ec:      5a 9a             sbi      0x0b, 2      ; 11
pinModeFast(5,OUTPUT);
    2ee:      55 9a             sbi      0x0a, 5      ; 10
digitalWriteFast(5,LOW);
    2f0:      84 b5             in      r24, 0x24      ; 36
    2f2:      8f 7d             andi      r24, 0xDF      ; 223
    2f4:      84 bd             out      0x24, r24      ; 36
    2f6:      5d 98             cbi      0x0b, 5      ; 11
delay(1);
    2f8:      61 e0             ldi      r22, 0x01      ; 1
    2fa:      70 e0             ldi      r23, 0x00      ; 0
    2fc:      80 e0             ldi      r24, 0x00      ; 0
    2fe:      90 e0             ldi      r25, 0x00      ; 0
    300:      0e 94 06 7b       call      0xf60c      ; 0xf60c <delay>
if((int) digitalReadFast(2) != LOW) error(2,5,1);
    304:      4a 9b             sbis      0x09, 2      ; 9
    306:      07 c0             rjmp      .+14           ; 0x316 <loop+0x46>
    308:      82 e0             ldi      r24, 0x02      ; 2
    30a:      90 e0             ldi      r25, 0x00      ; 0
    30c:      65 e0             ldi      r22, 0x05      ; 5
    30e:      70 e0             ldi      r23, 0x00      ; 0
    310:      41 e0             ldi      r20, 0x01      ; 1
    312:      50 e0             ldi      r21, 0x00      ; 0
    314:      a6 df             rcall      .-180          ; 0x262 <_Z5erroriii>

analogWrite(5,254);
    316:      85 e0             ldi      r24, 0x05      ; 5
    318:      6e ef             ldi      r22, 0xFE      ; 254
    31a:      70 e0             ldi      r23, 0x00      ; 0
    31c:      0e 94 c1 7b       call      0xf782      ; 0xf782 <analogWrite>
pinModeFast(2,INPUT);
    320:      52 98             cbi      0x0a, 2      ; 10
digitalWriteFast(2,HIGH);
    322:      5a 9a             sbi      0x0b, 2      ; 11
pinModeFast(5,OUTPUT);
    324:      55 9a             sbi      0x0a, 5      ; 10
digitalWriteFast(5,LOW);
    326:      84 b5             in      r24, 0x24      ; 36
    328:      8f 7d             andi      r24, 0xDF      ; 223
    32a:      84 bd             out      0x24, r24      ; 36
    32c:      5d 98             cbi      0x0b, 5      ; 11
delay(1);
    32e:      61 e0             ldi      r22, 0x01      ; 1
    330:      70 e0             ldi      r23, 0x00      ; 0
    332:      80 e0             ldi      r24, 0x00      ; 0
    334:      90 e0             ldi      r25, 0x00      ; 0
    336:      0e 94 06 7b       call      0xf60c      ; 0xf60c <delay>
if((int) digitalReadFast(2) != LOW) error(2,5,1);
    33a:      4a 9b             sbis      0x09, 2      ; 9
    33c:      07 c0             rjmp      .+14           ; 0x34c <loop+0x7c>
    33e:      82 e0             ldi      r24, 0x02      ; 2
    340:      90 e0             ldi      r25, 0x00      ; 0
    342:      65 e0             ldi      r22, 0x05      ; 5
    344:      70 e0             ldi      r23, 0x00      ; 0
    346:      41 e0             ldi      r20, 0x01      ; 1
    348:      50 e0             ldi      r21, 0x00      ; 0
    34a:      8b df             rcall      .-234          ; 0x262 <_Z5erroriii>



and here is disassembly for the uno (which has different pin/port correspondences than the 2560; this code also works):
Code: [Select]
analogWrite(2,254);
    194:      82 e0             ldi      r24, 0x02      ; 2
    196:      6e ef             ldi      r22, 0xFE      ; 254
    198:      70 e0             ldi      r23, 0x00      ; 0
    19a:      0e 94 87 14       call      0x290e      ; 0x290e <analogWrite>
pinModeFast(2,INPUT);
    19e:      52 98             cbi      0x0a, 2      ; 10
digitalWriteFast(2,HIGH);
    1a0:      5a 9a             sbi      0x0b, 2      ; 11
pinModeFast(5,OUTPUT);
    1a2:      55 9a             sbi      0x0a, 5      ; 10
digitalWriteFast(5,LOW);
    1a4:      84 b5             in      r24, 0x24      ; 36
    1a6:      8f 7d             andi      r24, 0xDF      ; 223
    1a8:      84 bd             out      0x24, r24      ; 36
    1aa:      5d 98             cbi      0x0b, 5      ; 11
delay(1);
    1ac:      61 e0             ldi      r22, 0x01      ; 1
    1ae:      70 e0             ldi      r23, 0x00      ; 0
    1b0:      80 e0             ldi      r24, 0x00      ; 0
    1b2:      90 e0             ldi      r25, 0x00      ; 0
    1b4:      0e 94 f3 13       call      0x27e6      ; 0x27e6 <delay>
if((int) digitalReadFast(2) != LOW) error(2,5,1);
    1b8:      4a 9b             sbis      0x09, 2      ; 9
    1ba:      07 c0             rjmp      .+14           ; 0x1ca <loop+0x46>
    1bc:      82 e0             ldi      r24, 0x02      ; 2
    1be:      90 e0             ldi      r25, 0x00      ; 0
    1c0:      65 e0             ldi      r22, 0x05      ; 5
    1c2:      70 e0             ldi      r23, 0x00      ; 0
    1c4:      41 e0             ldi      r20, 0x01      ; 1
    1c6:      50 e0             ldi      r21, 0x00      ; 0
    1c8:      a6 df             rcall      .-180          ; 0x116 <_Z5erroriii>

analogWrite(5,254);
    1ca:      85 e0             ldi      r24, 0x05      ; 5
    1cc:      6e ef             ldi      r22, 0xFE      ; 254
    1ce:      70 e0             ldi      r23, 0x00      ; 0
    1d0:      0e 94 87 14       call      0x290e      ; 0x290e <analogWrite>
pinModeFast(2,INPUT);
    1d4:      52 98             cbi      0x0a, 2      ; 10
digitalWriteFast(2,HIGH);
    1d6:      5a 9a             sbi      0x0b, 2      ; 11
pinModeFast(5,OUTPUT);
    1d8:      55 9a             sbi      0x0a, 5      ; 10
digitalWriteFast(5,LOW);
    1da:      84 b5             in      r24, 0x24      ; 36
    1dc:      8f 7d             andi      r24, 0xDF      ; 223
    1de:      84 bd             out      0x24, r24      ; 36
    1e0:      5d 98             cbi      0x0b, 5      ; 11
delay(1);
    1e2:      61 e0             ldi      r22, 0x01      ; 1
    1e4:      70 e0             ldi      r23, 0x00      ; 0
    1e6:      80 e0             ldi      r24, 0x00      ; 0
    1e8:      90 e0             ldi      r25, 0x00      ; 0
    1ea:      0e 94 f3 13       call      0x27e6      ; 0x27e6 <delay>
if((int) digitalReadFast(2) != LOW) error(2,5,1);
    1ee:      4a 9b             sbis      0x09, 2      ; 9
    1f0:      07 c0             rjmp      .+14           ; 0x200 <loop+0x7c>
    1f2:      82 e0             ldi      r24, 0x02      ; 2
    1f4:      90 e0             ldi      r25, 0x00      ; 0
    1f6:      65 e0             ldi      r22, 0x05      ; 5
    1f8:      70 e0             ldi      r23, 0x00      ; 0
    1fa:      41 e0             ldi      r20, 0x01      ; 1
    1fc:      50 e0             ldi      r21, 0x00      ; 0
    1fe:      8b df             rcall      .-234          ; 0x116 <_Z5erroriii>



So its pretty clear what's going on. Its picking the pin/port for a 328 Arduino. To be continued...
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Oct 26, 2010, 03:13 am
it seems like the problem has to be in the second line of code below but I cannot see it  (I have recently added the comments on the #endif statements, you can consider them suspect):
Code: [Select]
#if !defined(digitalPinToPortReg)
#if !( defined(__AVR_ATmega1280__) || defined(__AVR_ATmega2560__) )

// Standard Arduino Pins
#define digitalPinToPortReg(P) \
(((P) >= 0 && (P) <= 7) ? &PORTD : (((P) >= 8 && (P) <= 13) ? &PORTB : &PORTC))
#define digitalPinToDDRReg(P) \
(((P) >= 0 && (P) <= 7) ? &DDRD : (((P) >= 8 && (P) <= 13) ? &DDRB : &DDRC))
#define digitalPinToPINReg(P) \
(((P) >= 0 && (P) <= 7) ? &PIND : (((P) >= 8 && (P) <= 13) ? &PINB : &PINC))
#define digitalPinToBit(P) \
(((P) >= 0 && (P) <= 7) ? (P) : (((P) >= 8 && (P) <= 13) ? (P) - 8 : (P) - 14))

#if defined(__AVR_ATmega8__)
// 3 PWM
#define digitalPinToTimer(P) \
(((P) ==  9 || (P) == 10) ? &TCCR1A : (((P) == 11) ? &TCCR2 : 0))
#define digitalPinToTimerBit(P) \
(((P) ==  9) ? COM1A1 : (((P) == 10) ? COM1B1 : COM21))
#else  //168,328

// 6 PWM
#define digitalPinToTimer(P) \
(((P) ==  6 || (P) ==  5) ? &TCCR0A : \
       (((P) ==  9 || (P) == 10) ? &TCCR1A : \
       (((P) == 11 || (P) ==  3) ? &TCCR2A : 0)))
#define digitalPinToTimerBit(P) \
(((P) ==  6) ? COM0A1 : (((P) ==  5) ? COM0B1 : \
       (((P) ==  9) ? COM1A1 : (((P) == 10) ? COM1B1 : \
       (((P) == 11) ? COM2A1 : COM2B1)))))
#endif  //defined(__AVR_ATmega8__)

#else
// Arduino Mega Pins
#define digitalPinToPortReg(P) \
(((P) >= 22 && (P) <= 29) ? &PORTA : \
       ((((P) >= 10 && (P) <= 13) || ((P) >= 50 && (P) <= 53)) ? &PORTB : \
       (((P) >= 30 && (P) <= 37) ? &PORTC : \
       ((((P) >= 18 && (P) <= 21) || (P) == 38) ? &PORTD : \
       ((((P) >= 0 && (P) <= 3) || (P) == 5) ? &PORTE : \
       (((P) >= 54 && (P) <= 61) ? &PORTF : \
       ((((P) >= 39 && (P) <= 41) || (P) == 4) ? &PORTG : \
       ((((P) >= 6 && (P) <= 9) || (P) == 16 || (P) == 17) ? &PORTH : \
       (((P) == 14 || (P) == 15) ? &PORTJ : \
       (((P) >= 62 && (P) <= 69) ? &PORTK : &PORTL))))))))))

#define digitalPinToDDRReg(P) \
(((P) >= 22 && (P) <= 29) ? &DDRA : \
       ((((P) >= 10 && (P) <= 13) || ((P) >= 50 && (P) <= 53)) ? &DDRB : \
       (((P) >= 30 && (P) <= 37) ? &DDRC : \
       ((((P) >= 18 && (P) <= 21) || (P) == 38) ? &DDRD : \
       ((((P) >= 0 && (P) <= 3) || (P) == 5) ? &DDRE : \
       (((P) >= 54 && (P) <= 61) ? &DDRF : \
       ((((P) >= 39 && (P) <= 41) || (P) == 4) ? &DDRG : \
       ((((P) >= 6 && (P) <= 9) || (P) == 16 || (P) == 17) ? &DDRH : \
       (((P) == 14 || (P) == 15) ? &DDRJ : \
       (((P) >= 62 && (P) <= 69) ? &DDRK : &DDRL))))))))))

#define digitalPinToPINReg(P) \
(((P) >= 22 && (P) <= 29) ? &PINA : \
       ((((P) >= 10 && (P) <= 13) || ((P) >= 50 && (P) <= 53)) ? &PINB : \
       (((P) >= 30 && (P) <= 37) ? &PINC : \
       ((((P) >= 18 && (P) <= 21) || (P) == 38) ? &PIND : \
       ((((P) >= 0 && (P) <= 3) || (P) == 5) ? &PINE : \
       (((P) >= 54 && (P) <= 61) ? &PINF : \
       ((((P) >= 39 && (P) <= 41) || (P) == 4) ? &PING : \
       ((((P) >= 6 && (P) <= 9) || (P) == 16 || (P) == 17) ? &PINH : \
       (((P) == 14 || (P) == 15) ? &PINJ : \
       (((P) >= 62 && (P) <= 69) ? &PINK : &PINL))))))))))

#define digitalPinToBit(P) \
(((P) >=  7 && (P) <=  9) ? (P) - 3 : \
       (((P) >= 10 && (P) <= 13) ? (P) - 6 : \
       (((P) >= 22 && (P) <= 29) ? (P) - 22 : \
       (((P) >= 30 && (P) <= 37) ? 37 - (P) : \
       (((P) >= 39 && (P) <= 41) ? 41 - (P) : \
       (((P) >= 42 && (P) <= 49) ? 49 - (P) : \
       (((P) >= 50 && (P) <= 53) ? 53 - (P) : \
       (((P) >= 54 && (P) <= 61) ? (P) - 54 : \
       (((P) >= 62 && (P) <= 69) ? (P) - 62 : \
       (((P) == 0 || (P) == 15 || (P) == 17 || (P) == 21) ? 0 : \
       (((P) == 1 || (P) == 14 || (P) == 16 || (P) == 20) ? 1 : \
       (((P) == 19) ? 2 : \
       (((P) == 5 || (P) == 6 || (P) == 18) ? 3 : \
       (((P) == 2) ? 4 : \
       (((P) == 3 || (P) == 4) ? 5 : 7)))))))))))))))

// 15 PWM
#define digitalPinToTimer(P) \
(((P) == 13 || (P) ==  4) ? &TCCR0A : \
       (((P) == 11 || (P) == 12) ? &TCCR1A : \
       (((P) == 10 || (P) ==  9) ? &TCCR2A : \
       (((P) ==  5 || (P) ==  2 || (P) ==  3) ? &TCCR3A : \
       (((P) ==  6 || (P) ==  7 || (P) ==  8) ? &TCCR4A : \
       (((P) == 46 || (P) == 45 || (P) == 44) ? &TCCR5A : 0))))))
#define digitalPinToTimerBit(P) \
(((P) == 13) ? COM0A1 : (((P) ==  4) ? COM0B1 : \
       (((P) == 11) ? COM1A1 : (((P) == 12) ? COM1B1 : \
       (((P) == 10) ? COM2A1 : (((P) ==  9) ? COM2B1 : \
       (((P) ==  5) ? COM3A1 : (((P) ==  2) ? COM3B1 : (((P) ==  3) ? COM3C1 : \
       (((P) ==  6) ? COM4A1 : (((P) ==  7) ? COM4B1 : (((P) ==  8) ? COM4C1 : \
       (((P) == 46) ? COM5A1 : (((P) == 45) ? COM5B1 : COM5C1))))))))))))))

#endif  //mega
#endif  //#if !defined(digitalPinToPortReg)

#ifndef __digitalWrite
     #ifndef digitalWriteFast
           #define digitalWriteFast(P, V) \
           if (__builtin_constant_p(P) && __builtin_constant_p(V)) { \
                                   if (digitalPinToTimer(P)) \
                                               bitClear(*digitalPinToTimer(P), digitalPinToTimerBit(P)); \
                                   bitWrite(*digitalPinToPortReg(P), digitalPinToBit(P), (V)); \
                       } else { \
                                   digitalWrite((P), (V)); \
                       }
     #endif  //#ifndef digitalWriteFast

     #ifndef digitalWriteFast2
           #define digitalWriteFast2(P, V) \
           if (__builtin_constant_p(P) && __builtin_constant_p(V)) { \
           bitWrite(*digitalPinToPortReg(P), digitalPinToBit(P), (V)); \
           } else { \
           digitalWrite((P), (V)); \
           }
     #endif  //#ifndef digitalWriteFast2


     #else
           #define digitalWriteFast(  digitalWrite(
           #define digitalWriteFast2(  digitalWrite(
#endif //#ifndef __digitalWrite

#ifndef __pinMode
     #ifndef pinModeFast
           #define pinModeFast(P, V) \
           if (__builtin_constant_p(P) && __builtin_constant_p(V)) { \
                                   bitWrite(*digitalPinToDDRReg(P), digitalPinToBit(P), (V)); \
                       } else {  \
                                   pinMode((P), (V)); \
                       }
     #endif  //#ifndef pinModeFast

     #if !defined(pinModeFast2)
           #define pinModeFast2(P, V) \
           if (__builtin_constant_p(P) && __builtin_constant_p(V)) { \
           if (digitalPinToTimer(P)) \
           bitClear(*digitalPinToTimer(P), digitalPinToTimerBit(P)); \
           bitWrite(*digitalPinToDDRReg(P), digitalPinToBit(P), (V)); \
           } else {  \
           pinMode((P), (V)); \
           }
     #endif

     #else
           #define pinModeFast(  pinMode(
           #define pinModeFast2( pinMode(
#endif //#ifndef __pinMode

#ifndef __digitalRead
     #ifndef digitalReadFast
           #define digitalReadFast(P) ( (int) __digitalReadFast__((P)) )
           #define __digitalReadFast__(P ) \
           (__builtin_constant_p(P) ) ? ( \
                                   digitalPinToTimer(P) ? ( \
                                            bitClear(*digitalPinToTimer(P), digitalPinToTimerBit(P)) ,  \
                                                      bitRead(*digitalPinToPINReg(P), digitalPinToBit(P))) : \
                                     bitRead(*digitalPinToPINReg(P), digitalPinToBit(P)))  : \
                                   digitalRead((P))
     #endif   //#if !defined(digitalReadFast)

     #if !defined(digitalReadFast2)
           #define digitalReadFast2(P) ( (int) __digitalReadFast2__((P)) )
           #define __digitalReadFast2__(P ) \
           (__builtin_constant_p(P) ) ? ( \
           ( bitRead(*digitalPinToPINReg(P), digitalPinToBit(P))) ) : \
           digitalRead((P))
     #endif

     #else
           #define digitalReadFast( digitalRead(
           #define digitalReadFast2( digitalRead(
#endif  //#ifndef __digitalRead
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: e@ellenwang.com on Oct 26, 2010, 04:46 am
__AVR_ATmega2560__ is not defined?  That seems to be the only way for the mega256 code to be the same as the not-mega code.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Oct 26, 2010, 01:13 pm
That symbol is used throughout the core now, for instance in WProgram.h, pins_arduino. The !( ) are not used in the places I looked, though.

What I tried this morning:
I went back to Arduino-18, updated digitalWriteFast there. I'd already loaded Mark Sproul's 2560 bootloader etc. there. No difference from Arduino 21.

I added a few lines just above the part of the sketch I disassembled, mimicking to some extent the stuff that doesn't seem to work in digitalWriteFast (indeed I pasted the key line in from digitalWritefast.h):
Code: [Select]
#if !( defined(__AVR_ATmega2560__) ||defined(__AVR_ATmega1280__) )
Serial.println("This must be the UNO.");
#else
#if defined(__AVR_ATmega2560__)
Serial.println("This is the Mega2560.");
#endif
#if defined(__AVR_ATmega1280__)
Serial.println("This is the Mega1280.");
#endif
#endif  //#if !( defined(__AVR_ATmega2560__) ||defined(__AVR_ATmega1280__) )
//=============================the output from progprog.py goes below===================
analogWrite(2,254);
pinModeFast(2,INPUT);
digitalWriteFast(2,HIGH);
pinModeFast(5,OUTPUT);
digitalWriteFast(5,LOW);
delay(1);
if((int) digitalReadFast(2) != LOW) error(2,5,1);


What printed was correctly "This is the Mega2560."

I added a line at the top of my sketch
#define __AVR_ATmega2560__

that did not make a difference.

I changed that line to
#define __AVR_ATmega1280__
In the Boards menu I still selected the 2560.
now the test program runs on the 2560, reports no errors and is the appropriate size. This is, of course, not an actual solution.

I feel befuddled.

Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Oct 26, 2010, 01:53 pm
Is there some way an #undef could be snuck in between ?!?
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: mellis on Oct 26, 2010, 04:57 pm
Did you try:
Code: [Select]
#if !defined(__AVR_ATmega2560__) && !defined(__AVR_ATmega1280__)
?
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Oct 26, 2010, 11:36 pm
doing it with ! && ! didn't work either.

I then tried reversing the order of the code, too, so that the mega stuff came first. When I did that I pasted the line I think is at fault in directly from arduino_pins. Still didn't work.

I feel like I must be looking in the wrong place but I don't see where else it could be.
Code: [Select]
#if !defined(digitalPinToPortReg)
#if defined(__AVR_ATmega1280__) || defined(__AVR_ATmega2560__)
// Arduino Mega Pins
#define digitalPinToPortReg(P) \
(((P) >= 22 && (P) <= 29) ? &PORTA : \
((((P) >= 10 && (P) <= 13) || ((P) >= 50 && (P) <= 53)) ? &PORTB : \
(((P) >= 30 && (P) <= 37) ? &PORTC : \
((((P) >= 18 && (P) <= 21) || (P) == 38) ? &PORTD : \
((((P) >= 0 && (P) <= 3) || (P) == 5) ? &PORTE : \
(((P) >= 54 && (P) <= 61) ? &PORTF : \
((((P) >= 39 && (P) <= 41) || (P) == 4) ? &PORTG : \
((((P) >= 6 && (P) <= 9) || (P) == 16 || (P) == 17) ? &PORTH : \
(((P) == 14 || (P) == 15) ? &PORTJ : \
(((P) >= 62 && (P) <= 69) ? &PORTK : &PORTL))))))))))

#define digitalPinToDDRReg(P) \
(((P) >= 22 && (P) <= 29) ? &DDRA : \
((((P) >= 10 && (P) <= 13) || ((P) >= 50 && (P) <= 53)) ? &DDRB : \
(((P) >= 30 && (P) <= 37) ? &DDRC : \
((((P) >= 18 && (P) <= 21) || (P) == 38) ? &DDRD : \
((((P) >= 0 && (P) <= 3) || (P) == 5) ? &DDRE : \
(((P) >= 54 && (P) <= 61) ? &DDRF : \
((((P) >= 39 && (P) <= 41) || (P) == 4) ? &DDRG : \
((((P) >= 6 && (P) <= 9) || (P) == 16 || (P) == 17) ? &DDRH : \
(((P) == 14 || (P) == 15) ? &DDRJ : \
(((P) >= 62 && (P) <= 69) ? &DDRK : &DDRL))))))))))

#define digitalPinToPINReg(P) \
(((P) >= 22 && (P) <= 29) ? &PINA : \
((((P) >= 10 && (P) <= 13) || ((P) >= 50 && (P) <= 53)) ? &PINB : \
(((P) >= 30 && (P) <= 37) ? &PINC : \
((((P) >= 18 && (P) <= 21) || (P) == 38) ? &PIND : \
((((P) >= 0 && (P) <= 3) || (P) == 5) ? &PINE : \
(((P) >= 54 && (P) <= 61) ? &PINF : \
((((P) >= 39 && (P) <= 41) || (P) == 4) ? &PING : \
((((P) >= 6 && (P) <= 9) || (P) == 16 || (P) == 17) ? &PINH : \
(((P) == 14 || (P) == 15) ? &PINJ : \
(((P) >= 62 && (P) <= 69) ? &PINK : &PINL))))))))))

#define digitalPinToBit(P) \
(((P) >=  7 && (P) <=  9) ? (P) - 3 : \
(((P) >= 10 && (P) <= 13) ? (P) - 6 : \
(((P) >= 22 && (P) <= 29) ? (P) - 22 : \
(((P) >= 30 && (P) <= 37) ? 37 - (P) : \
(((P) >= 39 && (P) <= 41) ? 41 - (P) : \
(((P) >= 42 && (P) <= 49) ? 49 - (P) : \
(((P) >= 50 && (P) <= 53) ? 53 - (P) : \
(((P) >= 54 && (P) <= 61) ? (P) - 54 : \
(((P) >= 62 && (P) <= 69) ? (P) - 62 : \
(((P) == 0 || (P) == 15 || (P) == 17 || (P) == 21) ? 0 : \
(((P) == 1 || (P) == 14 || (P) == 16 || (P) == 20) ? 1 : \
(((P) == 19) ? 2 : \
(((P) == 5 || (P) == 6 || (P) == 18) ? 3 : \
(((P) == 2) ? 4 : \
(((P) == 3 || (P) == 4) ? 5 : 7)))))))))))))))

// 15 PWM
#define digitalPinToTimer(P) \
(((P) == 13 || (P) ==  4) ? &TCCR0A : \
(((P) == 11 || (P) == 12) ? &TCCR1A : \
(((P) == 10 || (P) ==  9) ? &TCCR2A : \
(((P) ==  5 || (P) ==  2 || (P) ==  3) ? &TCCR3A : \
(((P) ==  6 || (P) ==  7 || (P) ==  8) ? &TCCR4A : \
(((P) == 46 || (P) == 45 || (P) == 44) ? &TCCR5A : 0))))))
#define digitalPinToTimerBit(P) \
(((P) == 13) ? COM0A1 : (((P) ==  4) ? COM0B1 : \
(((P) == 11) ? COM1A1 : (((P) == 12) ? COM1B1 : \
(((P) == 10) ? COM2A1 : (((P) ==  9) ? COM2B1 : \
(((P) ==  5) ? COM3A1 : (((P) ==  2) ? COM3B1 : (((P) ==  3) ? COM3C1 : \
(((P) ==  6) ? COM4A1 : (((P) ==  7) ? COM4B1 : (((P) ==  8) ? COM4C1 : \
(((P) == 46) ? COM5A1 : (((P) == 45) ? COM5B1 : COM5C1))))))))))))))

#else

// Standard Arduino Pins
#define digitalPinToPortReg(P) \
(((P) >= 0 && (P) <= 7) ? &PORTD : (((P) >= 8 && (P) <= 13) ? &PORTB : &PORTC))
#define digitalPinToDDRReg(P) \
(((P) >= 0 && (P) <= 7) ? &DDRD : (((P) >= 8 && (P) <= 13) ? &DDRB : &DDRC))
#define digitalPinToPINReg(P) \
(((P) >= 0 && (P) <= 7) ? &PIND : (((P) >= 8 && (P) <= 13) ? &PINB : &PINC))
#define digitalPinToBit(P) \
(((P) >= 0 && (P) <= 7) ? (P) : (((P) >= 8 && (P) <= 13) ? (P) - 8 : (P) - 14))

#if defined(__AVR_ATmega8__)
// 3 PWM
#define digitalPinToTimer(P) \
(((P) ==  9 || (P) == 10) ? &TCCR1A : (((P) == 11) ? &TCCR2 : 0))
#define digitalPinToTimerBit(P) \
(((P) ==  9) ? COM1A1 : (((P) == 10) ? COM1B1 : COM21))
#else  //168,328

// 6 PWM
#define digitalPinToTimer(P) \
(((P) ==  6 || (P) ==  5) ? &TCCR0A : \
       (((P) ==  9 || (P) == 10) ? &TCCR1A : \
       (((P) == 11 || (P) ==  3) ? &TCCR2A : 0)))
#define digitalPinToTimerBit(P) \
(((P) ==  6) ? COM0A1 : (((P) ==  5) ? COM0B1 : \
       (((P) ==  9) ? COM1A1 : (((P) == 10) ? COM1B1 : \
       (((P) == 11) ? COM2A1 : COM2B1)))))
#endif  //defined(__AVR_ATmega8__)


#endif  //mega
#endif  //#if !defined(digitalPinToPortReg)

#ifndef __digitalWrite
     #ifndef digitalWriteFast
           #define digitalWriteFast(P, V) \
           if (__builtin_constant_p(P) && __builtin_constant_p(V)) { \
                                   if (digitalPinToTimer(P)) \
                                               bitClear(*digitalPinToTimer(P), digitalPinToTimerBit(P)); \
                                   bitWrite(*digitalPinToPortReg(P), digitalPinToBit(P), (V)); \
                       } else { \
                                   digitalWrite((P), (V)); \
                       }
     #endif  //#ifndef digitalWriteFast

     #ifndef digitalWriteFast2
           #define digitalWriteFast2(P, V) \
           if (__builtin_constant_p(P) && __builtin_constant_p(V)) { \
           bitWrite(*digitalPinToPortReg(P), digitalPinToBit(P), (V)); \
           } else { \
           digitalWrite((P), (V)); \
           }
     #endif  //#ifndef digitalWriteFast2


     #else
           #define digitalWriteFast(  digitalWrite(
           #define digitalWriteFast2(  digitalWrite(
#endif //#ifndef __digitalWrite

#ifndef __pinMode
     #ifndef pinModeFast
           #define pinModeFast(P, V) \
           if (__builtin_constant_p(P) && __builtin_constant_p(V)) { \
                                   bitWrite(*digitalPinToDDRReg(P), digitalPinToBit(P), (V)); \
                       } else {  \
                                   pinMode((P), (V)); \
                       }
     #endif  //#ifndef pinModeFast

     #if !defined(pinModeFast2)
           #define pinModeFast2(P, V) \
           if (__builtin_constant_p(P) && __builtin_constant_p(V)) { \
           if (digitalPinToTimer(P)) \
           bitClear(*digitalPinToTimer(P), digitalPinToTimerBit(P)); \
           bitWrite(*digitalPinToDDRReg(P), digitalPinToBit(P), (V)); \
           } else {  \
           pinMode((P), (V)); \
           }
     #endif

     #else
           #define pinModeFast(  pinMode(
           #define pinModeFast2( pinMode(
#endif //#ifndef __pinMode

#ifndef __digitalRead
     #ifndef digitalReadFast
           #define digitalReadFast(P) ( (int) __digitalReadFast__((P)) )
           #define __digitalReadFast__(P ) \
           (__builtin_constant_p(P) ) ? ( \
                                   digitalPinToTimer(P) ? ( \
                                            bitClear(*digitalPinToTimer(P), digitalPinToTimerBit(P)) ,  \
                                                      bitRead(*digitalPinToPINReg(P), digitalPinToBit(P))) : \
                                     bitRead(*digitalPinToPINReg(P), digitalPinToBit(P)))  : \
                                   digitalRead((P))
     #endif   //#if !defined(digitalReadFast)

     #if !defined(digitalReadFast2)
           #define digitalReadFast2(P) ( (int) __digitalReadFast2__((P)) )
           #define __digitalReadFast2__(P ) \
           (__builtin_constant_p(P) ) ? ( \
           ( bitRead(*digitalPinToPINReg(P), digitalPinToBit(P))) ) : \
           digitalRead((P))
     #endif

     #else
           #define digitalReadFast( digitalRead(
           #define digitalReadFast2( digitalRead(
#endif  //#ifndef __digitalRead


its almost enough to make one 'superstitious'. Defnitely need to fully understand this one!

Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Oct 27, 2010, 04:20 am
Solved.

I had an extra copy of the old version of digitalWriteFast.h in another folder within the Library folder. That one was getting used instead of the one that I was modifying. All 3 boards work with the new digitalWriteFast now, when I use the current (arduino 21) wiring.h and wiring_digital.c

When I try to change to the version of those core files that I built a few days ago, digitalWriteFast does not cooperate with them. I will struggle with that at least a little longer before I repost code.

It isn't absolutely critical that I get the two things to work together but it would be nice.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Oct 28, 2010, 01:02 pm
Paul Stoffregen pointed out that these macros are not currently safe from issue #146 on the Megas. Conceptually what I have to add is something like
Code: [Select]
void inline atomicWrite( uint16_t address, unint8_t p, unint8_t v) {
 if (address < 0x20)     bitWrite(*address, __digitalPinToBit(p),v);
 else  {
   uint8_t register saveSreg = SREG;    
       cli;                                  
       bitWrite(*address, __digitalPinToBit(p), v);  
       SREG=saveSreg;
 }
}

#define digitalWrite(P, V) \
do {
 if (__builtin_constant_p(P) && __builtin_constant_p(V))   atomicWrite(digitalPinToPortReg(P),P,V);
   else  __digitalWrite((P), (V));         \
}while (0)


Last night the compiler and I couldn't agree on the exact syntax. the Xcode editor I was using wouldn't help balance the last few }'s.

Surprisingly, this morning the Arduino IDE editor agrees with how I thought the }'s balanced. I broke it up by splitting out atomicWrite this morning but it seems obvious a cast of some sort is missing and I won't test it now.

Handling issue 146 will further diverge performance on ports above F, which already required more code. It struck me overnight that this MIGHT be a reason for continued existence of digitalWriteFast beyond implementation of faster digitalWrite in the core. After all, the vast majority of code I write does not turn the interrupt on. Worth some thought.

It also strikes me that I should probably create test code for whether issue 146 is solved. Maybe I can just add Tone to my current test programs, or something like that. I need something that will generate interrupts 10-1000 times per second, I think. If anyone understands how to write that, or can point me to a forum item I would appreciate that.
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: bperrybap on Nov 02, 2010, 07:23 pm
Quote

Handling issue 146 will further diverge performance on ports above F, which already required more code. It struck me overnight that this MIGHT be a reason for continued existence of digitalWriteFast beyond implementation of faster digitalWrite in the core. After all, the vast majority of code I write does not turn the interrupt on. Worth some thought.

It also strikes me that I should probably create test code for whether issue 146 is solved. Maybe I can just add Tone to my current test programs, or something like that. I need something that will generate interrupts 10-1000 times per second, I think. If anyone understands how to write that, or can point me to a forum item I would appreciate that.


Ah... issue #146. (I was the one that asked Paul to post this bug)

There should be no need to write & run test code to see if this is resolved.
Either the foreground bit set/clear operations are atomic or they
aren't. A simple code inspection can determine this.
(Ideally, look at the assembler output, it will be obvious)

I recommend code inspection over test code because it is super easy and because with test code there is no guarantee that the exact timing scenario will ever exist to create a problem.

Keep in mind that ensuring that the i/o operation is atomic
is not about protecting bits set/cleared in registers by the users foreground (loop) code from being interrupted by interrupt code, it is about protecting the bits set/cleared by the other code done in interrupt routines from the users foreground (loop) code.

It does not matter if the user does not have any
code that runs as an interrupt. It will be his foreground (loop) code
that clobbers the bits set/cleared/used by the interrupt routine.
The users foreground (loop) code  i/o operations will always complete and work flawlessly with or without the atomic code masking

And so Ironically, the code that runs at interrupt level like the servo library code does not need to have the additional code overhead of masking/clearing interrupts to make the operation "atomic"  since the code is running at interrupt level, it is by definition atomic but
the foreground (loop) code running must have it, if they share the
same i/o register.

And that is the dilemma, sometimes you need the extra code overhead
to ensure the operation is atomic and sometimes you don't need
it to still have atomicity, and yet other times, you may not need atomicity at all.
The problem is the only time you can be ensured that you don't need
it, is if the code is part of an interrupt routine.

In my view,  ALL the i/o operations need to be atomic.
Anything else, doesn't really work, and eventually will cause
problems.
Yes there are situations where you can be guaranteed that the
additional code to mask/restore interrupt masks isn't needed but
those are only in interrupts routines which are typically library code
and usually not done by the normal/casual arduino user.

If it were desirable to support some thinner/faster i/o routines to be used
by ISR routines, then that is definitely possible
but again, this goes back to my original suggestion of using
layering and _  (underbar) functions/macros so that the code writer can pick and choose what level of support his code needs.

--- bill
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Nov 07, 2010, 01:45 am
I did finally post a significant update to this:
http://code.google.com/p/digitalwritefast/downloads/list

In the original version of this library I supplied 2 versions of each command with different behavior around turning off PWM (the analogWrite facility). This version does not turn off PWM--that feature is planned to be removed from future versions of the standard commands.

In this version I only supply 1 version of the commands

Much more importantly this version is interrupt safe.

There are several good descriptions of the need for interrupt safe digitalWrite and pinMode commands. I struck me that I have a disassembly listing of the commands that certainly can be used to describe the situation in a different, not necessarily better, way:

For Ports F and below, a single instruction can be used to set or clear an individual bit that corresponds to the digitalWrite or PinMode setting you want to change. It will look like this:

Code: [Select]
pinModeFast(61,OUTPUT);
   390a:      87 9a             sbi      0x10, 7  ; 16  set a bit of port to 1
digitalWriteFast(61,LOW);
   390c:      8f 98             cbi      0x11, 7  ; 17  clear a bit of port to 0

The cbi and sbi instructions are inherently interrupt safe.

For Ports above F, the microprocessor must read in all the 8 pins in a Port, modify and rewrite all 8 pins. It would look like this if (it were done in an interrupt unsafe fashion):

Code: [Select]
digitalWriteFast(64,HIGH);
   38fe:      80 91 08 01       lds      r24, 0x0108  ;read the 8 bits of port into register 24
   3902:      84 60             ori      r24, 0x04    ; set a bit in register 24 to 1
   3904:      80 93 08 01       sts      0x0108, r24  ; write register 24 back to the port


The cbi and sbi instructions cannot address the ports higher than F.

The problem arises when the port is modified by other means in the microsecond between it being read into the register, modified and rewritten. Most commonly this seems to come up with the Servo library, but  any code that has an interrupt routine that itself does a digitalWrite or pinMode might cause the issue.

What happens is that the 'lds      r24, 0x0108' would read the current information from the port. Then it might happen that an interrupt would divert the microprocessor to handle another task. Imagine that 0x108 is modified during processing of the interrupt. Then the microprocessor returns to your program, which proceeds with 'ori      r24, 0x04' setting the bit you wanted, and 'sts      0x0108, r24' destroying whatever the interrupt routine did. The interrupt could come between the ori and the sts as well, of course. The problems are notoriously hard to debug; the exact timing of the interrupt cannot be replicated and the bits of code that interact are likely to be located far from each other--perhaps even in a library, whose source code you have never read!

This version of the library correctly generates interrupt safe code that looks like this:
Code: [Select]
digitalWriteFast(64,HIGH);
   38fa:      9f b7             in      r25, 0x3f      ; 63  read the interrupt status
   38fc:      f8 94             cli                  ; turn off interrupts
   38fe:      80 91 08 01       lds      r24, 0x0108
   3902:       84 60             ori      r24, 0x04      ; 4
   3904:       80 93 08 01       sts      0x0108, r24
   3908:       9f bf             out      0x3f, r25      ; 63  restore the interrupt status



This is needed for digitalWriteFast and pinModeFast.  The standard digitalWrite was corrected to cover this situation in Arduino 19.  
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: cjands40 on Dec 08, 2010, 10:29 pm
Skimming through this thread, all of the examples presented have used only the constants HIGH and LOW as the values being written using digitalWriteFast.  This does correspond with the Reference documentation for digitalWrite as well as many of the example sketches but there are some examples (eg, BlinkWithoutDelay) and likely much existing code that use variables instead.  In fact, digitalWrite currently maps any non-zero value into a write of '1'.  I would like to see that behavior maintained.  How will the new "fast" routines handle these cases?
Title: Re: digitalWriteFast, digitalReadFast, pinModeFast etc
Post by: jrraines on Dec 09, 2010, 12:26 am
Code: [Select]
pinModeFast(2,INPUT);
    1a0:      52 98             cbi      0x0a, 2      ; 10
digitalWriteFast(2,200);
    1a2:      5a 9a             sbi      0x0b, 2      ; 11
pinModeFast(5,OUTPUT);
    1a4:      55 9a             sbi      0x0a, 5      ; 10
digitalWriteFast(5,LOW);


Code: [Select]
pinModeFast2(2,INPUT);
    1c6:      52 98             cbi      0x0a, 2      ; 10
digitalWriteFast2(2,HIGH);
    1c8:      5a 9a             sbi      0x0b, 2      ; 11
pinModeFast2(5,OUTPUT);
    1ca:      55 9a             sbi      0x0a, 5      ; 10
digitalWriteFast2(5,LOW);
    1cc:      5d 98             cbi      0x0b, 5      ; 11