Pages: 1 [2]   Go Down
Author Topic: My top 5 wish list for improvements to the Arduino core library/IDE  (Read 3001 times)
0 Members and 1 Guest are viewing this topic.
United Kingdom
Offline Offline
Tesla Member
***
Karma: 224
Posts: 6613
Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Hi Paul, thanks for your feedback.

Teensyduino already has #1 and #5 for, for 0022 and earlier, and also 1.0-rc1.

That's good to know.

The F() macro and flash string support appearing in 1.0 was contributed by me, with help and input from Mikal Hard and Brian Cook, both on the Arduino developers list and in private followups.  I realize the _P suffix is commonly used by avr-libc and other programs.  It was discussed, but there was strong consensus it wasn't Arduino's API style.  Had I gone with a _P suffix, this contribution never would have been accepted (it took many months as-is to convince the Arduino Team to merge it).

Thanks for your contribution! I'm happy with the F() macro. It has the advantage that selection of the correct overload is automatic. With print_p and println_p there would always be the risk of users calling them with non-PROGMEM strings, or calling print and println with PROGMEM strings. As long as users use F() instead of PSTR() to create PROGMEM strings, that risk is gone.

Something they did not accept, but I tried to contribute, was F() flash-string integration with the String class.  So you can use F() with XYZ.print(), but not with anything that takes the newer String objects.

I won't cry over this, because IMO anything that uses dynamic memory allocation other than during initialization is a bad idea on an embedded platform with limited RAM, because of the memory fragmentation and unpredicatable RAM usage it causes. I appreciate that String makes some operations a lot easier to code, but I consider that the risks of using it are too high.

#3 is on my serious wish-list.  I worked on some elf/drawf parsing code to extract the call tree and stack framing info from the debug data inside the .elf file.  It's quite tricky, and of course if recursion is detected you can't know worst-case stack use.  I may never get this written, but I really, really want to have this feature, since running out of RAM really sucks.

While it would be very nice to be able to predict the total RAM use including the stack, I'd settle for just knowing the amount of RAM used by static data and string literals, so that I can see how much is left over for the stack.

I'm also really interested in #2, but if I develop it for Teensyduino and Arduino proper never accepts it, I wonder how much use it would ever get?  The huge benefit I see is for libraries.  I was considering a table of perhaps 5 or 10 callbacks, which could be triggered at something like 100 Hz (so with 10, the ISR would never need to call more than one).  That way, if the user needs 1 or 2, and uses a couple libraries than each need 1 tick callback, everything would still work nicely.

Yes, I suppose that if it were implemented for Arduino, libraries would start using it, so it needs to be chainable. Rather than a fixed table, I would define a link structure holding a pointer to the callback function and a pointer to the next link, and require anything that wants a callback to provide one of these structures. However, a 100Hz callback frequency is too slow for some purposes, e.g. multiplexing more than two or three 7-segment displays without flicker, or polling rotary encoders (I have found that polling rotary encoders at 1ms intervals works well, but if the interval is increased to 4ms then they don't work if you twist them quickly).

I'm open to ideas on this, and there's a very good chance I'll implement it in Teensyduino at some point, but my hope is the Arduino developers can at least buy into the idea and commit to an outline of the API... so it doesn't end up being a contribution they'll never accept.  (while many things I've contributed are now part of Arduino, on average most of the stuff I try to contribute never gets used... eg, look at issue #140).

Anyway, my point of this lengthy rambling post is I am indeed working on some of this stuff for Teensyduino, and a couple have been available for over a year.  I generally do try to contribute my improvements back to Arduino when they're general to the hardware, but usually it takes a very long time for any contribution to be accepted into Arduino proper, and often they just sit unused indefinitely.

Yes, it's frustrating when good ideas don't get accepted and incorporated. Thanks for all your efforts!
Logged

Formal verification of safety-critical software, software development, and electronic design and prototyping. See http://www.eschertech.com. Please do not ask for unpaid help via PM, use the forum.

Switzerland
Offline Offline
Sr. Member
****
Karma: 6
Posts: 375
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Thanks to dc42 for starting an interesting discussion.  Is the F() macro similar to the one in Mikal Hart's "Flash" library?  And are there any other parts of this library which have been incorporated into the 1.0 Arduino core?
Logged

SF Bay Area (USA)
Offline Offline
Tesla Member
***
Karma: 132
Posts: 6746
Strongly opinionated, but not official!
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
1. Additional member functions print_P and println_P in the Print class, for printing PROGMEM strings
fixed, apparently.

Quote
2. Facility for hooking the tick interrupt
Expensive and rather dangerous.  The tick interrupt is already too long, and calling an external function would immediately add significant overhead when the ISR code has to save extra state not normally saved by function calls.  I like the idea of having a separate periodic function, but tying into the existing timer tick seems like a bad way to do it.

Quote
3. When compiling a sketch, as well as displaying the amount of Flash used, display the amount of RAM used by static data and string literals.
There are several google code issues opened with respect to this.
http://code.google.com/p/arduino/issues/detail?id=40
http://code.google.com/p/arduino/issues/detail?id=395
http://code.google.com/p/arduino/issues/detail?id=444
Apparently part of the issue is that the amount of AVAILABLE ram is unknown to the IDE, so for the beginner audience this would just be a meaningless number printed out.

Quote
4. Configurable delay in analogRead() after switching the multiplexer
Perhaps the existing code should just work?  Or is this one of those things that is heavily dependent on the impedance of the analog source?

Quote
5. Fix the bug that causes delayMicros(0) to delay for a very long time
I guess.  I'm not sure that coming up with 0 as a result of calculations is that much more common than coming up with a negative number, which would also cause an unexpectedly  long delay...
Logged

United Kingdom
Offline Offline
Tesla Member
***
Karma: 224
Posts: 6613
Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Hi @westfw, thanks for your feedback.

Quote
2. Facility for hooking the tick interrupt
Expensive and rather dangerous.  The tick interrupt is already too long, and calling an external function would immediately add significant overhead when the ISR code has to save extra state not normally saved by function calls.  I like the idea of having a separate periodic function, but tying into the existing timer tick seems like a bad way to do it.

From what perspective is the tick interrupt already too long, and in what way is hooking it dangerous? Surely it depends on whether or not you are using e.g. delayMicros() to do some critical timing?

I can see the validity of your point about overhead to save extra state, assuming that the compiler currently saves only the registers actually used in the ISR. Would it not be possible to use the existing ISR initially, and change the tick interrupt vector when the sketch makes the call to request a callback on each tick? That way, the time taken by the tick interrupt would not increase from the present until someone wants to hook it.

Quote
3. When compiling a sketch, as well as displaying the amount of Flash used, display the amount of RAM used by static data and string literals.
Apparently part of the issue is that the amount of AVAILABLE ram is unknown to the IDE, so for the beginner audience this would just be a meaningless number printed out.

I can't see any excuse for not adding this information to the boards.txt file.

Quote
4. Configurable delay in analogRead() after switching the multiplexer
Perhaps the existing code should just work?  Or is this one of those things that is heavily dependent on the impedance of the analog source?

Yes, it is heavily dependent on the resistance of the analog source. With no delay, the voltage on the previous pin read starts affecting the result of analogRead() when the source resistance is above 10K. With 10us delay (compared to >100us conversion time), my measurements indicate that there is no interference even at 100K source resistance.

Quote
5. Fix the bug that causes delayMicros(0) to delay for a very long time
I guess.  I'm not sure that coming up with 0 as a result of calculations is that much more common than coming up with a negative number, which would also cause an unexpectedly  long delay...

Maybe not, but the behaviour when passing 0 violates the reasonable expectations of the user.
Logged

Formal verification of safety-critical software, software development, and electronic design and prototyping. See http://www.eschertech.com. Please do not ask for unpaid help via PM, use the forum.

SF Bay Area (USA)
Offline Offline
Tesla Member
***
Karma: 132
Posts: 6746
Strongly opinionated, but not official!
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
Would it not be possible to use the existing ISR initially, and change the tick interrupt vector
no, the vectors are all in flash and can't be changed at runtime.
Logged

0
Offline Offline
God Member
*****
Karma: 26
Posts: 608
Always making something...
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Yup, all code is in flash on AVR, so any callback would need to be a RAM-based pointer which is tested at runtime.  Any implementation would eat a bit more CPU time, even if the feature is never used, just for checking the pointer.

My 2 main implementation concerns are how to allow multiple callbacks, since this is likely to be used by libraries without the user being aware, and what to do if the callback uses too much CPU time (eg, about 16200 CPU cycles).

One possible way to deal with the CPU time issue would be to re-enable interrupts before calling.  A flag would need to be set, so if the timer0 interrupt happens again before it returns, no additional callbacks are made.

Enabling multiple functions needs to work.  A library like Ethernet could really benefit, so it could flush partial packets after a short timeout, rather than the horribly inefficient approach of placing every call to Serial.write() into its own packet.  But if a project uses Ethernet, which might use a timer callback without the user's knowledge, the user will still expect to be able to have their own callback, or use some other library which also requires a callback.

Maybe a linked list with malloc makes sense?  Or maybe a small fixed size table might work?

I know 1000 Hz (actually 976 Hz) rate would be nice for some applications.  But if the API offers one callback for every timer0 ISR, and if multiple callbacks can be registered, even if the functions do very little, such a scheme could quickly consume all available CPU time calling several every tick.  A slower callback rate, like 244 Hz or 97.6 Hz could still be very useful.

If an application or library really needs a high rate of timer interrupts, it's best path is to commandeer one of the other timers, rather than getting a CPU-hogging callback from timer0.  Several libraries do this today, and having looked at several of them, their needs are pretty specialized.  I do not believe a scheme for sharing timer0 will work well for applications which need good performance.

But it would be really useful for slower things, like implementing automatic timeouts, debouncing pushbuttons, etc.
Logged

United Kingdom
Offline Offline
Tesla Member
***
Karma: 224
Posts: 6613
Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Hi Paul, thanks for your feedback.

Yup, all code is in flash on AVR, so any callback would need to be a RAM-based pointer which is tested at runtime

Yes, I should have realised that the interrupt vectors are in flash. So at the very least, it would be necessary to test whether a byte of RAM is zero or not before deciding whether or not to save the entire context.

Enabling multiple functions needs to work.  A library like Ethernet could really benefit, so it could flush partial packets after a short timeout, rather than the horribly inefficient approach of placing every call to Serial.write() into its own packet.  But if a project uses Ethernet, which might use a timer callback without the user's knowledge, the user will still expect to be able to have their own callback, or use some other library which also requires a callback.

Maybe a linked list with malloc makes sense?  Or maybe a small fixed size table might work?

I know 1000 Hz (actually 976 Hz) rate would be nice for some applications.  But if the API offers one callback for every timer0 ISR, and if multiple callbacks can be registered, even if the functions do very little, such a scheme could quickly consume all available CPU time calling several every tick.  A slower callback rate, like 244 Hz or 97.6 Hz could still be very useful.

I suggest the following:

Code:
typedef struct _tickCallback
{
  void (*callbackFunction)();
  unsigned int callbackInterval;
  struct tickCallback *next;
  unsigned int callbackTicksToGo;
} tickCallback;

void registerTickCallback(tickCallback*);  // caller must pre-populate callbackFunction and callbackInterval fields

Although this slightly complicates the API, I prefer this to using malloc because (a) it avoids the boundary tag overhead of malloc, and (b) the memory requirements are statically determinable, assuming the caller declares his tickCallback object as a static variable (see my item #3). I don't like the idea of a fixed size table because it is impossible to say what size is right (to small = can't register enough callbacks, too large = wasted RAM). The field callbackInterval is the interval (in ticks) required between callbacks, and callbackTicksToGo is used by the ISR to count them.

One possible way to deal with the CPU time issue would be to re-enable interrupts before calling.  A flag would need to be set, so if the timer0 interrupt happens again before it returns, no additional callbacks are made.

Sounds like a good idea to me. There could be a single 1-byte flag meaning "callbacks are registered AND we are not currently in a callback" to allow the ISR to decide whether to save the entire context and execute the callbacks, or just save minimal context and execute the 'short' ISR.
« Last Edit: September 30, 2011, 09:14:52 am by dc42 » Logged

Formal verification of safety-critical software, software development, and electronic design and prototyping. See http://www.eschertech.com. Please do not ask for unpaid help via PM, use the forum.

0
Offline Offline
God Member
*****
Karma: 26
Posts: 608
Always making something...
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

I like a lot of these ideas.

A simple Arduino-style API really is necessary.  If it's even moderately complex, there's zero chance it will ever be accepted as a contribution to Arduino.
Logged

Seattle, WA USA
Offline Offline
Brattain Member
*****
Karma: 610
Posts: 49073
Seattle, WA USA
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
From what perspective is the tick interrupt already too long, and in what way is hooking it dangerous? Surely it depends on whether or not you are using e.g. delayMicros() to do some critical timing?
I don't know about the current code being too long, but look at the interrupt handlers people post. Serial.print() calls (at 9600) baud. Calls to write data to SD cards. Recursive calls. All kinds of stuff that has no place in an ISR.

Having the Arduino make it easier to use interrupts is not really a good idea, in my opinion, unless the whole idea of fast, fast, fast! gets very strong emphasis in the documentation (and we can laugh at people that don't read it first).
Logged

SF Bay Area (USA)
Offline Offline
Tesla Member
***
Karma: 132
Posts: 6746
Strongly opinionated, but not official!
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

What PaulS said.

You pick a number to specify about how much of the CPU time you're willing to spend in ISRs.  Say, 1%.  That's 160 cycles in an ISR that happens every millisecond; between 80 an 160 instructions.  The current timer tick code is about 70 instructions (which implements less than 10 lines of C code.)  The added external call to a function adds about 15 more instructions (6 more register to save/restore, and 10 bytes worth of instructions to actually do the indirect call (and they're ALL at least two-cycle instructions.)

So at 1% ISR time, you have essentially no time for any user code.
You can increase the "budget" to 2%, and now you can have user code that is comparable in complexity to the existing ISR: increment a couple of global 32bit variables.
At a 5% ISR budget, you probably start to allow the sorts of functions that a moderately careful programmer probably expects to be able to put in a timer tick callback, but now you're introducing about 50 microseconds worth of random latency that can occur between any two statements in the main loop() code, which is pretty significant, and you have to start explaining to people why their sketches are acting sort of random.

I guess what it really boils down to is that providing the ability to tie into the timer tick makes it possible for ANYBODY to put code in the ISR path, and I don't TRUST most people  to do that well enough that it won't cause more problems than it solves.
Logged

United Kingdom
Offline Offline
Tesla Member
***
Karma: 224
Posts: 6613
Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

@westfw, thanks for yor feedback.

At a 5% ISR budget, you probably start to allow the sorts of functions that a moderately careful programmer probably expects to be able to put in a timer tick callback, but now you're introducing about 50 microseconds worth of random latency that can occur between any two statements in the main loop() code, which is pretty significant, and you have to start explaining to people why their sketches are acting sort of random.

What makes you think that 50us or even 100us of random latency would affect most sketches and make them 'act sort of random'? As I said in my comment, it all depends on whether you are trying to do precise timings using delayMicros or similar - and if you are, you already have the (smaller) random latency of the existing tick ISR to contend with. Any code whose timing is that critical should be executed with interrupts disabled.

I guess what it really boils down to is that providing the ability to tie into the timer tick makes it possible for ANYBODY to put code in the ISR path, and I don't TRUST most people  to do that well enough that it won't cause more problems than it solves.

You might as well decide not to trust people to do direct port access, or to change the analog reference, or to write C code. Surely Arduino isn't about preventing people from doing things, it's about providing ways of doing things easily. As things stand, if I want a 1ms tick (perhaps to poll a rotary encoder), I have to set up a tick interrupt using timer 2. Having two 1ms tick interrupts going off uses more CPU time in ISRs than having one, because of the extra context save.
« Last Edit: October 02, 2011, 05:33:13 am by dc42 » Logged

Formal verification of safety-critical software, software development, and electronic design and prototyping. See http://www.eschertech.com. Please do not ask for unpaid help via PM, use the forum.

0
Offline Offline
God Member
*****
Karma: 26
Posts: 608
Always making something...
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

I agree, at least partially, yes, a periodic callback can easily enable a programmer to burn far too much CPU time.  But does that really mean it shouldn't be available at all?  If they really need to check something periodically, consider the alternatives.

If it's polled in loop(), together with lots of other stuff, the polling has tremendous jitter, and the worst case times can become really horrible if Serial.print() or some other function blocks.  The Bounce library is a great example, where failure to keep each loop() execution short enough could possibly miss a short duration button press entirely!

Given the choice between a novice using a library that silently burns 6% of the CPU time for highly reliable performance versus reliability hinging upon that novice user's ability to constrain the total worst-case time that loop() runs, I'd go for highly reliable easy-to-use libraries.

Today, that choice is already made in many libraries, which commandeer the timers for this sort of purpose.  Perhaps they use slightly less CPU time than they otherwise would, but the trade-off is interfering with PWM on certain pins, incompatibility with other code using those timers, and lack of portability between chips.

There are a lot of trade-offs.  I don't have any perfect answers that avoid all potential undesirable outcomes.  But I do believe a periodic interrupt-context callback can be useful, especially for library authors.  Just because it can't be done perfectly doesn't mean it isn't worth doing.  I believe, despite imperfections, it could offer a much better alternative to how some things are currently done.
Logged

SF Bay Area (USA)
Offline Offline
Tesla Member
***
Karma: 132
Posts: 6746
Strongly opinionated, but not official!
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
What makes you think that 50us or even 100us of random latency [is bad]
it all depends on whether you are trying to do precise timings
For example, if you start disabling interrupts for 50us (either in the ISR or in the loop code because your timing is critical), you start running into significant chances of losing characters on the serial port.  So maybe you enable interrupts during the callbacks, as Stroffregen suggested, and then you have increased possibility of stack overflows leading to (difficult to analyze) memory overflow bugs...
And I'm claiming that by the time you allow 50us ISRs, you start interfering with things that were less that "precise timings."
After all, we have people complaining about 30second/day inaccuracies in millis() (0.03% error)...

Quote
a novice using a library
What I'm really worried about is a novice using two or three libraries, each of which was sure they could use 4% of the CPU in the ISR callback, and then having other things go wrong.  In essence, that's the same problem that exists now with alternative implementations using timer2/etc.

I like your idea of having the ISR round-robin between callbacks that are called less often than every millisecond.  Part of my worry is that a 1ms ISR is really too frequent to make generally available.  A lot of system timer ticks are much longer than 1ms;
4 ms in cisco IOS (and there was a specific reason for making it that short, or it would probably be longer!), ~20ms in MSDOS, 60Hz on assorted mainframes...  Likewise, I'd be more enthusiastic about a general purpose timer/periodic callback that left the actual implementation to the core authors.

Quote
You might as well decide not to trust people to do direct port access, or to change the analog reference
And I don't.  Lots of existing libraries are essentially dangerous because they do this sort of thing, and  the people that use them  don't understand.  And I agree that it should be allowed anyway...

Quote
Surely Arduino isn't about preventing people from doing things
Arduino is (I think) somewhat about hiding the "dangerous bits" away from where "people" need to worry about them.
An ISR callback would be counter to that philosophy.  We used to say, back at work, that we gave our users a lot of rope, and if they happened to tangle it around their neck before tripping, well...  they should have understood that that was a possibility.  I don't think that that's quite the way the Arduino team thinks.   They don't like to implement parts in the actual Arduino APIs that they have to say "you CAN do this, but you need to be really really careful."

Quote
Having two 1ms tick interrupts going off uses more CPU time in ISRs than having one, because of the extra context save.
It actually really close.  21 instructions for the ISR context save/restore, and 15 additional to permit the callback...
Logged

0
Offline Offline
God Member
*****
Karma: 26
Posts: 608
Always making something...
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

You're right about the danger of keeping interrupts disabled too long.  If the callback is made from interrupt context, interrupts really need to be re-enabled.  Yes, that uses even more stack space, possibly risking an out-of-memory collision, but there really isn't much other option for interrupt context.

However, another option would be to avoid interrupt context calls completely!  That decision was made for the new serialEvent.  The huge advantage is the user doesn't need to use volatile variables and carefully disable interrupts or use some other design that assures atomic access to shared data structures.  Those are such difficult topics that normal context callbacks are pretty compelling, even if their latency is much worse.

To be useful, the callback would have to be made from a number of places that block, and maybe even some that don't if users build long-lived looping structures calling them.  From delay() would be the most obvious.  The blocking in Serial.write() when the buffer is full would be another obvious place.
Logged

SF Bay Area (USA)
Offline Offline
Tesla Member
***
Karma: 132
Posts: 6746
Strongly opinionated, but not official!
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Yeah, I sorta feel like allowing callbacks from the timer is a weak solution when what ought to be implemented is the beginnings of some kind of operating system kernel.  Or at least something that would generalize to also being implemented in/by a microkernel.  Implement a 1ms timer callback function and you are forever committed to a 1ms timer callback function.  Implement an asynchronous task with timed wakeup, and you have a lot more choices...  The serialEvent() implementation is a reasonable example;  I'm not crazy about the details of the current implementation,  but the details are subject to being changed, and I don't feel tied to something that is overly specific to particular hardware.
Logged

Pages: 1 [2]   Go Up
Jump to: