digitalWriteFast, digitalReadFast, pinModeFast etc

Adding all these parallel API functions seems a bit silly to me.

As has been mentioned before in these types of discussions, why can't the API be layered, rather than going off and inventing an entire new slew of functions?

In other words, take something like digitalWrite().
digitalWrite() is the top layer and works just as it does today, with all the hand-holding, which does cost performance.
It does the timer-related work and then calls _digitalWrite().

_digitalWrite() is the next level down. It eliminates the checks, so it is a bit faster in general and really fast (a single instruction) when the arguments are constants. If the arguments are not constants, it calls __digitalWrite().

__digitalWrite() is the bottom layer which gets called when the arguments are not constants.

That way, existing code works just as it does today, only much faster when constants are used. Users who are more knowledgeable and don't need any hand-holding can call _digitalWrite() directly to pick up additional performance, especially when the arguments are constants, where it reduces to single-instruction bit sets/clears.

With a combination of macros and/or inline functions, you can get all of this with no additional overhead. It is also very consistent, backward compatible, and easy to document, since each layer takes the same arguments.
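
To make the idea concrete, here is a rough sketch of the three layers, not a proposal for the actual core code. It assumes GCC's __builtin_constant_p() to detect constant arguments, and it uses a made-up variable (fake_port) in place of a real port register so it can be compiled on a PC. The pin check and the turn_off_timer_output() stub are only placeholders for whatever hand-holding the real digitalWrite() does. The leading-underscore names are the ones used above; strictly speaking such names are reserved in C, so a real implementation might pick different ones.

```c
#include <stdint.h>

/* Hypothetical stand-in for one 8-bit output port register. */
static volatile uint8_t fake_port = 0;

/* Placeholder for the timer-related work done by the top layer. */
static void turn_off_timer_output(uint8_t pin) { (void)pin; }

/* Bottom layer: an ordinary function, used when the arguments are
   only known at run time. */
static void __digitalWrite(uint8_t pin, uint8_t val)
{
    uint8_t mask = (uint8_t)(1u << (pin & 7u));
    if (val)
        fake_port |= mask;
    else
        fake_port &= (uint8_t)~mask;
}

/* Middle layer: no checks. With constant arguments the compiler sees a
   constant mask and can emit a direct bit set/clear; otherwise it
   falls through to the bottom layer. */
#define _digitalWrite(pin, val)                                        \
    do {                                                               \
        if (__builtin_constant_p(pin) && __builtin_constant_p(val)) {  \
            if (val)                                                   \
                fake_port |= (uint8_t)(1u << ((pin) & 7u));            \
            else                                                       \
                fake_port &= (uint8_t)~(1u << ((pin) & 7u));           \
        } else {                                                       \
            __digitalWrite((uint8_t)(pin), (uint8_t)(val));            \
        }                                                              \
    } while (0)

/* Top layer: same arguments as today, keeps the hand-holding, then
   hands off to the middle layer. Whether the constant path is taken
   here depends on the compiler inlining this function. */
static inline void digitalWrite(uint8_t pin, uint8_t val)
{
    if (pin > 7u)
        return;                  /* placeholder validity check */
    turn_off_timer_output(pin);  /* placeholder timer handling */
    _digitalWrite(pin, val);
}
```

A sketch that calls digitalWrite(3, 1) keeps working unchanged, code that wants to skip the checks calls _digitalWrite(3, 1), and both take exactly the same arguments. The result is the same whichever path the compiler picks; only the cost differs.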

So my question is: why go off and create all these parallel functions when a simple, more traditional layered API gets you there as well?

--- bill