Function() vs Speed

Which, as I said, would imply the parameter delay is pushed onto the stack along with the program counter. The context of the application has to be stored somewhere before it jumps into the function.

To be fair, I used to believe that too. However, compiler writers (Lua is an example) have worked out that careful tracking of registers is actually faster than blindly pushing things onto a stack. Conceptually, there is a stack. In practice, the compiler can replace the pushing and popping with reasoning like: "Hey, I know I have 32 registers, so why not just allocate 4 of them for the first 4 arguments that get sent to a function?"

Certainly there will come a time (e.g. with 30 arguments) when the compiler just has to push them all. But you must admit, for functions that take only a few bytes of arguments, a smart compiler writer will allocate a register or four for the initial arguments. This saves loading them into a register AND pushing them AND popping them. You just load them into a register. If they hadn't done this by now, people would have complained.
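
To make that a bit more concrete, here is a rough C sketch of the two cases. The function names (scaled_delay, sum_ten) are made up for illustration, and which arguments actually land in registers depends on the target's calling convention. As an assumption, think of something like ARM's AAPCS (first four integer arguments in registers) or x86-64 System V (first six); your compiler and chip may differ.

```c
#include <stdint.h>

/* Hypothetical example: three small arguments. Under many calling
   conventions these arrive in registers, so the arguments themselves
   need no push/pop at all. */
uint16_t scaled_delay(uint16_t base, uint8_t factor, uint8_t offset)
{
    return (uint16_t)(base * factor + offset);
}

/* Hypothetical example: ten arguments. Once the argument registers
   run out, the remaining ones typically get passed on the stack.
   Exactly where that cutoff falls is ABI-dependent. */
uint16_t sum_ten(uint8_t a, uint8_t b, uint8_t c, uint8_t d, uint8_t e,
                 uint8_t f, uint8_t g, uint8_t h, uint8_t i, uint8_t j)
{
    return (uint16_t)(a + b + c + d + e + f + g + h + i + j);
}
```

If you want to check what your own toolchain does, compiling with something like gcc -O2 -S and reading the generated assembly will show whether the arguments are moved into registers or pushed. Keep in mind that with optimization on, the compiler may inline small functions like these and skip the call entirely.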