Isn't digitalWriteFast normally "as fast" as direct port manipulation? (My new version is
)
(Hmm. Perhaps not, if the bit value isn't also a constant.)
What concerns me the most is the time that while(counter<N) takes
A while() loop usually compiles to near-optimal code, and shouldn't take anywhere near 300us. OTOH, Serial.print() can be quite slow. As others have said, we'd need to see more complete code to get an idea of what is actually going on.
void WriteRL (int r15_1,int r14_1,int r13_1,int r12_1,int r11_1... **** 64 ARGUMENTS ****
OMG! Don't do THAT!
Passing arguments to a function is relatively slow, especially once you pass more arguments than will fit in the registers available on the CPU; the rest have to be pushed onto the stack for every call. Build up your table in an array, and either pass a pointer to the array or leave it as a global variable. (Also, it looks like the values could be "byte" instead of "int", which will save some time as well.)
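Something along these lines is what I mean by the array approach. It's only a rough sketch: the names rlTable, RL_COUNT and writeRLTable are made up for illustration, and since I haven't seen the body of your WriteRL, the loop just counts the non-zero entries where your real per-value output code would go.

```cpp
#include <stdint.h>
#include <stddef.h>

// Assumption: 64 values, matching the 64 arguments in the posted prototype.
const size_t RL_COUNT = 64;

// Global table of values; uint8_t is what Arduino calls "byte".
// (Hypothetical name, not from the original sketch.)
uint8_t rlTable[RL_COUNT];

// Takes one pointer and a length instead of 64 separate ints.
// Placeholder body: counts non-zero entries so there is something to check.
// Replace the loop body with the real per-value output code.
uint8_t writeRLTable(const uint8_t *values, size_t count) {
    uint8_t highCount = 0;
    for (size_t i = 0; i < count; i++) {
        if (values[i] != 0) {
            highCount++;
        }
    }
    return highCount;
}
```

You'd fill rlTable wherever you currently compute the 64 values, then call writeRLTable(rlTable, RL_COUNT). Only a pointer and a count get passed, so nothing has to spill onto the stack, and the table takes 64 bytes of RAM instead of 128.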