Hi,
in a program for the Nokia 5110 LCD display (84x48 pixels), this loop was used to clear the screen:
for(int i=0; i<504; i++) LcdWriteData(0x00);
On an Arduino Uno the loop takes 87436μs, which translates to 11.44fps.
This is the LcdWriteData() function used:
void LcdWriteData(byte dat)
{
  digitalWrite(DC, HIGH); // DC high selects data (DC low is for commands)
  digitalWrite(CE, LOW);  // select the display
  shiftOut(DIN, CLK, MSBFIRST, dat); // transmit serial data
  digitalWrite(CE, HIGH); // deselect the display
}
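To get a feeling for where the time goes, here is a rough host-side sketch (plain C++, not Arduino code) that simply counts digitalWrite() calls per frame. It assumes shiftOut() does one data write and two clock writes per bit, which is how the stock Arduino AVR core does it. All the mock names, writesPerFrame and the pin numbers are made up for illustration.

```cpp
#include <cstdint>

// Counts calls to the mock digitalWrite() below.
static unsigned long writeCalls = 0;

// Mock stand-in for Arduino's digitalWrite(); it only counts calls.
void mockDigitalWrite(uint8_t /*pin*/, uint8_t /*level*/) { ++writeCalls; }

// Same call structure as shiftOut() with MSBFIRST (assumption, see text):
// per bit one write to the data pin and two writes to the clock pin.
void mockShiftOut(uint8_t dataPin, uint8_t clockPin, uint8_t val) {
    for (uint8_t i = 0; i < 8; ++i) {
        mockDigitalWrite(dataPin, (val >> (7 - i)) & 1);
        mockDigitalWrite(clockPin, 1);
        mockDigitalWrite(clockPin, 0);
    }
}

// Same call structure as LcdWriteData(); the pin numbers are hypothetical.
void mockLcdWriteData(uint8_t dat) {
    const uint8_t DC = 5, CE = 6, DIN = 7, CLK = 8;
    mockDigitalWrite(DC, 1);
    mockDigitalWrite(CE, 0);
    mockShiftOut(DIN, CLK, dat);
    mockDigitalWrite(CE, 1);
}

// Number of digitalWrite() calls needed to send one 504-byte frame.
unsigned long writesPerFrame() {
    writeCalls = 0;
    for (int i = 0; i < 504; ++i) mockLcdWriteData(0x00);
    return writeCalls;
}
```

That is 27 calls per byte, or 13608 per frame. Every one of them repeats the same pin lookups and checks inside digitalWrite(), which is the repeated work that can be moved out of the loop.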
Yesterday I dove into the implementations of digitalWrite and shiftOut and found quite some potential for improvement. A series of improvements took me from 11.44fps to 23.74fps to 206.27fps for just clearing/filling the screen, and finally to 116.93fps for drawing a complete frame.
For posting here I cleaned up a superfluous comment and command from yesterday and changed the data type of the bit iteration loop variable to uint8_t, resulting in 121.01fps. This is the code between the microsecond measurements:
long t0=micros();
// for(int i=0; i<504; i++) LcdWriteData(0x00); // clear LCD
digitalWrite(DC, HIGH); //DC pin is low for commands
digitalWrite(CE, LOW);
uint8_t bitD = digitalPinToBitMask(DIN);
uint8_t bitC = digitalPinToBitMask(CLK);
volatile uint8_t *outD=portOutputRegister(digitalPinToPort(DIN));
volatile uint8_t *outC=portOutputRegister(digitalPinToPort(CLK));
for(int i=0; i<504; i++)
{
  uint8_t val=frame[i];
  // note: testing bit 0 and shifting right sends the LSB first,
  // unlike the MSBFIRST shiftOut() call this replaces
  for(uint8_t j=0; j<8; ++j, val>>=1)
  {
    // digitalWrite(DIN, !!(val & (1 << (7 - j))));
    uint8_t oldSREG = SREG;
    cli();
    // DIN, now pin 7, is no PWM pin and valid
    if ((val & 0x01)!=0)
      *outD |= bitD;
    else
      *outD &= ~bitD;
    SREG = oldSREG;
    // digitalWrite(CLK, HIGH);
    cli();
    // CLK, pin 8, is no PWM pin and valid
    *outC |= bitC;
    SREG = oldSREG;
    // digitalWrite(CLK, LOW);
    cli();
    *outC &= ~bitC;
    SREG = oldSREG;
  }
}
digitalWrite(CE, HIGH);
long t1=micros();
The technique I used is called loop hoisting: it moves loop-invariant code out of the loops. In the code above this was the bit mask and output register determination for the CLK and DIN pins. In addition, shiftOut was sped up a bit. I also made use of the fact that I chose pins that are not PWM pins (I had to switch from D9 to D7 for that, see the added cable from D7 to D9 below), because that allows skipping the test for whether the pin is a PWM pin (and turning off its timer). I also knew that I provide valid pin numbers and could therefore skip the test for a valid pin number.
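The hoisting idea can be tried on a PC with a small simulation. This is only a sketch under assumptions: fakePort stands in for a real AVR port register, maskForBit stands in for digitalPinToBitMask(), and sampledBits plus sendFrameHoisted are made-up names. It sends the most significant bit first, matching the MSBFIRST argument in LcdWriteData() above, and leaves out the cli()/SREG handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for one AVR output port register (hypothetical; on the Uno the
// pointer would come from portOutputRegister()).
static volatile uint8_t fakePort = 0;

// Data line level seen at each rising clock edge, recorded for inspection.
static std::vector<uint8_t> sampledBits;

// Hypothetical lookup standing in for digitalPinToBitMask().
uint8_t maskForBit(uint8_t bit) { return static_cast<uint8_t>(1u << bit); }

void sendFrameHoisted(const uint8_t *frame, size_t len,
                      uint8_t dataBit, uint8_t clockBit) {
    // Loop-invariant work: done once here instead of once per pin write.
    const uint8_t bitD = maskForBit(dataBit);
    const uint8_t bitC = maskForBit(clockBit);
    volatile uint8_t *out = &fakePort;

    for (size_t i = 0; i < len; ++i) {
        uint8_t val = frame[i];
        for (uint8_t j = 0; j < 8; ++j, val <<= 1) {
            // Put the current top bit of val on the data line (MSB first).
            if (val & 0x80)
                *out = static_cast<uint8_t>(*out | bitD);
            else
                *out = static_cast<uint8_t>(*out & ~bitD);
            *out = static_cast<uint8_t>(*out | bitC);   // clock rising edge
            sampledBits.push_back((*out & bitD) ? 1 : 0);
            *out = static_cast<uint8_t>(*out & ~bitC);  // clock falling edge
        }
    }
}
```

Inside the two loops only the register read-modify-write is left. Everything that depends just on the pin numbers is computed once before the loops start.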
In general I would not do stuff like that and would use digitalWrite() as is, with all the built-in checks.
But for time-critical stuff like drawing an image I do not feel guilty about doing loop hoisting in a self-made digitalWrite(), given the big increase in fps (11.44->206.27 for clear/fill display).
Hermann.

