Optimization

I have been working on a clock with cross-fading digits. I have stripped the sketch down to it's bare essentials and done everything I can think of to optimize the code. But I still don't get a fast enough refresh rate, there is very noticeable visible flickering of the digits as it cross-fades. I estimate my refresh rate is only around 15-20Hz.

  1. Rather than use the time.h library, I rolled my own really basic one. Reasoning for that is that the time.h library stores its value internally as the # of seconds, and when you query minutes, hours, etc. -- as you are constantly doing when refreshing the display -- it has to divide the # of seconds by 60, 3600, etc. My version caches the number of seconds, minutes, hours in individual bytes.
  2. I put (almost) all the code into the loop() routine to avoid the overhead of call stacks.
  3. I got rid of the timer interrupts and just let it run balls-to-the-wall, again to avoid the overhead of call stacks.
  4. I write directly to the ports. I grouped the bits together in order.
  5. I tried to simplify all my math wherever I could. No floating point, and alot of the math is bitwise, utilizing powers of 2.
  6. I eliminated Serial.print and all debug code.
  7. WITH just one Serial.print it takes about 3-4 ms per pass. At 16MHz, that works out to about 5000-6000 clock cycles per pass. I can't see where my code would take that many clock cycles!

So can anyone see where I screwed up? :astonished: Can I not see the forest for the trees? 8) Why does each pass take so long to execute? :0 How can I optimize this some more? :~

/////////////////////////////////////////////////////////////////////////////
// Digital Clock with cross-fading digits
// UnCopyright (u) 2014 Dr. Wizard
// Open Source GPL rules apply: Use at your own risk, no warranty, yada, yada
// Bugs? Comments? Suggestions? email wizard@drwiz.net
/////////////////////////////////////////////////////////////////////////////

// These 4 pins are the value of the digit
const byte digitA1 = 8;
const byte digitB2 = 9;
const byte digitC4 = 10;
const byte digitD8 = 11;
// These 3 pins are which digit
const byte digSelA1 = 4;
const byte digSelB2 = 5;
const byte digSelC4 = 6;
// This pin is the latch
const byte digClock = 7;

// Time structure, with milliseconds, and 12-hour format
struct tmt {
  uint16_t tm_msec;     // Milliseconds             0-999
  uint8_t  tm_sec;      // Seconds after the minute 0-59
  uint8_t  tm_min;      // Minutes after the hour   0-59
  uint8_t  tm_hour;     // Hours since Midnight     0-23
  uint8_t  tm_mday;     // Number of Days           0-255
  uint8_t  tm_12hour;   // Hours in 12-hour format  1-12
  bool     tm_pm;       // True for PM, After Noon  0-1 (false-true)
};

tmt displTimeF;         // Time FROM
tmt displTimeT;         // Time TO
struct tmt *whichTime;  // Which to display

byte curDigit = 0;      // Which digit is currently being refreshed
                        // 0 = 10ths of a second (10 LEDS in a circle, simulating a Dekatron Tube)
                        // 1 = Seconds % 10
                        // 2 = Tens of Seconds
                        // 3 = Minutes % 10
                        // 4 = Tens of Minutes
                        // 5 = Hours % 10
                        // 6 = Tens of Hours
                        // 7 = Colons
byte digiVal;           // Value of the digit being refreshed
byte digiSel;           // CurDigit shifted 4 bits, cuz upper 4 bits of port is used 
long TickNo;            // Used for crossfade, Milliseconds shifted by 10 bits (1024)
int pct;                // Percentage for crossfade, ranges from 0-1023
int micron;             // Psudo random number 0-1023
unsigned long curMillis; // Current millis()
unsigned long lastTick;  // millis() on the last pass
uint16_t amt;            // How many millis() since the last pass

void setup() {

  //Serial.begin(9600);
  
  // Set starting times
  displTimeF.tm_msec   = 200;
  displTimeF.tm_sec    = 56;
  displTimeF.tm_min    = 34;
  displTimeF.tm_hour   = 12;
  displTimeF.tm_12hour = 12;
  displTimeF.tm_pm     = true;

  // Note that the To: time is 600 ms ahead of the From: time
  // Fades last 600 ms, with the remaing 400 ms static
  displTimeT.tm_msec   = 800;
  displTimeT.tm_sec    = 56;
  displTimeT.tm_min    = 34;
  displTimeT.tm_hour   = 12;
  displTimeT.tm_12hour = 12;
  displTimeT.tm_pm     = true;

  SetLEDpinModes();
  lastTick = millis();

} // void setup()

void loop() {

  if (displTimeT.tm_msec >= 800) {
    // Display From: time, static
    whichTime = &displTimeF;
  } else if (displTimeT.tm_msec >= 600) {
    // Display To: time, static
    whichTime = &displTimeT;
  } else {
    // Fade

    // Percentage of how far done we are, 0-1023 = 0.0% to 100.0%
    // Rather than using floating point math, shift the bits
    TickNo = (displTimeT.tm_msec + 500) << 10;
    pct = TickNo/600;
    // Generate psudo random number 0-1023
    micron = micros() | 0x3ff;
    // compare that to our "Percent" of 0-1023
    if (pct > micron) {
      whichTime = &displTimeT;
    } else {
      whichTime = &displTimeF;
    }
  } // if (TickNo <= 0)

  // Binary IF tree is faster than Switch/Case or If/Elseif/Elseif/Elseif
  if (curDigit & 0x4) {
    if (curDigit & 0x2) {
      if (curDigit & 0x1) {
        //7
        digiVal = 16; // All 4 Colon Dots On 
      } else {
        //6
        digiVal = whichTime->tm_12hour / 10;
        if (digiVal == 0) digiVal = 10; // Creates a blank, only 0-9 will display
      }
    } else { 
      if (curDigit & 0x1) {
        //5
        digiVal = whichTime->tm_12hour % 10;
      } else {
        //4
        digiVal = whichTime->tm_min / 10;
      }
    }
  } else {
    if (curDigit & 0x2) {
      if (curDigit & 0x1) {
        //3
        digiVal = whichTime->tm_min % 10;
      } else {
        //2
        digiVal = whichTime->tm_sec / 10;
      }
    } else { 
      if (curDigit & 0x1) {
        //1
        digiVal = whichTime->tm_sec % 10;
      } else {
        //0
        digiVal = whichTime->tm_msec / 1000;
      }
    }
  } // End of Binary IF tree

  digiSel = curDigit << 4;         // Multiply by 16 to use upper 4 bits
  PORTD &= 0x7f;                   // Set pin 7 low again
  PORTB = PORTB & 0Xf0 | digiVal;  // Put the data (digit value) on pins 8-11
  PORTD = PORTD & 0x0f | digiSel;  // Put the address (which digit) on pins 4-6
  PORTD |= 0x80;                   // Set pin 7 high to latch data

  curDigit ++;                     // Increment digit counter
  curDigit &= 0x7;                 // % 8 throw away upper bits

  curMillis = millis();            // Store current value
  amt = curMillis - lastTick;      // Figure out how many have elapsed
  //Serial.println (amt);
  tick(displTimeF, amt);           // Increment the From Time
  tick(displTimeT, amt);           // And the To Time
  lastTick = curMillis;            // Remember it for next pass
  
} void loop();

void tick(struct tmt &theTmt, uint16_t milliseconds) {
  theTmt.tm_msec += milliseconds;
  if (theTmt.tm_msec > 999) {
    theTmt.tm_msec -= 1000;
    theTmt.tm_sec += 1;
    if (theTmt.tm_sec > 59) {
      theTmt.tm_sec -= 60;
      theTmt.tm_min += 1;
      if (theTmt.tm_min > 59) {
        theTmt.tm_min -= 60;
        theTmt.tm_hour += 1;
        if (theTmt.tm_hour > 23) {
          theTmt.tm_hour -= 24;
        } // hour rollover
        theTmt.tm_12hour = theTmt.tm_hour;
        if (theTmt.tm_12hour > 12) {
          theTmt.tm_12hour -= 12;
        } else if (theTmt.tm_12hour == 0) {
          theTmt.tm_12hour = 12;
        } // 12-hour rollover
        theTmt.tm_pm = (theTmt.tm_hour > 11);
      } // min rollover
    } // sec rollover
  } // msec rollover
} // void tick()

void SetLEDpinModes() {
  pinMode(digitA1, OUTPUT); // = 8;
  pinMode(digitB2, OUTPUT); // = 9;
  pinMode(digitC4, OUTPUT); // = 10;
  pinMode(digitD8, OUTPUT); // = 11;
  pinMode(digSelA1, OUTPUT); // = 4;
  pinMode(digSelB2, OUTPUT); // = 5;
  pinMode(digSelC4, OUTPUT); // = 6;
  pinMode(digClock, OUTPUT); // = 7
}

A note about the hardware, each digit is latched, using it's own 4543 bcd-to-7-segment chip. Thus 4 bits representing the value of a digit are placed on a bus going to all chips. Another 3 bits plus a latch signal are sent to a 74hc238 3-to-8 decoder chip which selects which 4543 chip will get the data. Similar to multiplexing but I can send the data out in bcd format instead of figuring out which segments to light, and the chip latches the value so I get 100% brightness on every digit of the display at all times. I will send you eagle schematics upon request.

What digits? where ? What do the serial prints have to do with the digits ?
Some information is missing somewhere .
comment out all the serial prints and see how that effects your refresh rate

It's a clock. It has a 6-digit 7-segment LED display (hh:mm:ss) connected with some latches.

I DID comment out the one and only Serial.print and the Serial.begin, and got a refresh rate of only about 20Hz. The Serial.print was only there to begin with to get an idea of how long it was taking for each pass.

You have a 2nd void loop called out after the } that closes out void loop:

}void loop();

Not sure it does anything, but it can be deleted. You only need 1.

Assuming your problem is control bandwidth, (an assumption you should question) you need to have a good measure of it. Count the number of times loop executes and output the count divided by elapsed time every few seconds. That way your output time does not impact your measurement so much. Or you could output the rate to your LEDs for testing. Once you have a good measement in place you can then look at doing some profiling to see where the time is spent.

You seem to have an error on line 160 - missing // I suspect.

Because you have all the code in loop() it is difficult to figure what it's doing. Can you provide a few sentences (in English, not program code) to explain how it works.

What exactly do you mean by cross-fading digits? How do you do that? How often do the digits need to be updated? Presumably only the seonds digits need to be changed frequently. Though I can see that it might be simpler to update them all at the same time.

If you need to update the seconds digits 20 times per second (or even 40) to get a smooth fade that should hardly tax the Arduino. I can't see any reason not to organize the code into comprehensible functions. I presume updating a digit involves outputting a byte (or two) with the appropriate data.

...R

What exactly do you mean by cross-fading digits? How do you do that?

His name is Dr. Wizard. It's Magic !

I am a little intrigued by this problem, and since I can't immediately see how the OP's code is intended to work I thought I would figure out how I would write a similar program. The following sketch works as far as isolating the 6 decimal digits into an array. It displays the results in the Serial Monitor and it seems to iterate through loop() over 11,000 times in each half second. Presumably it would do a lot more if it didn't have to send stuff to the Serial monitor.

The only work left to do is to format the digits suitably so they can be sent to the digit displays. I don't immediatly know how to do that but I suspect that there are plenty of iterations available for it.

// MyClock.ino by Robin2 - 05 May 2014


unsigned long prevSecondMillis = 0;
unsigned long prevDelayMillis = 0;
unsigned long delayMillis = 500;
unsigned long curMillis;
byte seconds;
byte minutes;
byte hours;
byte clockDigits[6]; // hhmmss

unsigned long loopCount = 0;

void setup() {
  Serial.begin(9600);
  Serial.println("Starting MyClock.ino");
}


void loop() {
  curMillis = millis();
  loopCount ++;
  updateSeconds();
  updateMinutes();
  updateHours();
  updateDigits();
  showOnMonitor();
  
}


void updateSeconds() {
  if (millis() - prevSecondMillis >= 1000) {
    seconds ++;
    prevSecondMillis += 1000;
  }
}

void updateMinutes() {
  if (seconds >= 60) {
    minutes ++;
    seconds = 0;
  }
}

void updateHours() {
  if (minutes >= 60) {
    minutes = 0;
    hours ++;
    if (hours > 24) {
      hours = 0;
    }
  }
}

void updateDigits() {
  // look at each value and set the appropriate digits

  clockDigits[0] = hours / 10;
  clockDigits[1] = hours % 10;
  clockDigits[2] = minutes / 10;
  clockDigits[3] = minutes % 10;
  clockDigits[4] = seconds / 10;
  clockDigits[5] = seconds % 10;

}

void showOnMonitor() {
  if (curMillis - prevDelayMillis > delayMillis) {
    Serial.print(hours);
    Serial.print(" : ");
    Serial.print(minutes);
    Serial.print(" : ");
    Serial.print(seconds);
    
    Serial.print("  ----   ");
    
    for (byte n = 0; n < 6; n++) {
      Serial.print(clockDigits[n]);
    }
    
    Serial.print("   ----   ");
    Serial.print(loopCount);
    
    Serial.println();
    
    prevDelayMillis += delayMillis;
  }
}

...R

Hi Dr Wizard

Can I check my understanding?

  if (displTimeT.tm_msec >= 800) {
    // Display From: time, static
    whichTime = &displTimeF;
  } else if (displTimeT.tm_msec >= 600) {
    // Display To: time, static
    whichTime = &displTimeT;
  } else {
    // Fade

For 200ms each second (when displTimeT.tm_msec is in the range 600 to 799), the "To" time is displayed. For 200ms, (displTimeT.tm_msec = 800 to 999), the "From" time is displayed. For the other 600ms, the display cross-fades between the two times. Have I got that right?

    // Percentage of how far done we are, 0-1023 = 0.0% to 100.0%
    // Rather than using floating point math, shift the bits
    TickNo = (displTimeT.tm_msec + 500) << 10;
    pct = TickNo/600;
    // Generate psudo random number 0-1023
    micron = micros() | 0x3ff;
    // compare that to our "Percent" of 0-1023
    if (pct > micron) {
      whichTime = &displTimeT;
    } else {
      whichTime = &displTimeF;
    }

Then, to create the cross-fade, you are switching the display between the "To" and "From" times as you go round loop(). Which time is displayed is determined by comparing pct with a pseudo-random number between 0 and 1023. It looks like you are increasing pct during the cross-fade period to make it increasingly likely that (pct > micron) is true and, therefore, the "To" time is displayed rather than the "From" time. Is that right?

If so, the algorithm seems weighted heavily towards the "To" time - which may be intentional. I worked through the equations like this. I don't often use bit shift, so am ready to be shot down :slight_smile:

To be fading, displTimeT.tm_msec must be between 0 and 599 (see above).
Adding 500 gives 500 to 1099.
Shifting left by 10 bits means multiplying by 1024, and gives 512,000 to 1,125,276.
Dividing by 600 gives 853 to 1875.

Since micron will be in the range 0 to 1023, the expression (pct > micron) will always be true after about 20% of the cross-fade period has passed. At the start of the fade, when pct is 853, there is only roughly a 15% chance of it being false and the "From" time being displayed.

Apologies if this is intentional, and it may not be causing the flickering, but it looked like it could be relevant.

Regards

Ray

This is like throwing fresh meat in the middle of a pack of dogs ....

Does it work without the Cross-fade? Can you make the clock function properly without cross-fading the digit?

raschemmel:
This is like throwing fresh meat in the middle of a pack of dogs ....

I'm surprised. You are usually one of the pack :slight_smile:

...R

Robin2:

raschemmel:
This is like throwing fresh meat in the middle of a pack of dogs ....

I'm surprised. You are usually one of the pack :slight_smile:

...R

+1 :wink:

It's very frustrating to get interested in something and then hear nothing from the OP. :frowning:

...R

My problem turned out to be just the opposite of what I thought it was. It's running too FAST, which means my pseudo random number generating using micros() isn't generating very random numbers.

At a couple different points in my debugging I unremarked the Serial.begin and a single Serial.print that output the number of millis() since the last pass. I was getting 3-4. I can no longer replicate that, it is now mostly 0's and if I check the micros() instead of the millis() I get typically 56-64 microseconds per pass which is far better than I had hoped for. I'm still puzzling over why I was getting the 3-4 millis since I have not made any significant changes to the code since those tests. I did find the extra void loop() at end of the real loop(). That was actually supposed to be a comment regarding what the } was the end of. That was not however, the cause of my problem, I think the compiler completely ignored it because it had no code in it, or was redundantly named.

So now I need a new pseudo-random number generator that doesn't tax the system, but I have several ideas on how I might do that. All the bitwise math and general un-readability of the code was an attempt to shave off clock cycles when I thought it was running too slow. Now that I know it is running efficiently, I will reorganize the code to be more readable and maintainable, and put the refresh code back into it's own subroutine called by a timer.

For those of you curious about the cross fading, it is a common feature on some of the nixie-tube clocks. I think the effect is rather cool looking. Instead of just switching from say 3:49, the digits fade to 3:50 over roughly 1/2 second using PWM, starting approx 1/4 second before the 'real' time change, and ending approx 1/4 second after. Check out this video: https://www.youtube.com/watch?v=GiO6P6gcmE4

Thanks everyone!

Oh, and HackScribble, your analysis was pretty much spot on, except there is not any weighting towards the TO time. The pct value ends up in the range of 0-1023 corresponding to 0-100%. The pseudo-random number generated from micros is also in the range of 0-1023. So as the percentage increases, so does the likelihood that the random number will be less than the percentage. But at only approx 60 micros per pass, the random numbers weren't near random enough.

Why use a random number at all? It seems like the effect should be non-random. All you have to do is shift the percent over time.

Is there any reason you need to run so quickly?

Hi DrWizard

Your analysis was pretty much spot on, except there is not any weighting towards the TO time. The pct value ends up in the range of 0-1023 corresponding to 0-100%.

I've taken the code you posted and run it using the debugger in Atmel Studio to look at the values of displTimeT.tm_msec, TickNo and pct after the following are executed.

    // Percentage of how far done we are, 0-1023 = 0.0% to 100.0%
    // Rather than using floating point math, shift the bits
    TickNo = (displTimeT.tm_msec + 500) << 10;
    pct = TickNo/600;

displTimeT.tm_msec was in the range 0 to 572, which is not far off 0 to 599.

TickNo was in the range 3072 to 60416, which made pct 5 to 100.

This surprised me, and then I wondered if pct was actually meant to be 0 to 100 percent. But that didn't fit with the comparison with micron.

Looking at the code, TickNo is declared as long. So I tried changing the constant "500" to be "500L" indicating a long.

After this, TickNo changed to be in the range 512,000 to 1,097,728. And pct is in the range 853 to 1829, which is close to what I was expecting when I first looked at the code.

Without forcing the long value, the "<< 10" looks to have resulted in an integer overflow. As a result, pct was not in the expected range and would not increase smoothly from its lower to higher limits - the remainder of the overflow is in effect a mod function and pct would, I think, jump about repeatedly through the range.

Changing the two lines of code to the following gives (in my tests) a range for pct from 0 to 972, which I think is what you are looking for.

	TickNo = (long)(displTimeT.tm_msec ) << 10;
	pct = TickNo/600;

Regards

Ray

@DrWizard, I would like to see a little video clip of your system working.

The attached code does something similar by showing 5 versions of the time in succession. Normally all 5 are the same so the time is constant. When the time changes the new time is written into one of the time arrays so it appears for 1/5th of the time. Then it is copied into the 2nd array so it is displayed for 2/5ths etc.

Because I don't have the BCD-7 segment chips I just wired the 7 seg display direct to PortA on a Mega. (None of the ports on a Uno allows you to use all 8 bits).

// MyClock.ino by Robin2 - 05 May 2014
// http://forum.arduino.cc/index.php?topic=237932.0
// The function writeDigitsWithFade() is designed for a Mega using PORTA to light the segments


unsigned long prevSecondMillis = 0;
unsigned long prevDelayMillis = 0;
unsigned long delayMillis = 1000;
unsigned long curMillis;
unsigned long prevBCDMillis = 0;
unsigned long BCDdelayMillis = 10;
byte seconds;
byte minutes;
byte hours;
byte clockDigits[6][6]= {
                          {0,0,0,0,0,0},
                          {0,0,0,0,0,0},
                          {0,0,0,0,0,0},
                          {0,0,0,0,0,0},
                          {0,0,0,0,0,0},
                          {0,0,0,0,0,0}
                         };
byte numFades = 0;
byte curFade = 0;

byte lcdDigits[10] = { B11111100,
                       B01100000,
                       B11011010,
                       B11110010,
                       B01100110,
                       B10110110,
                       B10111110,
                       B11100000,
                       B11111110,
                       B11110110
                      };


unsigned long loopCount = 0;
unsigned long prevLoopCount = 0;

void setup() {
  Serial.begin(9600);
  Serial.println("Starting MyClock.ino");
  
 for (byte n = 22; n < 30; n++) {
   pinMode(n, OUTPUT);
 }
  
}


void loop() {
  curMillis = millis();
  loopCount ++;
  updateSeconds();
  updateMinutes();
  updateHours();
  updateDigits();
  writeDigitsWithFade();
//  simulateWriteDigits();
  showOnMonitor();
  
}


void updateSeconds() {
  if (millis() - prevSecondMillis >= 1000) {
    seconds ++;
    prevSecondMillis += 1000;
    numFades = 1; // start a new fade sequence if the seconds change
                  //   hence old fade must be finished in less than 1 second
  }
}

void updateMinutes() {
  if (seconds >= 60) {
    minutes ++;
    seconds = 0;
  }
}

void updateHours() {
  if (minutes >= 60) {
    minutes = 0;
    hours ++;
    if (hours > 24) {
      hours = 0;
    }
  }
}

void updateDigits() {
  // look at each value and set the appropriate digits

  clockDigits[0][0] = hours / 10;
  clockDigits[0][1] = hours % 10;
  clockDigits[0][2] = minutes / 10;
  clockDigits[0][3] = minutes % 10;
  clockDigits[0][4] = seconds / 10;
  clockDigits[0][5] = seconds % 10;

}

void writeDigitsWithFade() {
  
  // there are 5 arrays that are displayed in turn
  // first, only one array has the new time
  // and after all 5 have been diplayed another is updated untill all 5 are the same
  
  if (curMillis - prevBCDMillis > BCDdelayMillis) {
    prevBCDMillis += BCDdelayMillis;

    byte digiVal = clockDigits[curFade][5]; // just the seconds digit as I have only one 7 seg display
    PORTA = ~ lcdDigits[digiVal];

    curFade ++;
    if (curFade == 5) {
      updateClockDigits();
      curFade = 0;
    }
  }
}


void updateClockDigits() {
  for (byte n = 0; n < 6; n++) {
      clockDigits[numFades][n] = clockDigits[numFades - 1][n];
  }

  numFades ++;
  if (numFades > 5) {
    numFades = 5;
  }
}

void showOnMonitor() {
  if (curMillis - prevDelayMillis > delayMillis) {
    Serial.print(hours);
    Serial.print(" : ");
    Serial.print(minutes);
    Serial.print(" : ");
    Serial.print(seconds);
    
    Serial.print("  ----   ");
    
    for (byte n = 0; n < 6; n++) {
      Serial.print(clockDigits[0][n]);
    }
    
    Serial.print("   ----   ");
    Serial.print(loopCount);
    Serial.print("  ");
    Serial.print(loopCount - prevLoopCount);
    prevLoopCount = loopCount;
    
    Serial.println();
    
    prevDelayMillis += delayMillis;
  }
}

...R

Well, here's my latest version. I've got the display refreshing at 1KHz. The fade time is 512ms or about 1/2 second, or 512 out of 1000 cycles of the refresh routine. I have pre-rendered a fade pattern in the fadeData array which takes it from 1/8 to 7/8 ratio of FromTime to ToTime. The refresh routine is pretty efficient running at under 100us per pass, or about 10% of the CPU time is spent on the display. Lots of debugging with Serial.prints indicates it seems to be working as I intended. And although the results are much improved, and the fade certainly seems smoother, there is still some perceptible flicker. I think the remaining flicker may be a persistence-of-vision effect from the eye and brain. And I think this is probably more pronounced with LEDs as opposed to nixie tubes, since I suspect that the ionizing neon gas in a nixie tube does not have the sharp cut-on / cut-off of a LED. (???)
I've still got some more ideas, including using a 'scope or logic analyzer to really verify my fade pattern, stretching the fade time to be most of the second, or shrinking it to 1/4 of a second, and maybe some different fade patterns. But for now, I am putting this project on the back burner to let it simmer in my brain for a while while I tinker with some other projects, including using a WTV020 sound module to make the clock talk.

Thanks again to everyone for their input and suggestions, and I still welcome any new suggestions!

FadingClock.ino (7.95 KB)

If the fade time is 512ms which is > 50% of the time, doesn't mean it is always fading, or worse, overrunning the last fade?