Pages: 1 [2]   Go Down
Author Topic: #define versus const and other software performance issues  (Read 5166 times)
0 Members and 1 Guest are viewing this topic.
Global Moderator
Dallas
Offline Offline
Shannon Member
*****
Karma: 208
Posts: 12932
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset


Actually, I believe there is no add-immediate instruction.  Yup.  There is an add-immediate-word but no add-immediate-byte.

When Atmel says the AVR processor is "RISC", they aren't kidding!
Logged

Global Moderator
Offline Offline
Brattain Member
*****
Karma: 485
Posts: 18807
Lua rocks!
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Well I suppose they don't need it, and subtracting is arguably more useful than adding.
Logged


Seattle, WA USA
Offline Offline
Brattain Member
*****
Karma: 614
Posts: 49386
Seattle, WA USA
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
If you're working with integers, you can also do bit shifting to divide by 10. If you do something weird like this, make sure you comment what you did.
Please explain how. Bit shifting one place divides by 2. Then, you still need to divide by 5. I can't figure how how to shift any direction to divide by 5.
Logged

Global Moderator
Offline Offline
Brattain Member
*****
Karma: 485
Posts: 18807
Lua rocks!
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

A bit of Googling and I found an answer for at least an approximation. And hey, isn't most stuff an approximation? This test sketch:

Code:
const boolean SHOW_RESULTS = false;
const unsigned long ITERATIONS = 50000;

void setup ()
{

  Serial.begin (115200);
  Serial.println ();

  unsigned long start;
 

  Serial.println ("By shifting:");
 
  start = millis ();

  for (unsigned long n = 0; n < ITERATIONS; n++)
  {
    volatile unsigned long n2, n4, n32, n64, n102, ans;

    n2 = n     << 1;    // n * 2
    n4 = n2    << 1;    // n * 4
    n32 = n4   << 3;    // n * 32
    n64 = n32  << 1;    // n * 64
    n102 = n64 + n32 + n4 + n2;  // n * 102
    ans = n102 >> 10;  // n * 102 / 1024

    if (SHOW_RESULTS)
    {
      Serial.print ("n = ");
      Serial.print (n, DEC);
      Serial.print (" n / 10 = ");
      Serial.println (ans, DEC);
    }

  } // end for loop

  Serial.print ("time taken = ");
  Serial.println (millis () - start, DEC);

  Serial.println ("By multiplying and then shifting:");
 
  start = millis ();

  for (unsigned long n = 0; n < ITERATIONS; n++)
  {
    volatile unsigned long ans;

    ans = (n * 102) >> 10;  // n * 102 / 1024

    if (SHOW_RESULTS)
    {
      Serial.print ("n = ");
      Serial.print (n, DEC);
      Serial.print (" n / 10 = ");
      Serial.println (ans, DEC);
    }

  } // end for loop

  Serial.print ("time taken = ");
  Serial.println (millis () - start, DEC);
 
  Serial.println ("By dividing:");
 
  start = millis ();

  for (unsigned long n = 0; n < ITERATIONS; n++)
  {
    volatile unsigned long ans;

    ans = n / 10;

    if (SHOW_RESULTS)
    {
      Serial.print ("n = ");
      Serial.print (n, DEC);
      Serial.print (" n / 10 = ");
      Serial.println (ans, DEC);
    }

  } // end for loop

  Serial.print ("time taken = ");
  Serial.println (millis () - start, DEC);

} // end setup

void loop() {}

As written, I get this:

Code:
By shifting:
time taken = 771
By multiplying and then shifting:
time taken = 294
By dividing:
time taken = 1956

So, the basic method is to multiply by 102 and then shift right 10 bits (ie. divide by 1024). This is an approximation in the sense that we want to multiply by 0.1 but are actually multiplying by 102/1024 (0.09961).

I tried three ways, the first was to multiply by 102 by doing assorted shifts, the second was to let the compiler do the multiplication, and the third to let the compiler do the division.

The figures show that the division was the slowest (1956 mS), my shifting idea second slowest (771 mS) and the multiply-then-shift fastest (294 mS). In fact the multiply-then-shift is over 6 times as fast, if you can stand the loss of precision.

Obviously this is only really useful if you know the numbers in advance. It's also interesting that my attempt to second-guess the compiler and do a "more efficient" multiply by 102 was actually quite a lot slower than what the compiler generated, left to its own devices.
Logged


Offline Offline
God Member
*****
Karma: 32
Posts: 507
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

Your method would be faster if you hadn't made the intermediates n2, n4, n32, n64 and n102 unnecessarily volatile. You could make it slightly faster still with the chain 2,3,6,96,102.

The reason the multiply method is fastest is because the ATmega has a 2 cycle 8-bit multiply instruction built in. You really want the shift to be 8 or 16 bits as then the shifting can be done for free (the compiler just moves the bytes in the long down by 1 or 2 places). Try (n*6554)>>16
Logged


Global Moderator
Offline Offline
Brattain Member
*****
Karma: 485
Posts: 18807
Lua rocks!
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset

Quote
Your method would be faster if you hadn't made the intermediates n2, n4, n32, n64 and n102 unnecessarily volatile

You are right. I had volatile there to convince the compiler to do the work at all, but making only ans volatile gave me this:

Code:
By shifting:
time taken = 497
By multiplying and then shifting:
time taken = 294
By dividing:
time taken = 1956



Quote
Try (n*6554)>>16

Good call. I got:

Code:
By multiplying and then shifting:
time taken = 75

Closer too. Rather than multiplying by 0.1 we are multiplying by 0.1000061.
Logged


Global Moderator
Dallas
Offline Offline
Shannon Member
*****
Karma: 208
Posts: 12932
View Profile
WWW
 Bigger Bigger  Smaller Smaller  Reset Reset


Which is about the same number of significant digits expected from float.  Orders of magnitude faster and just as accurate.  What's not to like!
Logged

Global Moderator
Netherlands
Offline Offline
Shannon Member
*****
Karma: 217
Posts: 13739
In theory there is no difference between theory and practice, however in practice there are many...
View Profile
 Bigger Bigger  Smaller Smaller  Reset Reset

wrote a shift multiply finder few years back in C# - http://www.codeproject.com/KB/cs/FindMulShift.aspx -
Logged

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Pages: 1 [2]   Go Up
Jump to: