
Topic: Code size puzzlement

dougp

In an attempt to squeeze a few bytes out of my timer class (final posted version at post #9) I modified this logic test:

Code: [Select]

      /* line 130
        ----- condition the timer timing status
      */
// original test
     
            if ((_Enable or _Control) and !_Done and !_Reset) {
              _TimerRunning = true;
            }
            else _TimerRunning = false;

 // 'improved' test

//      _TimerRunning = (_Enable or _Control) and !_Done and !_Reset;



Surprisingly, the improved version takes more flash.  This seemed odd, so I made up a small standalone sketch to test just this code.  Print statements were included to verify that the logic operates the same way for both:

Code: [Select]

/*
   Comparison of two approaches to boolean logic
*/
bool y;
bool z;

void setup() {

  Serial.begin(57600);

  for (int i = 0; i < 16; i++) {
    bool cond0 = i & 1;
    bool cond1 = i & 2;
    bool cond2 = i & 4;
    bool cond3 = i & 8;

 if ((cond0 or cond1) and !cond2 and !cond3) {
          z = true;
        }
        else z = false;
    //
    // versus
    //
//    bool  y = (cond0 or cond1) and !cond2 and !cond3;

    Serial.print(y);
    Serial.print("  ");
    Serial.println(z);
  }
}
void loop() {
  // put your main code here, to run repeatedly:

}
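
As a cross-check that does not depend on the serial monitor, the two forms can also be compared on a desktop compiler. This is only a minimal host-side sketch and not part of the original post; the function names viaIfElse, viaAssignment and formsAgree are made up for illustration.

```cpp
// Hypothetical helper names, for illustration only.

// if/else form, as in the original test
bool viaIfElse(bool c0, bool c1, bool c2, bool c3) {
  bool result;
  if ((c0 or c1) and !c2 and !c3) {
    result = true;
  }
  else result = false;
  return result;
}

// direct assignment form
bool viaAssignment(bool c0, bool c1, bool c2, bool c3) {
  return (c0 or c1) and !c2 and !c3;
}

// Walk all 16 input combinations, same as the loop in the sketch,
// and report whether the two forms ever disagree.
bool formsAgree() {
  for (int i = 0; i < 16; i++) {
    bool c0 = i & 1;
    bool c1 = i & 2;
    bool c2 = i & 4;
    bool c3 = i & 8;
    if (viaIfElse(c0, c1, c2, c3) != viaAssignment(c0, c1, c2, c3)) {
      return false;
    }
  }
  return true;
}
```

If formsAgree() returns true, any difference in flash usage comes from how the compiler translates the two forms, not from a difference in logic.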


My results from testing by commenting out sections of code:

condition   class version size   standalone size
no test     1880                 1766
if/else     1904                 1806
boolean     1918                 1794

The if/else class version takes fourteen fewer bytes than the boolean version, while the standalone if/else takes twelve more bytes than the boolean version.  I believed that the class version would have a more complex addressing mechanism than the one used with the simple variables in the standalone sketch, but maybe it doesn't.  Anyway, all things being equal, I at least expected the code size to change in the same direction for both.  I tested all the variations three times and got the same results, so I'm pretty sure I'm seeing what I think I'm seeing.

Explanations?  Theories?  Something blindingly obvious I'm just missing?
So two neutrinos went into a bar.  Nothing happened.  They were just passing through.

DKWatson

Don't confuse size with speed. At times the compiler, if set to optimize for speed, will unroll loops (as an example), which creates faster but larger code. The only true way to compare is to compile both with -O0.
Live as if you were to die tomorrow. Learn as if you were to live forever. - Mahatma Gandhi

dougp

The only true way to compare is to compile both with -O0.
How and where is this done?  A search turned up the GCC command options, but it's short on exactly how to use them.  I use the downloaded IDE v. 1.8.2.

DKWatson

You can modify the compiler parameters or use the optimize pragma,

#pragma GCC optimize ("-Ox")

where x is the level of optimization (s, 0, 1, 2 or 3 - 0 being none)
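
For reference, a minimal sketch of where the pragma can go. The option letter is a capital O; lowercase -o is the GCC option for naming the output file. The function names below are hypothetical. As far as I know, push_options/pop_options limit the setting to the functions defined in between, while the pragma on its own applies to every function defined after it in the file.

```cpp
#include <stdint.h>

// Hypothetical function names, for illustration only.

#pragma GCC push_options
#pragma GCC optimize ("O0")   // capital O: request no optimization from here on

uint8_t timerRunningUnoptimized(bool enable, bool control, bool done, bool reset) {
  return (enable or control) and !done and !reset;
}

#pragma GCC pop_options       // back to whatever the command line asked for

uint8_t timerRunningDefault(bool enable, bool control, bool done, bool reset) {
  return (enable or control) and !done and !reset;
}
```

Both functions compute the same thing, so comparing their disassembly is one way to see what a given level changes without touching the IDE's compiler settings.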

dougp

I went with the #pragma GCC optimize ("-o3") option and tested settings from 0-3.  I observed no difference in the numbers for either program.

oqibidipo

I went with the #pragma GCC optimize ("-o3") option
Uppercase or lowercase O?

dougp

The first test I tried was with -O0.  For the class program this gave the error

 C:\Users\Owner\Documents\Arduino\PLCtimer_polymorphic/PLCtimer_polymorphic.ino:37: undefined reference to `vtable for PLCtimer'

so the rest of the tests were with lowercase o.  I retested all with uppercase O; the results follow.


Code: [Select]

test       -O level     test pgm size   class pgm size
-------------------------------------------------------
boolean    0            2240            err
if/else    0            2240            err
boolean    1            2102            2120
if/else    1            2102            2120
boolean    2            1840            2024
if/else    2            1856            2022
boolean    3            1840            2024
if/else    3            1856            2022
boolean    no pragma    1794            1918
if/else    no pragma    1806            1904


It looks like, in general, every pragma setting increases code size over the build with no pragma, sometimes dramatically, sometimes not so much.  And although the boolean test for the class still takes more memory at levels 2 and 3, it's only two bytes more versus fourteen with no pragma.
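
On the "undefined reference to vtable for PLCtimer" error at -O0: that linker message usually means a class declares a non-pure virtual function that is never defined, so the vtable is never emitted. A plausible reason it only shows up at -O0 is that higher levels optimize away the one place that refers to the base-class vtable. That is a guess from the error text, not something verified against the timer sketch. A minimal sketch of the pattern and one possible fix, with made-up class names:

```cpp
// Hypothetical classes, for illustration only.

class BaseTimer {
  public:
    // Writing 'virtual void Reset();' here with no body anywhere can
    // produce "undefined reference to vtable". Making the function
    // pure virtual states that derived classes must supply it.
    virtual void Reset() = 0;

    int update() {
      Reset();          // dispatches to the derived class
      return _count;
    }

  protected:
    int _count = 5;
};

class WatchdogTimer : public BaseTimer {
  public:
    void Reset() override {
      _count = 0;
    }
};

// Small driver: builds a derived timer and runs one update.
int runUpdate() {
  WatchdogTimer wdt;
  return wdt.update();
}
```

The alternative is to give the base-class function an empty body, which keeps the base class instantiable at the cost of a do-nothing default.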



DKWatson

So, after all that, are we any wiser? Never lose sight of the fact that at the lowest level there are dozens of ways of accomplishing the same thing. The compiler chooses one. My experience until now is that GCC does a pretty good job. Still, if you really want to get your hands dirty (so to speak), find the object code (that's the file with the .o extension) and run avr-objdump -S file_name.o > file_name.txt.

That will give you a text file listing of the compiled code, memory address, hex code and assembly code intermixed with the C code from your sketch. In this way you see exactly what's going on behind the green door. Then you have the (unenviable) ability to tweak the code and optimize all by your lonesome. Not necessarily recommended, but sometimes fun.

dougp

#8
Sep 25, 2018, 05:40 am Last Edit: Sep 25, 2018, 05:46 am by dougp
So, after all that, are we any wiser?
I'm probably past that.  But, I have seen that the compiler has bunches of options if you care to dig.  And, that it can be tweaked in myriad ways.  Thanks for that.

I attempted a text file dump through the command line but no joy.  See command line snip - I hope it's readable.  This is  done through the command line, right?



That will give you a text file listing of the compiled code, memory address, hex code and assembly code intermixed with the C code from your sketch.
I enjoy poking through that stuff even though I don't have all the mnemonics memorized or enough familiarity to make intelligent changes.

Hat tip to @oqibidipo for the upper/lower case prod.

DKWatson

I've re-written your test case a bit just to get rid of the Arduino stuff.
Code: [Select]
#include    <stdint.h>
#include    <string.h>
#include    <avr/io.h>
/*
Comparison of two approaches to boolean logic
*/
void usart0Init(uint32_t);
void prt(uint8_t);
//=======1=========2=========3=========4=========5=========6=========7=========8
//=======1=========2=========3=========4=========5=========6          usart0Init
//
void usart0Init(uint32_t brate)
{
    UCSR0A = 0 << TXC0 | 0 << U2X0 | 0 << MPCM0;
    UCSR0B = (1 << RXCIE0 ) | (1 << RXEN0 ) | (1 << TXEN0 );
    UCSR0C = (1 << UCSZ00 ) | (1 << UCSZ01 );
    UBRR0  = ((F_CPU / 16) / brate) - 1;
}
//=======1=========2=========3=========4=========5=========6=========7=========8
//=======1=========2=========3=========4=========5=========6                prt
//
inline void prt(uint8_t x) {while(!(UCSR0A & (1<<UDRE0)));UDR0 = x;}
//=======1=========2=========3=========4=========5=========6=========7=========8

bool y;
bool z;

int main()
{

    usart0Init(57600);

    for (int i = 0; i < 16; i++) {
        bool cond0 = i & 1;
        bool cond1 = i & 2;
        bool cond2 = i & 4;
        bool cond3 = i & 8;

        //if ((cond0 or cond1) and !cond2 and !cond3) {
        //    z = true;
        //}
        //else z = false;
        //
        // versus
        //
        y = (cond0 or cond1) and !cond2 and !cond3;

        prt(y + 48);
        prt(32);
        prt(z + 48);
        prt(10);
    }

    while(1) {
        // put your main code here, to run repeatedly:

    }
}
Using the boolean assignment, this compiles to 256 bytes (262 when built outside the IDE). Changing to,
Code: [Select]
if ((cond0 or cond1) and !cond2 and !cond3) {
    z = true;
}
else z = false;

// versus
//
//y = (cond0 or cond1) and !cond2 and !cond3;
compiles to 264 bytes (234 when built outside the IDE).

main() for if test,
Code: [Select]
0000009e <main>:
  9e: 10 92 c0 00 sts 0x00C0, r1 ; 0x8000c0 <__TEXT_REGION_LENGTH__+0x7e00c0>
  a2: 88 e9        ldi r24, 0x98 ; 152
  a4: 80 93 c1 00 sts 0x00C1, r24 ; 0x8000c1 <__TEXT_REGION_LENGTH__+0x7e00c1>
  a8: 86 e0        ldi r24, 0x06 ; 6
  aa: 80 93 c2 00 sts 0x00C2, r24 ; 0x8000c2 <__TEXT_REGION_LENGTH__+0x7e00c2>
  ae: 80 e1        ldi r24, 0x10 ; 16
  b0: 90 e0        ldi r25, 0x00 ; 0
  b2: 90 93 c5 00 sts 0x00C5, r25 ; 0x8000c5 <__TEXT_REGION_LENGTH__+0x7e00c5>
  b6: 80 93 c4 00 sts 0x00C4, r24 ; 0x8000c4 <__TEXT_REGION_LENGTH__+0x7e00c4>
  ba: d0 e0        ldi r29, 0x00 ; 0
  bc: c0 e0        ldi r28, 0x00 ; 0
  be: 11 e0        ldi r17, 0x01 ; 1
  c0: ce 01        movw r24, r28
  c2: 83 70        andi r24, 0x03 ; 3
  c4: 99 27        eor r25, r25
  c6: 89 2b        or r24, r25
  c8: c9 f0        breq .+50      ; 0xfc <main+0x5e>
  ca: ce 01        movw r24, r28
  cc: 8c 70        andi r24, 0x0C ; 12
  ce: 99 27        eor r25, r25
  d0: 89 2b        or r24, r25
  d2: a1 f4        brne .+40      ; 0xfc <main+0x5e>
  d4: 10 93 00 01 sts 0x0100, r17 ; 0x800100 <_edata>
  d8: 80 e0        ldi r24, 0x00 ; 0
  da: 0e 94 48 00 call 0x90 ; 0x90 <_Z3prth>
  de: 80 e2        ldi r24, 0x20 ; 32
  e0: 0e 94 48 00 call 0x90 ; 0x90 <_Z3prth>
  e4: 80 91 00 01 lds r24, 0x0100 ; 0x800100 <_edata>
  e8: 0e 94 48 00 call 0x90 ; 0x90 <_Z3prth>
  ec: 8a e0        ldi r24, 0x0A ; 10
  ee: 0e 94 48 00 call 0x90 ; 0x90 <_Z3prth>
  f2: 21 96        adiw r28, 0x01 ; 1
  f4: c0 31        cpi r28, 0x10 ; 16
  f6: d1 05        cpc r29, r1
  f8: 19 f7        brne .-58      ; 0xc0 <main+0x22>
  fa: ff cf        rjmp .-2      ; 0xfa <main+0x5c>
  fc: 10 92 00 01 sts 0x0100, r1 ; 0x800100 <_edata>
 100: eb cf        rjmp .-42      ; 0xd8 <main+0x3a>


main() for bool test,
Code: [Select]
0000008e <main>:
  8e: 10 92 c0 00 sts 0x00C0, r1 ; 0x8000c0 <__TEXT_REGION_LENGTH__+0x7e00c0>
  92: 88 e9        ldi r24, 0x98 ; 152
  94: 80 93 c1 00 sts 0x00C1, r24 ; 0x8000c1 <__TEXT_REGION_LENGTH__+0x7e00c1>
  98: 86 e0        ldi r24, 0x06 ; 6
  9a: 80 93 c2 00 sts 0x00C2, r24 ; 0x8000c2 <__TEXT_REGION_LENGTH__+0x7e00c2>
  9e: 80 e1        ldi r24, 0x10 ; 16
  a0: 90 e0        ldi r25, 0x00 ; 0
  a2: 90 93 c5 00 sts 0x00C5, r25 ; 0x8000c5 <__TEXT_REGION_LENGTH__+0x7e00c5>
  a6: 80 93 c4 00 sts 0x00C4, r24 ; 0x8000c4 <__TEXT_REGION_LENGTH__+0x7e00c4>
  aa: d0 e0        ldi r29, 0x00 ; 0
  ac: c0 e0        ldi r28, 0x00 ; 0
  ae: ce 01        movw r24, r28
  b0: 83 70        andi r24, 0x03 ; 3
  b2: 99 27        eor r25, r25
  b4: 89 2b        or r24, r25
  b6: 31 f0        breq .+12      ; 0xc4 <main+0x36>
  b8: 81 e0        ldi r24, 0x01 ; 1
  ba: 9e 01        movw r18, r28
  bc: 2c 70        andi r18, 0x0C ; 12
  be: 33 27        eor r19, r19
  c0: 23 2b        or r18, r19
  c2: 09 f0        breq .+2      ; 0xc6 <main+0x38>
  c4: 80 e0        ldi r24, 0x00 ; 0
  c6: 0e 94 40 00 call 0x80 ; 0x80 <_Z3prth>
  ca: 80 e2        ldi r24, 0x20 ; 32
  cc: 0e 94 40 00 call 0x80 ; 0x80 <_Z3prth>
  d0: 80 e0        ldi r24, 0x00 ; 0
  d2: 0e 94 40 00 call 0x80 ; 0x80 <_Z3prth>
  d6: 8a e0        ldi r24, 0x0A ; 10
  d8: 0e 94 40 00 call 0x80 ; 0x80 <_Z3prth>
  dc: 21 96        adiw r28, 0x01 ; 1
  de: c0 31        cpi r28, 0x10 ; 16
  e0: d1 05        cpc r29, r1
  e2: 29 f7        brne .-54      ; 0xae <main+0x20>
  e4: ff cf        rjmp .-2      ; 0xe4 <main+0x56>

DKWatson

Complete listings of the two options.

westfw

Quote
Code: [Select]
  public:
    unsigned long _Accumulator = 0;
    byte _Reset: 1;
    bool _Enable: 1;
    byte _Done: 1;
    byte _OSRise: 1;
    bool _Control: 1;

Ok.  Why are some of those bytes and some bools, and what's with the ":1"?
On the off chance that the compiler will actually pack multiple bits into a single byte (does it?  It would be likely inside a struct and unlikely for local variables in a function, but a "class" has a sort of "in between" status), the boolean expression might decide it has to unpack them to get actual boolean values, while the if/else just has to do bit tests (which are single instructions on AVR, under some circumstances).


Do you have a working sample of the class-based code that exhibits this behavior?  I can't compile your #9 code from the other thread because I don't have "bounce2"

dougp

Ok.  Why are some of those bytes and some bools, and what's with the ":1"?
All have been converted to bools since that code was posted.  It was my understanding that the ":n" forces the compiler (to the extent that the compiler can be 'forced' to do anything) to pack the designated variables into sub-byte fields n bits wide.
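
Whether the flags really end up sharing a byte can be checked with sizeof. Bit-field layout is implementation-defined, so treat the following as a sketch of what GCC typically does rather than a guarantee; the struct and function names are invented for the example.

```cpp
// Hypothetical structs, for illustration only.

struct PackedFlags {      // five one-bit fields
  bool reset   : 1;
  bool enable  : 1;
  bool done    : 1;
  bool osRise  : 1;
  bool control : 1;
};

struct PlainFlags {       // five ordinary bools
  bool reset;
  bool enable;
  bool done;
  bool osRise;
  bool control;
};

// With GCC the bit-field version usually fits in a single byte,
// while the plain version takes one byte per flag.
unsigned packedSize() { return sizeof(PackedFlags); }
unsigned plainSize()  { return sizeof(PlainFlags); }

// Reading a bit-field means extracting it from the shared byte;
// the value handed back is still an ordinary true/false.
bool readDone(PackedFlags f) { return f.done; }
```

The RAM saving comes with extra instructions to mask or test individual bits, which is one possible source of the flash difference between the two forms of the test.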

Do you have a working sample of the class-based code that exhibits this behavior?  I can't compile your #9 code from the other thread because I don't have "bounce2"
An abbreviated version with bounce commented out.  Only the watchdog timer class is included in order to fit into the forum's 9k character limit.

Code: [Select]

/*
    Basic timer converted to polymorphic/inherited form.
    Utilizes virtual function to effect timer reset.

    reference Arduino.cc thread http://forum.arduino.cc/index.php?topic=567010.new#new
*/

// debounce disabled for users without the library
//#include <Bounce2.h>

#define BUTTON_PIN 6   // to enable the timer DIO6
#define RESET_PIN 8    // to reset a timer DIO8
#define externalLED1 7 //  +5--/\/\/-->|--DIO7

byte LED_PIN = 13;    // on board LED

// Instantiate a Bounce object
//Bounce debouncer = Bounce();

// create a class called 'PLCtimer'

class PLCtimer {
    /*
       This is the base class - All types:
       > produce a positive-going one-scan pulse 'os' upon reaching
       preset value.
       > respond to a reset command by setting accumulated value to
       zero and resetting done and tt.
       > set the done bit upon reaching preset.
    */

  public:
    // constructor - permits setting of variables upon instantiation
    // Timers are instantiated with enable false by default

    // no class-name qualifier needed on a constructor defined inside the class
    PLCtimer(unsigned long pre, boolean en = 0)
    {
      _Preset = pre;
      negativePresetWarning(pre); // preset validation
    }; // end constructor
    /*
       User-added error handler from pert @ Arduino.cc, post #13
       thread http://forum.arduino.cc/index.php?topic=559655.new#new
       Timers may not have a negative preset value.
    */
    void negativeError( void ) __attribute__((error("Negative PLCtimer preset! Check instantiations!")));
    void negativePresetWarning(unsigned long number) {
      if (number < 0) {
        negativeError();
      }
    }
    /*
      ===== Access functions for timer status/controls
    */
    // Allows to start/stop a timer
    void setEN(bool en) {
      _Enable = en;
    }
    // Allows to reset a timer
    void setres(bool res) {
      _Reset = res;
    }
    // Returns enable state of timer
    byte getEN() {
      return _Enable;
    }
    // Returns reset state of timer
    bool getres() {
      return _Reset;
    }
    // Returns done status of timer
    bool getDn() {
      return _Done;
    }
    // Returns timer timing state of timer
    bool getTt() {
      return _TimerRunning;
    }
    // Returns timer timing state of timer
    bool getIntv() {
      return _TimerRunning;
    }
    // Returns state of timer done rising one-shot
    bool getOSRise() {
      return _OSRise;
    }
    // Returns state of timer done falling one-shot
    bool getOSFall() {
      return _OSFall;
    }
  private:
    /*
       Virtual timer Reset function
       Reset conditions to be determined by descendants
    */
    virtual void Reset();

  public:

    //    Function to operate timers created under PLCtimer
    //    ----------------------------------------
    //    Update timer accumulated value & condition
    //    flags 'tt' (timer timing) and 'dn' (done) based
    //    on timer type.
    //    Function returns boolean status of done, 'dn'
    //    ===========================

    boolean update() {
      _CurrentMillis = millis(); // Get system clock ticks
      if (_Enable or _Control) { // timer is enabled to run
        _Accumulator = _Accumulator + _CurrentMillis - _LastMillis;
        if (_Accumulator >= _Preset) { // timer done?
          _Accumulator = _Preset; // Don't let accumulator run away
          _Done = true;
        }
      }
      _LastMillis = _CurrentMillis;

      Reset();  // Call virtual function to reset timer based
      //           on derived class' criteria.
      /*
        ----- Generate a positive going one-shot pulse on timer done f-t transition
      */
      _OSRise = (_Done and _OSRSetup);
      _OSRSetup = !_Done;
      /*
        ----- and another positive going one-shot pulse on timer done t-f transition
      */
      _OSFall = (!_Done and _OSFSetup);
      _OSFSetup = _Done;
      /*
        ----- condition the timer timing status
      */
      if ((_Enable or _Control) and !_Done and !_Reset) {
        _TimerRunning = true;
      }
      else _TimerRunning = false;
      return _Done;
    }; // end of base class update Timer operation

  public:
    unsigned long _Accumulator = 0;
    bool _Reset: 1;
    bool _Enable: 1;
    bool _Done: 1;
    bool _OSRise: 1;
    bool _Control: 1;

  private:
    unsigned long _Preset;
    unsigned long _CurrentMillis;
    unsigned long _LastMillis = 0;
    bool _TimerRunning: 1;
    bool _OSFall: 1;
    bool _OSRSetup: 1;
    bool _OSFSetup: 1;

}; // end of class PLCtimer

/*  Define various types of timers derived from PLCtimer.  The timers
    differ in functionality in the reset methods.
*/
//                    Watchdog timer
//---------------------------------------------------------------
// Runs when enable is true. A change of state on the 'control'
// input applies a momentary reset to the timer and restarts the
// timing cycle. Continuous cycling of the control input at a rate
// faster than the time delay will cause the done status flag to
// remain low indefinitely.
//----------------------------------------------------------------

class WatchDog_tmr : public PLCtimer
{
  public:
    WatchDog_tmr(unsigned long pre): PLCtimer(pre)
    {}
    virtual void Reset() {
      /*
        Generate a positive going one-shot pulse whenever control
        input undergoes a state change.
      */
      _WD_OSRise = (_Control and _WD_OSRSetup);
      _WD_OSRSetup = !_Control;
      _WD_OSFall = (!_Control and _WD_OSFSetup);
      _WD_OSFSetup = _Control;

      if (_WD_OSFall or _WD_OSRise) { // enable did a transition
        _Done = false;
        _Accumulator = 0;
      }
    }
  private:

    bool _WD_OSRise: 1;
    bool _WD_OSFall: 1;
    bool _WD_OSFSetup: 1;
    bool _WD_OSRSetup: 1;
}; // end of class watchdog timer

//---------------
// Instantiate timer object(s) - All timers inherit from PLCtimer
//---------------
//

WatchDog_tmr WDT01(900UL);
WatchDog_tmr WDT02(1900UL);
WatchDog_tmr WDT03(750UL);
WatchDog_tmr WDT04(750UL);
WatchDog_tmr WDT05(750UL);
WatchDog_tmr WDT06(750UL);
WatchDog_tmr WDT07(75000UL);
WatchDog_tmr WDT08(750UL);
WatchDog_tmr WDT09(9000UL);
WatchDog_tmr WDT10(9750UL);

void setup() {

  // Configure pushbuttons with an internal pull-up :
  pinMode(BUTTON_PIN, INPUT_PULLUP);
  pinMode(RESET_PIN, INPUT_PULLUP);

  // After setting up the button, setup the Bounce instance :
//  debouncer.attach(BUTTON_PIN);
//  debouncer.interval(5); // interval in ms

  //Setup the LEDs :
  pinMode(LED_PIN, OUTPUT);
  pinMode(externalLED1, OUTPUT);

  int counter;
  //
  //  Serial.begin(230400); // open serial monitor comms.
  WDT01.setEN(HIGH);  // enable watchdog timer to run
  WDT02.setEN(HIGH);  // enable watchdog timer to run
}

void loop() {
  static int count;
 
  //============= Watchdog timer test

  //   Enable set high in setup()

  WDT01.update();
  WDT01._Control = (digitalRead(BUTTON_PIN)); // activity on control input resets timer
  if (WDT01.getDn()) {
    digitalWrite(externalLED1, LOW);
  }
  else digitalWrite(externalLED1, HIGH);

/*
  //============= Watchdog timer no. 2 test

  //   Enable set high in setup()

  WDT02.update();
  WDT02._Control = (digitalRead(BUTTON_PIN)); // activity on control input resets timer
  if (WDT02.getDn()) {
    digitalWrite(externalLED1, LOW);
  }
  else digitalWrite(externalLED1, HIGH);
*/

} //end of loop



dougp

@DKWatson, thanks much for the listings.  It's a bit thick but from the little I've pieced out so far the compiler has mysterious ways!

I'm still not clear on how to generate a listing on my own.  Was the snip I posted legible?

DKWatson

I'm still not clear on how to generate a listing on my own.  Was the snip I posted legible?
Snip was fine.

To create the assembly listing from the IDE compile, you first have to navigate to the .elf file. To avoid a huge headache, do this from the command line.

From within Windows (File Explorer), go to C:\Users\your_user_name\AppData\Local\Temp\arduino_build_xxxxxx

The tricky bit is that xxxxxx is a number that doesn't appear to be related to anything, so check the timestamp to see if it's just been created. Inside this directory you should see all the files that were created from the most recent build, including the .elf file.

Make sure that there are NO FILES SELECTED and, while holding down <shift>, RIGHT click your mouse and you should see the option to 'Open command window here'. Click that to open a command window (ya think?). From the command line (and this assumes your system PATH includes the avr-gcc directories) type,

avr-objdump -S arduino_filename.ino.elf > file_name.asm.lst

This will disassemble the Executable and Linkable Format file and redirect the output to file_name.asm.lst (which is an ASCII text file that you can name anything you want). The resulting listing embeds the Arduino code with the hex/assembly output so you can see the 1:1 translation, sometimes on a command-by-command basis.

If your Arduino executables are not in your system PATH, you should think about modifying the PATH; otherwise you have no option but to start avr-objdump with its complete directory path.
