
Topic: Why does this method use more memory?

CopperDog

Why does using an array to set pin states use 14 bytes more memory than setting each pin individually ?


Is it the room needed to increment the index? It isn't the outPins array itself, because commenting out the for loop while leaving the array initialized doesn't affect the memory requirement.


Code: [Select]


// ASSIGNED PINS (AT-MEGA 328P)
#define redRelay 10  //output to red LED relay
#define blueRelay 9  //output to blue LED relay
#define SIREN 5     //output to siren amplifier input
#define redFBK 3    //output to red feedback LED
#define yelFBK 8    //output to Yellow feedback LED
#define grnFBK 4    //output to green feedback LED
#define PIEZO 6     //output to Piezo circuit   

//create an array of output pins
const int outPIN_COUNT = 7;
int outPins[outPIN_COUNT] = {3, 4, 5, 6, 8, 9, 10}; //output pin array


void setup()
{
  //set pin states
  //(both methods are shown; only one is compiled at a time, the other is commented out)

  //this method uses 14 bytes more memory
  for (int index = 0; index < outPIN_COUNT; index++) {
    pinMode(outPins[index], OUTPUT);
  }

  //this method uses 14 bytes less memory
  pinMode(redRelay, OUTPUT);  //to red LED relay
  pinMode(blueRelay, OUTPUT); //to blue LED relay
  pinMode(SIREN, OUTPUT);     //to siren amplifier input
  pinMode(PIEZO, OUTPUT);     //piezo
  pinMode(redFBK, OUTPUT);    //feedback Red LED
  pinMode(yelFBK, OUTPUT);    //feedback Yellow LED
  pinMode(grnFBK, OUTPUT);    //feedback Green LED
}

void loop()
{
}

Personal bytes are finite, save room for play.

AWOL

Pin numbers are eight bit values - why use an int array?

Why use RAM at all?
"Pete, it's a fool (who) looks for logic in the chambers of the human heart." Ulysses Everett McGill.
Do not send technical questions via personal messaging - they will be ignored.
I speak for myself, not Arduino.

DrAzzy

Flash or RAM?

The compiler may see that you never use the array and, determining that it could not have any effect on the program since you never access it, optimize it away.

The next step in the investigation, IMO, would be to use the array and do like

pinMode(outPins[0],OUTPUT);
...
pinMode(outPins[6],OUTPUT);

and see where that falls on memory usage.
ATtiny core for 841+1634+828 and x313/x4/x5/x61/x7/x8 series Board Manager:
http://drazzy.com/package_drazzy.com_index.json
ATtiny breakouts (some assembled), mosfets and awesome prototyping board in my store http://tindie.com/stores/DrAzzy

CopperDog

The intent of using an array is to allow fast and easy pin reassignments without changing a bunch of code in the sketch. The fact that it's an int array doesn't seem to matter to the memory requirement. I can take out the array and the RAM required is the same. I'm curious as to why.

Quote from DrAzzy: The next step in the investigation, IMO, would be to use the array and do like pinMode(outPins[0],OUTPUT); ... pinMode(outPins[6],OUTPUT); and see where that falls on memory usage.

Well, that's what I'm getting at.

Both methods are in the code block I posted here, but only one is used at a time; the other is commented out during compile. If I compile with the array loop in use, it costs 14 bytes more than compiling with the individual pinMode(pin,OUTPUT); calls and the array method commented out.

AWOL

Enjoy
Code: [Select]
//#define TABLE

#ifdef TABLE
const byte pins [] = { 2, 3, 4, 5, 6, 7};
const byte nPins = sizeof pins / sizeof pins [0];
#else
#define pinA 2
#define pinB 3
#define pinC 4
#define pinD 5
#define pinE 6
#define pinF 7

#endif

void setup()
{
#ifdef TABLE
  for (byte i = 0; i < nPins; i++) {
    pinMode (pins [i], OUTPUT);
    digitalWrite (pins [i], LOW);
  }
#else
  pinMode (pinA, OUTPUT);
  digitalWrite (pinA, LOW);
  pinMode (pinB, OUTPUT);
  digitalWrite (pinB, LOW);
  pinMode (pinC, OUTPUT);
  digitalWrite (pinC, LOW);
  pinMode (pinD, OUTPUT);
  digitalWrite (pinD, LOW);
  pinMode (pinE, OUTPUT);
  digitalWrite (pinE, LOW);
  pinMode (pinF, OUTPUT);
  digitalWrite (pinF, LOW);
#endif
}

void loop()
{
#ifdef TABLE

#else

#endif
}

CopperDog

#5
Mar 30, 2019, 09:15 pm Last Edit: Mar 30, 2019, 09:16 pm by CopperDog
Hmm, not a huge savings, but your method does save 4 bytes over the original array method I used. The cost is simply in the datatype used in the array.

By taking your suggestion to use byte rather than int in the pin array I shaved off 8 bytes.

I'll stick with the array method, however, which needs fewer lines of code.

It made me assess the entire sketch and swap in less memory-intensive data types where possible. That shaved 2% off dynamic memory use.

Thanks for the input.

Jiggy-Ninja

The extra RAM usage is because in the for loop you are accessing the array with a value that is not a compile-time constant (outPins[index]). The compiler has no way of predicting what this value is going to be, so in order for it to work properly it has to load the entire array into RAM and do the usual pointer arithmetic to get the proper value.

If you access the array with a compile time constant (outPins[3]), the compiler knows exactly what element you're referencing and can use that directly instead of having to perform memory access.

The challenge then, is how can you loop through an array accessing it with just compile-time constants. The short answer is that you can't. The longer answer is that while you can't do it with a loop, you can do it with template functions and recursion.
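
As a rough host-side illustration of the difference between the two kinds of index (not Arduino code, and not taken from the original sketch): the array below is declared constexpr, which is an assumption made here so the compiler can be asked to prove the point with static_assert. The helper name pinAt is hypothetical.

```cpp
#include <cassert>

// Same pin list as in the thread; constexpr is an assumption made for this demo.
constexpr int outPIN_COUNT = 7;
constexpr int outPins[outPIN_COUNT] = {3, 4, 5, 6, 8, 9, 10};

// Constant index: the compiler knows the element while compiling,
// so it can use the value directly without reading the table at run time.
static_assert(outPins[3] == 6, "constant index is resolved at compile time");

// Run-time index (hypothetical helper): the value of index is not known
// while compiling, so in general the table has to exist in memory to be read.
// No bounds checking is done; the caller must pass 0..outPIN_COUNT-1.
int pinAt(int index) {
  return outPins[index];
}
```

Whether a particular compiler keeps or removes the table in a real sketch depends on its optimization settings, so treat this only as a picture of the idea.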

WARNING: advanced content ahead!
Code: [Select]
const int outPIN_COUNT = 7;
const int outPins[outPIN_COUNT] = {3, 4, 5, 6, 8, 9, 10}; //output pin array

void setup() {
  // put your setup code here, to run once:
  setPinModes<outPIN_COUNT>(outPins, INPUT);
 // for( int i=0; i<outPIN_COUNT; ++i) pinMode(outPins[i], INPUT);
}

template<byte N>
void setPinModes(const int* pinArray, byte modeToSet)
{
  setPinModes<N-1>(pinArray, modeToSet);
  pinMode(pinArray[N-1], modeToSet);
}

template<>
void setPinModes<0> (const int* pinArray, byte modeToSet)
{
  // empty to terminate the recursion.
}

void loop() {
  // put your main code here, to run repeatedly:

}

Compiling this sketch uses 9 bytes of global memory, exactly the same as a blank one. Exchanging the recursive template function for the commented-out loop adds 14 bytes of RAM and even uses 2 more bytes of program space (626 -> 628).

Template parameters (in the <> angle brackets) must be compile time constants, so the compiler will always know exactly how far this function will recurse. The compiler also knows exactly what value is being accessed each time, so it's able to optimize away all the code required to access the array and use the values directly.

This is proven out by the actual assembly code generated by the compiler, which just loads the pin values directly and calls pinMode with each of them in sequence.
Code: [Select]

 236: 83 e0       ldi r24, 0x03 ; 3
 238: 0e 94 66 00 call 0xcc ; 0xcc <pinMode.constprop.6>
 23c: 84 e0       ldi r24, 0x04 ; 4
 23e: 0e 94 66 00 call 0xcc ; 0xcc <pinMode.constprop.6>
 242: 85 e0       ldi r24, 0x05 ; 5
 244: 0e 94 66 00 call 0xcc ; 0xcc <pinMode.constprop.6>
 248: 86 e0       ldi r24, 0x06 ; 6
 24a: 0e 94 66 00 call 0xcc ; 0xcc <pinMode.constprop.6>
 24e: 88 e0       ldi r24, 0x08 ; 8
 250: 0e 94 66 00 call 0xcc ; 0xcc <pinMode.constprop.6>
 254: 89 e0       ldi r24, 0x09 ; 9
 256: 0e 94 66 00 call 0xcc ; 0xcc <pinMode.constprop.6>
 25a: 8a e0       ldi r24, 0x0A ; 10
 25c: 0e 94 66 00 call 0xcc ; 0xcc <pinMode.constprop.6>

Enjoy!
Hackaday: https://hackaday.io/MarkRD
Advanced C++ Techniques: https://forum.arduino.cc/index.php?topic=493075.0

Montmorency

#7
Apr 20, 2019, 08:15 pm Last Edit: Apr 20, 2019, 08:22 pm by Montmorency
Why does using an array to set pin states use 14 bytes more memory than setting each pin individually ?
Incidentally, I just answered this question in an unrelated thread
https://forum.arduino.cc/index.php?topic=611227.msg4144656#msg4144542

When you use immediate values or declare individual `#define`, `enum` or `const` objects for your pin numbers, the compiler treats them as compile-time constants for embedding directly into the machine instructions. These constants normally never "materialize" in data memory and do not occupy data memory.

But when you declare an array (and a non-constant one! and with external linkage!!!) in your version of the code the compiler has no choice but to "materialize" this array in memory. These are your extra 14 bytes: an array of 7 `int` values.

If you declare your array as `const` or `constexpr`

Code: [Select]

const int outPins[outPIN_COUNT] = {3, 4, 5, 6, 8, 9, 10};


and crank up the optimization options of the compiler, there is a chance that the compiler will unroll the loop and ultimately discard this array.

But Arduino IDE normally passes different optimization settings to avr-gcc compiler. And it might be quite right in doing that. Note that in the array version what you are losing in data memory, you are almost regaining in code memory - the code size is 8 bytes smaller. In larger programs this effect might be skewed in favor of the array version even more.
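
The 14-byte figure is just 7 elements times 2 bytes per element. A small host-side check of that arithmetic, assuming int16_t as a stand-in for the 16-bit int of the ATmega328P (on a PC, plain int is usually 32 bits). The names avrInt, pinsAsInt, pinsAsByte and checkTableSizes are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: int16_t models the 16-bit int of avr-gcc on a host PC.
typedef int16_t avrInt;

const int outPIN_COUNT = 7;

// The same pin table stored two ways.
const avrInt  pinsAsInt[outPIN_COUNT]  = {3, 4, 5, 6, 8, 9, 10};
const uint8_t pinsAsByte[outPIN_COUNT] = {3, 4, 5, 6, 8, 9, 10};

// Checked by the compiler: 7 * 2 = 14 bytes versus 7 * 1 = 7 bytes.
static_assert(sizeof(pinsAsInt) == 14, "7 two-byte elements occupy 14 bytes");
static_assert(sizeof(pinsAsByte) == 7, "7 one-byte elements occupy 7 bytes");

// Hypothetical run-time version of the same check.
void checkTableSizes() {
  assert(sizeof(pinsAsInt) == outPIN_COUNT * sizeof(avrInt));
  assert(sizeof(pinsAsByte) == outPIN_COUNT * sizeof(uint8_t));
}
```

This only accounts for the table itself; the totals reported by the IDE also depend on what the compiler does with the loop.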

CopperDog

Excellent answers from both Jiggy-Ninja and Montmorency, thanks much.

Intuitively I felt it was associated with indexing the array, but I didn't see how that made the difference. Very good explanations. I'll certainly be considering a template any time an array is needed from now on.

dougp

Compiling this sketch uses 9 bytes of global memory, exactly the same as a blank one.
Compiler (IDE v1.8.2) gives this error

Code: [Select]

template<>
void setPinModes<0> (const int* pinArray, byte modeToSet)
//
//               ^

//exit status 1
//expected initializer before '<' token



Possibly a version issue?
Everything we call real is made of things that cannot be regarded as real.  If quantum mechanics hasn't profoundly shocked you, you haven't understood it yet. - Niels Bohr

No private consultations undertaken!

oqibidipo

#10
Apr 21, 2019, 07:47 am Last Edit: Apr 21, 2019, 07:52 am by oqibidipo
Works OK in IDE 1.8.9.

Prototype generation fails in older versions with multiline templates. Put them on the same line:

Code: [Select]
template<byte N> void setPinModes(const int* pinArray, byte modeToSet)
{ ...
}

template<> void setPinModes<0> (const int*, byte)
{
  // parameter names omitted to avoid "unused parameter" warning
  // empty to terminate the recursion.
}
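
For anyone wanting to see what the recursion actually does, here is a sketch that can be built on a PC. It is not the Arduino API: pinMode is a stub that only records its argument, OUTPUT_MODE and configuredPins are made-up names, and uint8_t is used in place of Arduino's byte. The templates are placed before setup() because a desktop compiler does not generate prototypes the way the Arduino IDE does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Arduino's OUTPUT constant.
const uint8_t OUTPUT_MODE = 1;

// Log of the pins passed to the stub, in call order.
std::vector<int> configuredPins;

// Stub replacing Arduino's pinMode(): records the pin instead of touching hardware.
void pinMode(int pin, uint8_t mode) {
  assert(mode == OUTPUT_MODE);
  configuredPins.push_back(pin);
}

const int outPIN_COUNT = 7;
const int outPins[outPIN_COUNT] = {3, 4, 5, 6, 8, 9, 10};

// General case: handle the first N-1 pins, then pin number N-1.
template<uint8_t N> void setPinModes(const int* pinArray, uint8_t modeToSet)
{
  setPinModes<N - 1>(pinArray, modeToSet);
  pinMode(pinArray[N - 1], modeToSet);
}

// N == 0: nothing left to do, which ends the recursion.
template<> void setPinModes<0>(const int*, uint8_t)
{
}

void setup() {
  // Expands into setPinModes<7>, <6>, ... <0> while compiling.
  setPinModes<outPIN_COUNT>(outPins, OUTPUT_MODE);
}
```

Calling setup() from a test main() and inspecting configuredPins shows the pins being handled from the first array element to the last, since each level recurses before making its own pinMode call.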

dougp

Prototype generation fails in older versions with multiline templates. Put them on the same line:
That's the ticket!  Karma++.  I seem to remember this happening also when putting return type and function definition on separate lines.  Never knew why.

Now, to work out just what is happening here.

Montmorency

I'll certainly be considering a template any time an array is needed from now on.
Well, ahem... You see, when Arduino IDE invokes GCC compiler, it configures it with `-Os` flag, which means "optimize for size". It directs GCC to generate the smallest possible code, even at the expense of the code's performance.

The fact that GCC in this case did not unroll the loop and did not eliminate the array by itself is likely a consequence of that `-Os` flag. Indeed, in this case a genuine loop (which in turn requires a genuine array in memory) produces smaller code.

And this is done for a reason: saving code space is as important on Arduino as saving data space.

By using this template technique you are actually explicitly blocking the compiler from generating a loop and forcing it to generate the [larger] "unrolled" code. This is a valuable technique, but don't just use it thoughtlessly and unconditionally "any time". You might end up losing more than you gained. That `-Os` is there for a reason.

PaulMurrayCbr

Why does using an array to set pin states use 14 bytes more memory than setting each pin individually ?
Because using constants compiles down to a short machine-language instruction. Using an array means that the compiler has to code up a loop, and then a fetch instruction inside the loop.

Oh, and you are using 16-bit ints for the array elements.
http://paulmurraycbr.github.io/ArduinoTheOOWay.html

Montmorency

#14
Apr 22, 2019, 04:13 pm Last Edit: Apr 22, 2019, 04:14 pm by Montmorency
Because using constants compiles down to a short machine-language instruction. Using an array means that the compiler has to code up a loop, and then a fetch instruction inside the loop.
Using a `for` loop in the code is what makes the compiler "code up a loop". Using an array by itself does not produce any loops.

As the examples in this thread show, it is possible to use an array and yet make the optimizing compiler completely discard it, producing the very same "short machine-language instructions" as in the non-array version.
