IMHO the sram usage would be less when using pointers, but are there any speed improvements ?
Instinctively I would have said the contrary, that the pointer version is worse in all aspects. But instincts don't really count for anything in optimisation, so I spent a little time on the problem.
My first conclusion is that it isn't easy to write a small test program that the compiler won't optimise away. In the end I used this code:
void hubba(unsigned char* chr){
  delay (10);
  *chr = laps & 0xf3;
}
unsigned char bubba(){
  delay (14);
  return laps & 0xf5;
}
...
unsigned char c;
hubba (&c);
Serial.println (c);
c = bubba ();
Serial.println (c);
The delay and the Serial.println are used to stop the compiler from getting too smart.
The functions themselves now look like this in assembler:
unsigned char bubba(){
delay (14);
118: 6e e0 ldi r22, 0x0E ; 14
11a: 70 e0 ldi r23, 0x00 ; 0
11c: 80 e0 ldi r24, 0x00 ; 0
11e: 90 e0 ldi r25, 0x00 ; 0
120: 0e 94 b0 01 call 0x360 ; 0x360 <delay>
124: 80 91 36 01 lds r24, 0x0136
return laps & 0xf5;
}
128: 85 7f andi r24, 0xF5 ; 245
12a: 08 95 ret
The result comes back in register r24 rather than via RAM, which keeps the code quite small.
void hubba(unsigned char* chr){
12c: 0f 93 push r16
12e: 1f 93 push r17
130: 8c 01 movw r16, r24
delay (10);
132: 6a e0 ldi r22, 0x0A ; 10
134: 70 e0 ldi r23, 0x00 ; 0
136: 80 e0 ldi r24, 0x00 ; 0
138: 90 e0 ldi r25, 0x00 ; 0
13a: 0e 94 b0 01 call 0x360 ; 0x360 <delay>
*chr = laps & 0xf3;
13e: 80 91 36 01 lds r24, 0x0136
142: 83 7f andi r24, 0xF3 ; 243
144: f8 01 movw r30, r16
146: 80 83 st Z, r24
}
148: 1f 91 pop r17
14a: 0f 91 pop r16
14c: 08 95 ret
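The code is longer (34 bytes against 20 for bubba) and takes more cycles. A big part of that, I guess, comes from keeping the pointer safe across the call to delay: r16 and r17 are pushed on the stack, the pointer is parked in them, and both are popped again at the end. If the functions were more complicated, I guess the first version would need the same stack traffic, so they'd end up pretty even.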
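For anyone who wants to repeat the comparison without the Arduino core, here is a minimal sketch of the two functions that should survive the optimiser without delay and Serial.println. It rests on two assumptions of mine that are not in the test above: laps is declared volatile (its real declaration isn't shown), and the functions carry the GCC/Clang noinline attribute so they stay real calls.

```cpp
#include <cstdint>

// Assumption: the real declaration of laps is not shown in the post.
// volatile forces an actual load from RAM on every access.
volatile std::uint8_t laps = 0xff;

// noinline is a GCC/Clang extension; it keeps each function as a real call
// so the two return conventions stay visible in the disassembly.
__attribute__((noinline)) void hubba(std::uint8_t* chr) {
    *chr = laps & 0xf3;   // result is stored through the pointer
}

__attribute__((noinline)) std::uint8_t bubba() {
    return laps & 0xf5;   // result is handed back in a register
}
```

Compiling this with optimisation enabled and looking at the disassembly should show the same kind of difference as the listings above, though the exact instructions depend on compiler version and target.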
My conclusion is that returning by value has a lot of potential benefits, such as return values in registers, inlining, keeping local variables in registers and better readability, and I wasn't really able to find any downsides. At worst, it will be as efficient as the version with pointers.
The other question to ask is what the point of optimising parameter passing is. If the call overhead is relevant and worth optimising, the handling of the stack frames and the function call itself will cost a lot more than the return value handling. You will probably need to get rid of the function or inline it. And if the call overhead isn't a problem, optimising the return value passing is a waste of time.
So when is passing by reference preferable? Here we come back to the problem of creating a meaningful test situation. For simple variable types the pointer overhead only makes things worse. The compiler handles them in registers and saves the processor all the loading and storing. But as soon as you move to bigger return values that need to be passed back on the stack and copied around, for example structs holding arrays or classes, the pointer might become more efficient and use less RAM. While I doubt the speed difference will matter much, the benefit of using less RAM can become critical.
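To make the "bigger return values" case concrete, here is a small sketch of the two styles for a value that no longer fits in registers. The Samples struct, its 32-byte size and both function names are made up for illustration. Whether the by-value version really costs an extra copy depends on the compiler, so this is something to measure and not a claim about what avr-gcc does.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical payload, chosen to be too big for the return registers.
struct Samples {
    std::uint8_t data[32];
};

// Return by value: the result is built in a local object and returned.
// The compiler may construct it directly in the caller's storage or copy it.
Samples readByValue(std::uint8_t seed) {
    Samples s;
    for (std::size_t i = 0; i < sizeof s.data; ++i) {
        s.data[i] = static_cast<std::uint8_t>(seed + i);
    }
    return s;
}

// Pass by pointer: the caller's buffer is filled in place, no second object.
void readInto(Samples* out, std::uint8_t seed) {
    for (std::size_t i = 0; i < sizeof out->data; ++i) {
        out->data[i] = static_cast<std::uint8_t>(seed + i);
    }
}
```

Both functions produce the same contents. The only difference is where the 32 bytes live while they are being filled, which is exactly the RAM question raised above.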
My final answer to the question is thus: it depends, and don't make your program more complicated for a performance gain you can't measure.
Korman