I was wondering: how does the compiler handle procedures/functions?
What I mean by this:
Let's say you have a piece of code that gets called multiple times in a program, and you decide to turn that code into a function/procedure (depending on what the code does, obviously). Will the compiler store that piece of code once and jump to that location whenever it needs to be executed, or will it insert the same piece of code every time it needs to run?
I know the logical answer is that it jumps to the exact same piece of code that's stored once in flash, executes it and returns, but it looks like it doesn't work that way.
Since resources are limited I'd like to code as efficiently as possible, but I get the feeling that each time I call a procedure/function, more code than just a jump/return is added to the compiled binary.
Am I just misjudging this, or is this actual compiler behaviour, and if so, why?
It depends. You are describing the classic behaviour, but for performance reasons the compiler will sometimes inline the function and just put the code in at the call site.
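For example, something along these lines (just a sketch, assuming avr-gcc with optimization enabled; the attributes are standard GCC extensions):

#include <stdint.h>

/* Small body: at -O1/-Os and above the optimizer will usually copy
   this straight into each caller instead of emitting a CALL/RET pair. */
static inline uint8_t twice(uint8_t x)
{
    return (uint8_t)(x * 2);
}

/* Forcing the classic behaviour: one copy in flash, reached by CALL
   and left by RET, no matter how small the body is. */
static uint8_t __attribute__((noinline)) twice_called(uint8_t x)
{
    return (uint8_t)(x * 2);
}

uint8_t demo(uint8_t a, uint8_t b)
{
    return twice(a) + twice_called(b);
}

Comparing the disassembly of the two is the quickest way to see which decision the compiler actually made for your code.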
Yes, you have to save state on the stack when calling and pop it on return. Compilers are free to optimize function calls however they want, and they typically handle leaf functions more efficiently (a leaf function is one that doesn't itself call any other functions).
These days calling conventions usually split the machine registers into two groups: one to be saved/restored by the callee (the function being called) and the other to be saved by the caller. The callee only needs to save the callee-saved registers it actually wants to use, and the caller only needs to save the caller-saved registers holding values it still needs after the call. Thus if you have functions with few or no local variables, there is little if any stack traffic involved other than the return addresses.
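A tiny sketch of what that means in practice (assuming avr-gcc, and assuming the call isn't simply inlined away; the exact register split is up to the ABI):

#include <stdint.h>

/* Leaf function: it calls nothing, so its working values can live
   entirely in caller-saved registers and the only stack traffic is
   the return address pushed by the CALL itself. */
uint16_t scale(uint16_t x)
{
    return (uint16_t)((x << 2) + 3);
}

/* Non-leaf function: anything still needed after the call to scale()
   has to sit in a callee-saved register (saved/restored once in the
   prologue/epilogue) or be spilled to the stack around the call. */
uint16_t combine(uint16_t a, uint16_t b)
{
    uint16_t t = scale(a);
    return (uint16_t)(t + scale(b));
}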
Note that when compiling a library the compiler cannot optimize as much, because the calling convention has to work with programs compiled separately. If you recompile everything at once you can do more optimization, since calling conventions can be tuned differently for different functions (the split between callee-saved and caller-saved registers can be varied, for one thing).
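Roughly what that looks like at the source level (a sketch; static is plain C, and whole-program builds via GCC's -flto option extend the same freedom across files, assuming a reasonably recent avr-gcc):

#include <stdint.h>

/* Nothing outside this file can call clamp(), so the compiler owes it
   no fixed calling convention: it can inline it, keep the arguments
   in whatever registers it likes, or drop it entirely if unused. */
static uint8_t clamp(uint8_t v, uint8_t max)
{
    return (v > max) ? max : v;
}

uint8_t read_scaled(uint8_t raw)
{
    return clamp((uint8_t)(raw / 2), 100);   /* very likely inlined outright */
}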
If I'm understanding this correctly, the compiler's behaviour depends on the stack depth used by the procedure/function that gets called?
If too much data would need to get pushed onto the stack, it decides to inline the code, correct?
I've had a quick look at the datasheet of the ATmega328P, and from what I've seen the stack just lives in SRAM and isn't limited to a certain depth, as long as there's enough SRAM to hold it.
So when does the compiler decide to inline the code?
I'm using 24% of my SRAM, so there's plenty of space left for quite some stack depth, yet it still decides to inline it. Why?
Also, when you say "for performance reasons", what exactly do you mean?
How much overhead can a jump/return cause on the ATmega?
Isn't it supposed to be single-clock-cycle instruction execution?
(BTW, I'm coming from an x86 assembly background, in case you hadn't figured that out yet.)
@Groove: of course it's just in SRAM; bad choice of words on my part, I guess. The point was that its depth (size) isn't limited in any way, except by the available SRAM, according to the datasheet. So why not use more SRAM for stack space instead of using more flash by inlining?
Not that it really matters much to me, it's just unexpected compiler behaviour IMO. Then again, I'm using compilers from way back in the day as the reference for my assumptions.
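In case it helps anyone else reading along, something like this seems to get the "stored once in flash" behaviour back on a per-function basis (just a sketch, assuming avr-gcc and the GCC noinline attribute; I haven't measured the trade-off myself):

#include <stdint.h>

/* Ask for a real CALL/RET instead of inlining: one copy in flash,
   a little more stack use and a few extra cycles per call. */
static uint8_t __attribute__((noinline)) checksum(const uint8_t *p, uint8_t n)
{
    uint8_t sum = 0;
    while (n--)
        sum += *p++;
    return sum;
}

uint8_t check_two(const uint8_t *a, const uint8_t *b, uint8_t n)
{
    return (uint8_t)(checksum(a, n) ^ checksum(b, n));
}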