How much free SRAM does one need?

When compiling code, the Arduino IDE shows how much SRAM is taken up by global variables, with the remainder left over for local variables and function calls. How can one assess what is an appropriate amount of free memory to have for these purposes?

I'm trying to develop a piece of code for an ATtiny84, and as that only has 512 bytes of SRAM I've rather less to work with than the ATmega328P's 2 KB. So I want to know, as I'm developing the code, what to keep an eye on. I'm sure I can keep the global variables under the limit, but I don't know by how much yet.

I also hear that on the ATmega328P at least 128 bytes of SRAM must be kept free as a safety margin. If I have to keep 128 bytes free, on top of those needed for local variables and function calls, then 512 bytes doesn't leave much.

I understand that properly knowing the peak SRAM requirement comes pretty close to needing to be able to beat some of Turing's original hypotheses, "does the machine stop" and all that. But there must be ways to get a good feeling for how much free SRAM is needed.

Let's say that you make use almost exclusively of global variables*, and any locals are nothing but a few uint8_t variables scattered in one or two places in functions. Are you then in a situation where, so long as the amount of free memory available is greater than the number of uint8_t locals that exist throughout the whole program**, one will be fine?

*this way they get counted for the stats at compile time, rather than being surprises which happen when running the code

**being clear to state that these locals are NOT all in the same scope, so there is no logical place in a program where you could actually need all of them at once. Hence would (free SRAM) > (total SRAM needed for all local variables, even though they don't all get used at once) guarantee stability?

SRAM bytes are apparently also needed for function calls, but to what extent? Is it just one byte per layer of function call? Say your deepest nesting is that void setup() calls uint8_t StartCalibration(uint8_t input1, uint16_t input2), and StartCalibration may make a call to void ToggleThePins(): is three bytes of SRAM all this would take, for three nested functions? Or does the SRAM required depend on the return types of the functions and the number of bytes required for the arguments they take?

Is there anything else that requires SRAM bytes to be free for use? I can count up global variables, local variables and function calls simply by Ctrl+F-ing for type names. Is there anywhere that things take up space in SRAM without easily searchable defining lines like these? Assume no use of malloc, and that all arrays use a compile-time constant for their size.

If (global variables)+(total bytes of all local variables)+(maximum level of nesting of functions, even if you have an interrupt going on whilst another function is deeply nested) is less than the total SRAM available, then is everything always fine?

Failing all else, if one writes code to entirely use global variables, no locals called at all, then so long as the sum of the global variables bytes and the count of maximum nested function calls is less than the total the chip can provide, are things good?

Thank You

P.S. The code is not ready yet; I'm trying to get an understanding of this matter as early as possible. This way I can have rules of thumb to guide me in development, rather than trying to debug a problem when I find horrible weird crashes happening because certain actions in a program might consume all the SRAM.

I use this utility occasionally (on a NANO). There might be something similar for your ATTiny84, so maybe read through and pick up a few keywords to use in a search?

Sorry, can't be much help beyond that.

There’s no hard and fast ‘rule’, because it depends on the complexity and calling strategies used by each program.

I use 85% as a target, but anything over 90% is usually too risky.

You need to look at the stack & heap pointers to ensure they don’t collide.
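As a rough sketch of what "looking at the stack and heap pointers" can mean: the gap between the top of the heap (or the end of the globals, if malloc is never used) and the current stack position is the memory still free at that instant. The helper names gapBytes and freeRamNow are made up for illustration. The AVR part relies on the avr-libc symbols __heap_start and __brkval; referencing __brkval may pull avr-libc's malloc support into the build, which itself costs a few bytes, so treat this as a debugging aid rather than something to leave in.

```cpp
#include <stdint.h>

// Pure arithmetic: bytes between the top of the heap/globals and the stack
// pointer. Returns 0 if the two have already met or crossed.
static inline uint16_t gapBytes(uint16_t heapTop, uint16_t stackPointer) {
  return stackPointer > heapTop ? (uint16_t)(stackPointer - heapTop) : 0;
}

#if defined(__AVR__)
extern char __heap_start;  // avr-libc linker symbol: first address after .bss
extern char *__brkval;     // avr-libc malloc: current heap top, 0 if malloc was never used

// Free bytes at the moment of the call (hypothetical helper name).
// It only reports the gap *now*, not the worst case over the whole run.
uint16_t freeRamNow() {
  char marker;  // a local, so its address sits close to the current stack pointer
  uint16_t heapTop = (uint16_t)(__brkval ? __brkval : &__heap_start);
  return gapBytes(heapTop, (uint16_t)&marker);
}
#endif
```

Calling freeRamNow() from inside the most deeply nested function gives a more useful number than calling it from setup(), since the stack is at its deepest there.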

I use as few global symbols as possible and I use the "F" macro for all printing when possible. I avoid strings other than in print statements if at all possible. Local variables are allocated when the function is called and released when the function is exited, so that memory is then available to other functions. I also use byte variables when possible. Hope this helps.

It does not look like that link has been updated since I reported an issue with it to Adafruit. See Classic Nano has too much RAM

Life is complicated and I have only scratched the tip of the iceberg :wink:

Your StartCalibration has two arguments, a uint8_t and a uint16_t; that is 3 bytes. In theory those will be placed on the stack, but the processor has a number of registers where the compiler can place those arguments; I suspect that those registers will be used unless they are all already in use.

The call to StartCalibration will push the program counter (PC) onto the stack so the processor knows where to continue once the function is finished; count 2 bytes for that (on a 32-bit processor 4 bytes, on a processor with only 256 bytes of flash it might be 1 byte).

From StartCalibration you call ToggleThePins(); no parameters but the PC will be pushed on the stack. So two bytes extra.

The problem in determining the memory usage comes when, e.g., those two functions are only called from setup(). avr-gcc can be extremely clever, and function calls can be replaced by jump instructions which don't push the PC.

There is probably a lot more to this but above is basically what I think I know.

Notes:

  1. Pushing the PC on the stack is a hardware functionality.
  2. When calling core functions (digitalWrite, Serial.print etc), those functions can call other functions under the hood.
  3. If you want to analyse further, you can run avr-objdump -h -S pathTo/yourSketch.ino.elf and analyse the resulting output; be prepared, very very very prepared. As mentioned at the top, "life is complicated" :smiley:
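To make the arithmetic above concrete, here is a small host-side sketch (meant to be run on a PC or done on paper, not put in the sketch itself) that adds up one call chain. The struct FrameEstimate and the function chainBytes are hypothetical names, and every number in the table is an assumption to be checked against the lst file: 2 bytes of return address per real call, and nothing for arguments or locals that stay in registers.

```cpp
#include <stdint.h>
#include <stddef.h>

// One stack frame in a hand-made estimate. These are guesses to be verified
// against the listing, not values reported by the compiler.
struct FrameEstimate {
  const char *name;        // function name, for readability only
  uint8_t returnAddress;   // 2 bytes for a real call on these parts, 0 if inlined
  uint8_t savedRegisters;  // one byte per push in the function's prologue
  uint8_t locals;          // locals that did not fit in registers
};

// Sum the frames of one call chain, outermost to innermost.
uint16_t chainBytes(const FrameEstimate *chain, size_t count) {
  uint16_t total = 0;
  for (size_t i = 0; i < count; i++) {
    total += chain[i].returnAddress + chain[i].savedRegisters + chain[i].locals;
  }
  return total;
}

// The chain from the question, assuming nothing is inlined and nothing spills
// to the stack. main() calling setup() is counted in the "setup" entry.
const FrameEstimate calibrationChain[] = {
  {"setup",            2, 0, 0},
  {"StartCalibration", 2, 0, 0},
  {"ToggleThePins",    2, 0, 0},
};
```

With these assumed numbers chainBytes(calibrationChain, 3) comes to 6 bytes; any core functions called from the innermost function would add further entries to the table.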

I've just compiled the standard blink example for an ATtiny84. If you "export compiled binary", you will also get the lst file (as with avr-objdump -h -S pathTo/yourSketch.ino.elf).

The result is relatively simple; below is loop()

// the loop function runs over and over again forever
void loop()
{
  digitalWrite(LED_BUILTIN, HIGH);  // turn the LED on (HIGH is the voltage level)
 1bc:	81 e0       	ldi	r24, 0x01	; 1
 1be:	96 df       	rcall	.-212    	; 0xec <digitalWrite.constprop.0>
C:\Users\Wim\Documents\Arduino\Forums\1172889\Blink_ATtiny84/Blink_ATtiny84.ino:36
  delay(1000);                      // wait for a second
 1c0:	67 df       	rcall	.-306    	; 0x90 <delay.constprop.1>
C:\Users\Wim\Documents\Arduino\Forums\1172889\Blink_ATtiny84/Blink_ATtiny84.ino:37
  digitalWrite(LED_BUILTIN, LOW);   // turn the LED off by making the voltage LOW
 1c2:	80 e0       	ldi	r24, 0x00	; 0
 1c4:	93 df       	rcall	.-218    	; 0xec <digitalWrite.constprop.0>
C:\Users\Wim\Documents\Arduino\Forums\1172889\Blink_ATtiny84/Blink_ATtiny84.ino:38
  delay(1000);                      // wait for a second
 1c6:	64 df       	rcall	.-312    	; 0x90 <delay.constprop.1>
 1c8:	f9 cf       	rjmp	.-14     	; 0x1bc <main+0x20>

From this snippet, note that the end of loop() does not contain a ret instruction to return to main() but an rjmp instruction to return to the beginning of loop() (address 0x1bc).

I'll leave the rest up to you.

Count not only your user-defined functions, but also all the functions used in the background within the Arduino framework.

If you show your current code (it should compile), some might give you hints on where and how to save SRAM.

In terms of functions then, I have to count the maximum number which can ever be called in a nested manner? So two independent functions which don't call each other would each account for only 2 bytes when called (assuming the program counter needs 2 bytes, as there are more than 256 bytes of memory), whereas a function which calls another would account for 4 bytes in this manner.

And these counts come out of the free space; they aren't already accounted for at compile time when the IDE adds up the bytes used by global variables?

As for functions defined in the Arduino libraries, I assume only those I actually call, or which get called by other functions in the Arduino core libraries, matter. If I don't use Serial for anything, it will be ignored by the compiler? Other functions in those libraries, defined inside the same h files but never called in the sketch, won't account for any potential SRAM usage? So if I use the pinMode function, but then do all my pin stuff by direct port manipulation, never calling digitalWrite or digitalRead, I'd need to have 2 bytes of free SRAM for this, and another 2 bytes because it takes two uint8_t arguments, but I'd have no need for spare bytes for turnOffPWM or digitalWrite.

If one makes no use of delay() or millis(), but does use delayMicroseconds() and micros(), can one avoid the amount of SRAM needed to account for the millisecond timer, interrupt and other associated functionality?

When an array is passed in to a function, what then? Just one byte of extra SRAM taken up by a pointer to that array?

Does this vary by means of passing?
How does it go for

uint16_t DoSomething(uint8_t *s, byte HowLong){
  uint16_t value=0;
  //do stuff
  return value;
}

Which then gets called as:

uint8_t ArrayIn[40]={0}; //this is global
//various stuff done to elements in ArrayIn
uint16_t Output=DoSomething(ArrayIn,sizeof(ArrayIn));

As versus for functions defined and used as

uint8_t DataSend(uint8_t ToSend[], uint8_t ToSendLength){
 //do stuff
 return 1;
}
uint8_t NewMessageBytes[20]={0}; //this is global
uint8_t NewMessageLengthCount=sizeof(NewMessageBytes);
//do stuff
uint8_t validity=DataSend(NewMessageBytes,NewMessageLengthCount);

Also does memmove require any additional SRAM, beyond that already occupied by the arrays it is copying/filling?
memmove(ArrayInOld,ArrayIn,sizeof(ArrayInOld));

Also, global variables versus static variables used in one function. Which is really better? With the global "you know where you stand", whereas if the static local one gets created at a certain moment, is it not the case that everything "below" it on the heap at the moment of the static local's creation becomes inaccessible if later freed?

And when one uses interrupts, pin change for example, I assume that the worst case scenario one must account for becomes:
the program is running at the deepest point of nested functions, with all locals associated with those functions taking up SRAM as well as 2 bytes per function... when suddenly an interrupt fires, needing another 2 bytes(?) for the interrupt's own return address, plus bytes for any locals declared in the ISR, plus enough free bytes for any functions that interrupt might call?

If

(global variables)+2*(deepest number of nested layers of functions possible)+(total of locals declared in all functions in that chain of nesting)+(2bytes for an ISR)+(total locals in ISR)+2(if the ISR called only functions which don't have any others nested within them)+(total locals in those functions)

is less than the total bytes (512 for ATTiny84, programmed over icsp, no bootloader) of SRAM available,

everything will be ok?
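The inequality above can be written down as a small check. This is only a sketch of the bookkeeping: SramBudget and headroom are hypothetical names, the figures fed into it have to come from your own counting, and the ISR term should also cover the registers the ISR saves on entry, which the sum above does not list.

```cpp
#include <stdint.h>

// Hand-entered estimates for one program; none of these are measured here.
struct SramBudget {
  uint16_t totalSram;     // 512 on an ATtiny84
  uint16_t globals;       // the "global variables" figure reported by the IDE
  uint16_t deepestChain;  // return addresses + pushes + locals of the deepest call chain
  uint16_t isrCost;       // ISR return address + saved registers + locals + anything it calls
};

// Bytes left over at the estimated worst moment; a negative result means the
// stack would run into the globals.
int32_t headroom(const SramBudget &b) {
  return (int32_t)b.totalSram -
         ((int32_t)b.globals + (int32_t)b.deepestChain + (int32_t)b.isrCost);
}
```

For example, with assumed figures of 300 bytes of globals, a 40-byte deepest chain and a 25-byte ISR on a 512-byte part, headroom() gives 147 bytes. If two different ISRs can nest or fire back to back at the deepest point, their costs would need to be added as well.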

Thank you everyone

Yes

Yes

Indeed; the compiler just compiles; it does not do an analysis of the nesting.

If you use Serial.print() but don't use Serial.read(), only Serial.print() will be in the final executable.

The pointer will be two bytes in your case.
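A minimal sketch of why the two ways of writing the parameter cost the same: in C++ an array parameter is adjusted to a pointer, so neither form copies the array. The function names below are invented for illustration. The fact that one function-pointer type accepts both functions shows the signatures are identical.

```cpp
#include <stdint.h>

// Pointer syntax.
uint8_t sumViaPointer(uint8_t *data, uint8_t length) {
  uint8_t sum = 0;
  for (uint8_t i = 0; i < length; i++) sum += data[i];
  return sum;
}

// Array syntax; the compiler treats the parameter as uint8_t*, so only the
// address of the caller's array is passed, never a copy of its contents.
uint8_t sumViaArray(uint8_t data[], uint8_t length) {
  uint8_t sum = 0;
  for (uint8_t i = 0; i < length; i++) sum += data[i];
  return sum;
}

// One function-pointer type fits both, which compiles only because the two
// signatures are the same type.
typedef uint8_t (*ByteHandler)(uint8_t *, uint8_t);
ByteHandler handlers[2] = { sumViaPointer, sumViaArray };
```

Whether the pointer and the length actually land on the stack or stay in registers is up to the compiler, so the lst file is still the place to check.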

Not sure what you mean.

I do not know.

Static variables are not created when you call the function; they are like global variables and count in the total memory usage. Easy to test :wink:
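A small sketch of that test, with made-up names: the static local keeps its value between calls because its storage is reserved once, like a global, while the ordinary local exists only for the duration of the call.

```cpp
#include <stdint.h>

uint8_t globalCounter = 0;       // counted in the IDE's "global variables" figure

uint8_t nextId() {
  static uint8_t lastId = 0;     // also counted there: storage is reserved at link time
  uint8_t scratch = lastId + 1;  // ordinary local: register or stack, gone on return
  lastId = scratch;
  return lastId;
}
```

Adding or removing the static line and recompiling should change the reported global variable usage by one byte, provided the compiler does not optimise the variable away.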

ISRs are like functions. They will however have to save any memory (registers) that they want to use for themselves; so those get pushed on the stack. And if you use a local variable in there, it will count.

I suggest that you do a little bit of analysis of the lst file on a simple (but not too simple) sketch.

Check wiring_digital.c.
For an UNO, just as an example: https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/wiring_digital.c#L138
I count 5 variables for digitalWrite, and some additional function calls.

Regarding ISR

Using below code

const uint8_t intPin = 2;
const uint8_t ledPin = 3;

void setup()
{
  pinMode(intPin, INPUT_PULLUP);
  pinMode(ledPin, OUTPUT);

  attachInterrupt(digitalPinToInterrupt(intPin), theISR, FALLING);

}

void loop()
{
  // put your main code here, to run repeatedly:
}

void theISR()
{
  digitalWrite(ledPin, HIGH);
}

The start of the generated ISR is

ISR(EXTERNAL_INTERRUPT_0_vect)
{
  e2:	1f 92       	push	r1
  e4:	0f 92       	push	r0
  e6:	0f b6       	in	r0, 0x3f	; 63
  e8:	0f 92       	push	r0
  ea:	11 24       	eor	r1, r1
  ec:	2f 93       	push	r18
  ee:	3f 93       	push	r19
  f0:	4f 93       	push	r20
  f2:	5f 93       	push	r21
  f4:	6f 93       	push	r22
  f6:	7f 93       	push	r23
  f8:	8f 93       	push	r24
  fa:	9f 93       	push	r25
  fc:	af 93       	push	r26
  fe:	bf 93       	push	r27
 100:	ef 93       	push	r30
 102:	ff 93       	push	r31

Every push costs you one byte !! This excludes the push of the PC on the stack.
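By my count the prologue above contains 15 push instructions, so with the 2-byte return address that is 17 bytes of stack before the ISR body has done anything. Counting by hand gets tedious, so here is a host-side sketch (to run on a PC, not on the chip) that counts push mnemonics in a piece of listing text. The function names are hypothetical, and it assumes the text has the same whitespace-separated layout as the avr-objdump output shown here.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Count "push" mnemonics in a chunk of avr-objdump listing text.
// A match must be a whole whitespace-delimited word.
std::size_t countPushes(const std::string &listing) {
  std::size_t count = 0;
  std::size_t pos = 0;
  while ((pos = listing.find("push", pos)) != std::string::npos) {
    const std::size_t end = pos + 4;
    const bool startOk =
        pos == 0 || std::isspace(static_cast<unsigned char>(listing[pos - 1]));
    const bool endOk =
        end == listing.size() || std::isspace(static_cast<unsigned char>(listing[end]));
    if (startOk && endOk) ++count;
    pos = end;
  }
  return count;
}

// Stack bytes taken on ISR entry: the hardware-pushed return address plus one
// byte per push in the prologue text that was passed in.
std::size_t isrEntryBytes(const std::string &prologue,
                          std::size_t returnAddressBytes = 2) {
  return returnAddressBytes + countPushes(prologue);
}
```

This only covers the prologue you paste in; anything the ISR calls (digitalWrite in this example) has its own return address and possibly its own pushes.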

// Edit
I also found

0000004e <__vector_11>:
__vector_11():
C:\Users\Wim\AppData\Local\Arduino15\packages\ATTinyCore\hardware\avr\1.5.2\cores\tiny/wiring.c:308
      #error "cannot find Millis() timer overflow vector"
    #endif
  #else
    #error "Millis() timer not defined!"
  #endif
  {
  4e:	1f 92       	push	r1
  50:	0f 92       	push	r0
  52:	0f b6       	in	r0, 0x3f	; 63
  54:	0f 92       	push	r0
  56:	11 24       	eor	r1, r1
  58:	2f 93       	push	r18
  5a:	3f 93       	push	r19
  5c:	8f 93       	push	r24
  5e:	9f 93       	push	r25
  60:	af 93       	push	r26
  62:	bf 93       	push	r27

for the millis() timing. Vector 11 is the overflow interrupt for timer 0.

Thanks.

"Every push costs you one byte"

Does this mean if I compile the program then find how to inspect the assembly code I can count up all the pushes, knowing which functions can nest, and work out what the maximum possible peak memory usage can be?

Also, can anyone advise how much SRAM is used by the sin(), cos() and atan2() functions when run (where the input arguments and the output are all standard 4-byte float variables, as on the 328P), that is to say beyond the SRAM used for the input and output floats? I understand that underneath they are highly optimised assembly code, so I couldn't very well modify the library to interrupt them midway and use an Arduino function which works out free memory, to compare with the figure before the function begins.

And do static inline functions act more like #define and therefore not take the 2 bytes of SRAM upon entry that normal functions do?

Thanks

I think you can; I've never had to go into that detail.

No idea. I suggest that you write a test sketch that loops through a number of angles and prints the sin(angle) and inspect the generated code.

To my knowledge inline is just a suggestion to the compiler; the compiler does not have to follow it.

Regarding the other possible method of seeing how much free SRAM is needed, that is to say adding a volatile global array to the code and increasing its length a byte at a time until the code "crashes" when running...

Is such a "crash" obviously detectable, or can it result in weird bugs which might be harder to detect, like variables getting overwritten? If the ATtiny or ATmega reset or gave some sort of "error code" I could view, I'd know that it had run out of memory. If it doesn't, I can't see an easy way to detect how big the array would have to be to cause a crash, and thereby determine how much free SRAM there is when dynamic use of SRAM is at its peak.

Also, where can I learn more about viewing and interpreting that type of assembly code you show? I use a Linux PC.

I once ran:
bin/avr-objdump -D file_name.ino.elf > example.txt
in a terminal to make a text file to inspect, which was supposed to contain this assembly, but it wasn't annotated with markers showing which part of the assembly matches which part of the C code, as the ones you've shown here are.

Thanks

See post #6 and post #7: avr-objdump -h -S or export the compiled binary (for the ATtiny).

Memory issues can show their face in different ways. There is no way to predict what will happen.

It can overwrite a variable, it can overwrite the return address on the stack of the function that is currently executing so once that function is finished it will return to the wrong address and start executing from there.

From experience, you can be reasonably sure that you have a memory issue when adding or removing a println results in unexpected behaviour. And another one is when your code does not seem to get past setup() (usually detected based on print statements).

But an AVR won't raise some convenient error detection, or behave in a really obvious way like resetting itself when a memory problem happens?

No.
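Since the chip will not report the problem itself, one commonly used workaround is to "paint" the free memory with a known value at startup and later check how much of the paint is still intact. The sketch below only shows the idea on an ordinary buffer; the names PAINT, paintRegion and untouchedBytes are made up, and hooking it up to the real free RAM between the end of the globals and the stack is left out because that part is chip and core specific.

```cpp
#include <stdint.h>
#include <stddef.h>

const uint8_t PAINT = 0xA5;  // arbitrary marker value

// Fill a region with the marker. On a real chip this would be the free RAM
// between the end of the globals and the stack, painted very early at startup.
void paintRegion(volatile uint8_t *region, size_t length) {
  for (size_t i = 0; i < length; i++) region[i] = PAINT;
}

// Count marker bytes still intact from the low end upward. The stack grows
// downward from the high end, so this is the memory not reached so far.
size_t untouchedBytes(const volatile uint8_t *region, size_t length) {
  size_t n = 0;
  while (n < length && region[n] == PAINT) n++;
  return n;
}
```

The result is an estimate: it only reflects the code paths that actually ran, and a stack byte that happens to equal the marker at the boundary can make the figure slightly optimistic.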
