Trying To Make Sense of RAM Usage....

And finding some surprising, and confusing, things. This thread will likely be of little interest to most people, who will, if lucky, never NEED to learn the things I’m trying to learn. I am trying to get a solid understanding of exactly how memory allocation is done in the Arduino, how the compiler estimates memory usage, some of the optimizations being performed, and how to achieve a minimal memory footprint. So, this will be something of a stream-of-consciousness thread, as I report what I see and learn (and, no doubt at times, as I get more and more confused…). One thing that is already clear is that the compiler optimizations are quite impressive! So, here goes…

Trying to shoe-horn some code into a 328, I’m struggling to get RAM usage down. Right now, I’m trying to make sense of exactly how RAM is allocated, reported, and used, as I sometimes find HUGE amounts of RAM disappearing for no good reason that I can see. I am sure there is good reason, and I want to understand what that reason is. I am finding it is even stranger, and more complex, than expected. Here is a surprising, albeit simple, example I stumbled across by accident:

First, I modified the freeMemory library to add a new function to just dump what’s there:

void showMemory(const char *s)
{
 char __stack = 0;
 
 Serial.println(s);
 Serial.print("__brkval=");
 Serial.println((unsigned int)__brkval);
 Serial.print("__flp=");
 Serial.println((unsigned int)__flp);
 Serial.print("__stack=");
 Serial.println((unsigned int)&__stack);
 Serial.print("stack size=");
 Serial.println(RAM_end - (unsigned int)&__stack);
 Serial.print("Heap size=");
 Serial.println((unsigned int)__brkval - RAM_start);
 
 
 struct __freelist* current;
 int total = 0;
 for (current = __flp; current; current = current->nx)
 {
 total += 2; /* Add two bytes for the memory block's header  */
 total += (int) current->sz;
 Serial.print("mblk: sz=");
 Serial.print((unsigned int)current->sz);
 Serial.print(" nx=");
 Serial.println((unsigned int)current->nx);
 Serial.print("Total: ");
 Serial.println(total);
 }
 Serial.println("\n");
}

So, that shows me where the “break” between stack and heap is, and the current stack pointer. Note that in a 328, RAM starts at address 0x100 (256 DEC), and ends at address 0x8FF (2303 DEC).

First, I compile and run this trivial sketch:

void setup(void)
{
    Serial.begin(38400);
    Serial.print(F("Starting...\n\n"));

    showMemory(F("Initial"));
}

void loop()
{
}

This produces the following output:

Starting...

Initial
__brkval=0
__flp=0
__stack=2299
stack size=4
Heap size=65280

That mostly makes sense. The max stack size is 4 bytes, from 2299->2303. I don’t understand __brkval being 0, especially given that the compiler memory report indicates “436 bytes of dynamic memory used”. But, __brkval being 0 is why the displayed “heap size” is obviously wrong, since the compiler reports something north of 400 bytes of RAM used. We’ll come back to that later.

Now, I add a few more lines of code to the sketch:

const char *cfgFile = "temp.cfg";

void setup(void)
{
    Serial.begin(38400);
    Serial.print(F("Starting...\n\n"));

    showMemory(F("Initial"));

    if(SPIFFS.exists(cfgFile))
        Serial.println(F("cfgfile exists"));
    else    
        Serial.println(F("cfgFile does not exist"));

    showMemory(F("Final"));
}

void loop()
{
}

Now, notice a few things:

There is a call to SPIFFS.exists(), which is a static member function of my FS library. The functionality of this library, and function, is not important at the moment. When I compiled this sketch, I expected to get a compiler error, since SPIFFS is not really defined anywhere in the sketch. To my surprise, it compiled without error! Here’s why:

First, as I said, exists() is a static member function, so no instance of FS is required, to make the function usable. But how does it find THAT exists(), when there is no instance of FS named SPIFFS? The answer is, there is an “extern” statement in FS.h, which tells the compiler there should be an instance of FS named SPIFFS somewhere, so it makes that connection. Since the function is static, it doesn’t require an instance of the class, so both compiler and linker are happy! This really surprised me!

So, what happens when that sketch is run? This:

Starting...

Initial
__brkval=0
__flp=0
__stack=2252
stack size=51
Heap size=65280


exists("temp.cfg")
File 0 does match
cfgfile exists
Final
__brkval=0
__flp=0
__stack=2252
stack size=51
Heap size=65280

Stack size has increased from 4 to 51 bytes, a difference of 47 bytes. Most of this can be explained by the fact that exists has 34 bytes of local variables. I’m not perfectly clear on where the other 13 bytes went, but exists calls some other static functions within FS, and also the EEPROM library, so it is safe to assume those 13 bytes are to accommodate the stack needs of those functions. But, note __brkval is still 0! What’s up with that?

Let’s add two more lines to the sketch:

void setup(void)
{
    Serial.begin(38400);
    Serial.print(F("Starting...\n\n"));

    showMemory(F("Initial"));

    FS myFS1 = FS();
    myFS1.dump();

    showMemory(F("Final"));
}

The call to FS::dump() is only there to prevent myFS1 from being optimized away. This produces:

Starting...

Initial
__brkval=0
__flp=0
__stack=2249
stack size=54
Heap size=65280


// FS::dump() output deleted
 
Final
__brkval=722
__flp=0
__stack=2249
stack size=54
Heap size=466

The stack has moved down by 3 bytes, the exact size of the member data in FS, which makes perfect sense, since FS is created on the stack, as a local variable. And, what do you know, suddenly, __brkval is changed also! To 722??? As it turns out, 722 = 466 + 256. 256 is the start address of RAM in the 328P, and 466 is the exact RAM size reported by the compiler. So, now the heap starts at address 256 and ends at address 722. But why does __brkval change at all? myFS1 is created on the stack, as a local variable, so should have no impact on the heap!

Let’s try this again, but this time put myFS1 in the heap:

void setup(void)
{
    Serial.begin(38400);
    Serial.print(F("Starting...\n\n"));

    showMemory(F("Initial"));

    FS *myFS1 = new FS();
    myFS1->dump();

    showMemory(F("Final"));
}

Here’s what we get:

Starting...

Initial
__brkval=0
__flp=0
__stack=2252
stack size=51
Heap size=65280

// FS::dump() output deleted

Final
__brkval=727
__flp=0
__stack=2252
stack size=51
Heap size=471

Stack size has been reduced by 3 bytes, since the FS member data is now stored on the heap. Heap usage has grown by 5 bytes. 3 of those are the FS member data, and the other 2 bytes are, I guess, the linked list pointer to the next memory block?

In any case, both of these sketches change __brkval from 0 (an illogical value) to a sensible, non-zero value? So, why is __brkval EVER set to 0? What does that signify?

Let’s try one more thing:

FS myFS = FS();

void setup(void)
{
    Serial.begin(38400);
    Serial.print(F("Starting...\n\n"));

    showMemory(F("Initial"));

    myFS.dump();

    showMemory(F("After myFS.dump()"));

    FS myFS1 = FS();
    myFS1.dump();

    showMemory(F("Final"));

}

Here, I’ve added a second instance of FS, in global space. But the showMemory response is really surprising:

Starting...

Initial
__brkval=0
__flp=0
__stack=2284
stack size=19
Heap size=65280

// FS::dump() output deleted

Final
__brkval=743
__flp=0
__stack=2284
stack size=19
Heap size=487

OK… __brkval is zero, UNTIL FS::dump() is called??? WTF??? What is triggering the change to __brkval?

To be continued…

Regards,
Ray L.

RayLivingston:
First, I compile and run this trivial sketch:

This does not compile for me, because you can't pass a __FlashStringHelper* to a function accepting only char*.

RayLivingston:
That mostly makes sense. The max stack size is 4 bytes, from 2299->2303. I don't understand __brkval being 0, especially given that the compiler memory report indicates "436 bytes of dynamic memory used". But, __brkval being 0 is why the displayed "heap size" is obviously wrong, since the compiler reports something north of 400 bytes of RAM used. We'll come back to that later.

This makes perfect sense actually. __brkval and __flp are initialized to zero by the startup code, because they live in the bss segment. The compiler's memory report is somewhat misleading: it does not report dynamic (heap) memory usage, but rather the amount of SRAM (which it calls "dynamic memory") that is statically allocated; the number reported is just the sum of the lengths of the data and bss segments. This is static data whose values are either known at compile time (data) or unknown but initialized to zero by the startup code (bss).

You can't calculate the heap size like you did, because the heap does not start at the beginning of the ram. Instead, the memory layout looks like this:

|----------|
|  .data   |
|----------|
|  .bss    |
|----------|
|   heap   |
|----------|
| free mem |
|----------|
|  ^^^^^   |
|  stack   |
|----------|

Also, you can't give a total available heap amount, because the available heap size depends on the stack depth at the moment you call malloc. Before allocating a block of memory, malloc checks whether the new chunk would collide with the stack at its current depth, less __malloc_margin. So the remaining heap size can be calculated like so:

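extern uint16_t __heap_start;     // linker symbol: first address after .bss
extern char    *__brkval;         // current end of the heap, 0 until the first malloc
extern size_t   __malloc_margin;  // gap malloc keeps between heap and stack
// The stack pointer register is read as a number; casting it to char* would not compile here.
uint16_t upperLimit = (uint16_t)AVR_STACK_POINTER_REG - __malloc_margin;
uint16_t lowerLimit = ((uint16_t)__brkval == 0) ? (uint16_t)&__heap_start : (uint16_t)__brkval;
uint16_t availableHeapSize = upperLimit - lowerLimit;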

RayLivingston:
First, as I said, exists() is a static member function, so no instance of FS is required, to make the function usable. But how does it find THAT exists(), when there is no instance of FS named SPIFFS? The answer is, there is an "extern" statement in FS.h, which tells the compiler there should be an instance of FS named SPIFFS somewhere, so it makes that connection. Since the function is static, it doesn't require an instance of the class, so both compiler and linker are happy! This really surprised me!

If you call a static member function of a class, you can also use the scope resolution operator ("::") with the class name. Most of the time you don't have an instance lying around.

FS::exists(cfgFile)

RayLivingston:
Stack size has increased from 4 to 51 bytes, a difference of 47 bytes. Most of this can be explained by the fact that exists has 34 bytes of local variables.

Not quite. This is a strong indicator that the compiler inlined the call to the exists function (which is perfectly normal: you call it only once, so why not inline it and save cycles and memory?). A function does not reserve stack space for the functions it calls.
It is also possible that some of the pseudo registers used during compilation could not be put into real registers and thus take up space on the stack.

But this is all just guesswork until you take a look at the assembly. Also, because of the compiler (and linker) optimizations this might all be untrue. Look at the assembly, if you really need to know what is put on the stack.

RayLivingston:
But, note __brkval is still 0! What's up with that?

Simple, malloc was never called inside the exists function. Remember, that malloc will allocate the first block of memory on the first call. __brkval being zero means, that no memory block was allocated yet.

RayLivingston:
The stack has moved down by 3 bytes, the exact size of the member data in FS, which makes perfect sense, since FS is created on the stack, as a local variable.

Correct.

RayLivingston:
And, what do you know, suddenly, __brkval is changed also! To 722??? As it turns out, 722 = 466 + 256. 256 is the start address of RAM in the 328P, and 466 is the exact RAM size reported by the compiler. So, now the heap starts at address 256 and ends at address 722. But why does __brkval change at all? myFS1 is created on the stack, as a local variable, so should have no impact on the heap!

If __brkval has changed, then dump() has invoked malloc, either through the new operator or by calling malloc directly. The value of __brkval, RAM_START + length(.data) + length(.bss), is easily explained by the memory map above. dump() allocated some memory, so malloc took it from the beginning of the heap (which starts at RAM_START + length(.data) + length(.bss)). When dump() was done, it called free on that block, and because it was the only block allocated, __brkval was lowered to point to the start of the heap. Had it not been the last memory block, you would have seen __flp point to the freed block instead.

RayLivingston:
Stack size has been reduced by 3 bytes, since the FS member data is now stored on the heap. Heap usage has grown by 5 bytes. 3 of those are the FS member data, and the other 2 bytes are, I guess, the linked list pointer to the next memory block?

Correct.

RayLivingston:
In any case, both of these sketches change __brkval from 0 (an illogical value) to a sensible, non-zero value? So, why is __brkval EVER set to 0? What does that signify?

Zero is a logical value. It means that there were no memory blocks allocated yet.

RayLivingston:
Let's try one more thing:
[...] WTF??? What is triggering the change to __brkval?

A call to malloc does.

You can use this sketch to play around with __brkval and __flp.

struct __freelist {
   size_t size;
   struct __freelist *next;
};

extern uint16_t *__brkval;
extern struct __freelist *__flp;
extern uint16_t __heap_start;
extern uint16_t __data_start;
extern uint16_t __bss_start;
extern uint16_t __bss_end;

void show_allocator() {
  if(__flp == 0) {
    Serial.println("There is no freelist.");
  } else {
    Serial.print("The freelist starts at 0x");
    Serial.println((uint16_t)__flp, HEX);
  }

  if(__brkval == 0) {
    Serial.println("No memory has been allocated yet");
    Serial.print("The heap starts at 0x");
    Serial.println((uint16_t)&__heap_start,HEX);
  } else {
    Serial.print("The current end of heap is at 0x");
    Serial.println((uint16_t)__brkval, HEX);
  }
  Serial.println();
}

void setup() {
  Serial.begin(9600);
  
  Serial.print(".data section starts at 0x");
  Serial.println((uint16_t)&__data_start, HEX);
  Serial.print(".bss section starts at 0x");
  Serial.println((uint16_t)&__bss_start, HEX);
  Serial.print("Together, they take up ");
  Serial.print((uint16_t)&__bss_end - (uint16_t)&__data_start);
  Serial.println(" bytes of memory\n");

  show_allocator();
  Serial.println("allocate 4 bytes");
  uint32_t* ptr = (uint32_t*)malloc(4);  // C++ needs the explicit cast from void*
  Serial.print("allocated block address = 0x");
  Serial.println((uint16_t)ptr, HEX);
  show_allocator();

  Serial.println("allocate 8 bytes");
  uint16_t* ptr2 = (uint16_t*)malloc(8);  // C++ needs the explicit cast from void*
  Serial.print("allocated block address = 0x");
  Serial.println((uint16_t)ptr2, HEX);
  show_allocator();

  Serial.println("free the first block (4 bytes)");
  free(ptr);
  show_allocator();

  Serial.println("free the second block (8 bytes)");
  free(ptr2);
  show_allocator();
}

void loop() {
  // put your main code here, to run repeatedly:

}

And the output:

.data section starts at 0x100
.bss section starts at 0x280
Together, they take up 554 bytes of memory

There is no freelist.
No memory has been allocated yet
The heap starts at 0x32A

allocate 4 bytes
allocated block address = 0x32C
There is no freelist.
The current end of heap is at 0x330

allocate 8 bytes
allocated block address = 0x332
There is no freelist.
The current end of heap is at 0x33A

free the first block (4 bytes)
The freelist starts at 0x32A
The current end of heap is at 0x33A

free the second block (8 bytes)
There is no freelist.
The current end of heap is at 0x32A

What happens, when you free the second block first?


I've made a good bit of progress since starting this thread. I now have a modified MemoryFree library (I'll post tomorrow - I have to go out right now) that gives a lot more information - current stack size, current heap size, bss size, total free size (heap + free blocks). And it provides a simple means of tracking how much stack, heap, etc. is used, or freed, between any two points in the code. It is giving me what appear to be correct numbers, that match up perfectly with what the compiler outputs, and what I expect, given what the code is doing.

Regards,
Ray L.

Here is my modified version of MemoryFree. It is a work in progress, so is a bit rough. It is also currently hard-wired for the ATmega328P memory map. I’ll fix that sometime.

I’ve added a number of new functions and macros. The ones you really need to be aware of are:

initFreeMemory()

Sets up some internal data items, based on the target processor, and prints some statistics. Typically, this should be the first call after your Serial.begin call, so you have a “starting point” for memory state.

showMemory(const char *s)
showMemory_P(const __FlashStringHelper *s)

These provide more extensive memory reports. I actually find these to be of little use, now that I have the below macros.

SHOW_SIZES

Shows a “brief” report of free RAM, used RAM, and stack and heap sizes

SET_MARK

Saves the current stack and heap sizes, for later use by SHOW_MARK. SET_MARK is effectively invoked by the call to initFreeMemory(), so the initial “mark” is where initFreeMemory() is called.

SHOW_MARK

Shows the current stack and heap sizes as a delta value from the most recent SET_MARK. This is useful if, for example, you want to see how much RAM creating an object, or group of objects, consumes. Invoke SET_MARK immediately before creating the objects, and SHOW_MARK immediately after creating them. It will show you exactly how much stack and heap was either consumed or released between the SET_MARK and SHOW_MARK.

All of the above macros also display the file and line number at which they are invoked.

Here is an example of the output:

Starting...

bss_size = 550

stack_top = 2047
stack_bottom = 2043
stack_size = 4

heap start = 550
heap end = 550
heap size = 0

free_size = 1497
used_size = 550

E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:21: Mark set
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:23: free:1486 used:561 (11/0)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:8: free:1415 used:632 (46/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:8: free:1422 used:625 (39/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:8: free:1429 used:618 (32/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:8: free:1436 used:611 (25/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:8: free:1443 used:604 (18/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:8: free:1450 used:597 (11/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:27: Mark stack: 0/heap: 36
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:28: Mark set
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:30: free:1450 used:597 (11/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:32: Mark stack: 0/heap: 0
Test0
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: free:1426 used:621 (35/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: Mark stack: 20/heap: 0
Test1
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: free:1430 used:617 (31/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: Mark stack: 16/heap: 0
Test2
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: free:1434 used:613 (27/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: Mark stack: 12/heap: 0
Test3
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: free:1438 used:609 (23/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: Mark stack: 8/heap: 0
Test4
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: free:1442 used:605 (19/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: Mark stack: 4/heap: 0
Test5
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: free:1446 used:601 (15/36)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:10: Mark stack: 0/heap: 0
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:36: Mark stack: 0/heap: 0
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:40: Mark stack: 0/heap: -36
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:42: free:1486 used:561 (11/0)
E:\Users\RayL\Documents\Arduino\Build\sketch\Test_freeMemory.h:44: Mark stack: 0/heap: -36

The lines containing "free: xxx used: xxx (xx/xx)" are the output of SHOW_SIZES. free is the free RAM, used is the total used RAM (BSS, heap, and stack). The first number in parens is the current stack usage, and the second one is the total heap usage.

The lines with “Mark set” show where SET_MARK was invoked.

The lines with “Mark stack…” show where SHOW_MARK was invoked, and show the amount of stack and heap consumed or freed between that line and the most recent SET_MARK. A positive value indicates memory was consumed, while a negative value indicates memory was freed (in the output above, "heap: 36" follows the allocations and "heap: -36" follows the frees). SHOW_MARK can be invoked as many times as you like, from as many places as you like, after a SET_MARK.

If anyone knows of a way to do a compile-time “strip” of __FILE__ down to just the file name, I’m all ears. From what I’ve read, it appears not possible, without modifying the core macro that defines it.

This is all brand-new, little tested, but so far has been working perfectly for me.

MemoryFree.h

// MemoryFree library based on code posted here:
// http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1213583720/15
// Extended by Matthew Murdoch to include walking of the free list.
// Further extended by Ray Livingston for additional output options

#ifndef MEMORY_FREE_H
#define MEMORY_FREE_H

#include <Arduino.h>

#ifdef __cplusplus
extern "C" {
#endif


#define SHOW_SIZES  showSizes( F( __FILE__ ), __LINE__ );
#define SET_MARK    setMark( F( __FILE__ ), __LINE__ );
#define SHOW_MARK   showMark( F( __FILE__ ), __LINE__ );

void initFreeMemory();
int freeMemory();
void showMemory(const char *s);
void showMemory_P(const __FlashStringHelper *s);
void showSizes(const __FlashStringHelper *s, int line);
void setMark(const __FlashStringHelper *s, int line);
void showMark(const __FlashStringHelper *s, int line);
int freeListSize();

#ifdef  __cplusplus
}
#endif

#endif

Continued in next post…

Regards,
Ray L.

MemoryFree.cpp

#if (ARDUINO >= 100)
#include <Arduino.h>
#else
#include <WProgram.h>
#endif

extern unsigned int __heap_start;
extern void *__brkval;

// THESE VALUES HARD_WIRED TO NANO - NEEDS TO BE MADE DEVICE-AWARE!!
static const unsigned int RAM_start = 0x100;
static const unsigned int RAM_end   = 0x8FF;

/*
 * The free list structure as maintained by the
 * avr-libc memory allocation routines.
 */
struct __freelist
{
  size_t sz;
  struct __freelist *nx;
};

/* The head of the free list structure */
extern struct __freelist *__flp;

#include "MemoryFree.h"

static unsigned int max_ram = 0;
static unsigned int stack_top = 0;
static unsigned int stack_bottom = 0;
static unsigned int stack_size = 0;
static unsigned int heap_start = 0;
static unsigned int heap_end = 0;
static unsigned int heap_size = 0;
static unsigned int free_size = 0;

static unsigned int mark_stack = 0;
static unsigned int mark_heap = 0;

struct __freelist* current;
int total = 0;

void initFreeMemory()
{
 char *__stack = 0;

 max_ram = RAM_end - RAM_start;

 stack_top = max_ram;
 stack_bottom = ((unsigned int)&__stack - RAM_start) + 1;
 stack_size = stack_top - stack_bottom;
 mark_stack = stack_size;

 heap_start = (unsigned int)__malloc_heap_start - RAM_start;
 void *__heap = malloc(1);
 heap_end = (unsigned int)__heap - RAM_start;
 heap_end -= 2;
 free(__heap);
 heap_size = heap_end - heap_start;
 mark_heap = heap_size;

 free_size = (unsigned int)stack_top - (unsigned int)heap_end + freeListSize();

 Serial.print("bss_size = ");
 Serial.println(heap_start);

 Serial.print("\nstack_top = ");
 Serial.println(stack_top);
 Serial.print("stack_bottom = ");
 Serial.println(stack_bottom);
 Serial.print("stack_size = ");
 Serial.println(stack_size);

 Serial.print("\nheap start = ");
 Serial.println(heap_start);
 Serial.print("heap end = ");
 Serial.println(heap_end);
 Serial.print("heap size = ");
 Serial.println(heap_size);

 Serial.print("\nfree_size = ");
 Serial.println(free_size);
 Serial.print("used_size = ");
 Serial.println(max_ram - free_size);

 Serial.print("\n");
}


void setMark(const __FlashStringHelper *s, int line)
{
 char __stack = 0;

 Serial.print(s);
 Serial.print(":");
 Serial.print(line);
 Serial.println(": Mark set");

 mark_stack = ((unsigned int)&__stack - RAM_start) + 1;
 mark_stack = stack_top - mark_stack;

 mark_heap = 0;
 if(__brkval != 0)
 {
 mark_heap = (unsigned int)__brkval - RAM_start - heap_start;
 }
}


void showMark(const __FlashStringHelper *s, int line)
{
 char __stack = 0;

 Serial.print(s);
 Serial.print(":");
 Serial.print(line);
 Serial.print(": Mark");

 stack_bottom = ((unsigned int)&__stack - RAM_start) + 1;
 stack_size = stack_top - stack_bottom;

 if(__brkval != 0)
 {
 heap_end = (unsigned int)__brkval - RAM_start;
 }
 else
 {
 heap_end = heap_start; /* heap_start is already an offset from RAM_start */
 }
 heap_size = heap_end - heap_start;

 Serial.print(" stack: ");
 Serial.print((int)stack_size - (int)mark_stack);
 Serial.print("/heap: ");
 Serial.println((int)heap_size - (int)mark_heap);
}


void showSizes(const __FlashStringHelper *s, int line)
{
 char __stack = 0;

 Serial.print(s);
 Serial.print(":");
 Serial.print(line);
 Serial.print(":");

 stack_bottom = (unsigned int)&__stack - RAM_start + 1;
 stack_size = stack_top - stack_bottom;

 if(__brkval != 0)
 {
 heap_end = (unsigned int)__brkval - RAM_start;
 }
 else
 {
 heap_end = heap_start; /* heap_start is already an offset from RAM_start */
 }
 heap_size = heap_end - heap_start;
 free_size = stack_bottom - heap_end + freeListSize();

 Serial.print(" free:");
 Serial.print(free_size);
 Serial.print(" used:");
 Serial.print(max_ram - free_size);
 Serial.print(" (");
 Serial.print(stack_size);
 Serial.print("/");
 Serial.print(heap_size);
 Serial.print(")");
 Serial.print("\n");
}


int freeListSize()
{
  total = 0;

  for (current = __flp; current; current = current->nx)
  {
    total += 2; /* Add two bytes for the memory block's header  */
    total += (int) current->sz;
  }

  return total;
}


int freeMemory()
{
  int free_memory;
  if ((int)__brkval == 0)
  {
    free_memory = ((int)&free_memory) - ((int)&__heap_start);
  }
  else
  {
    free_memory = ((int)&free_memory) - ((int)__brkval);
    free_memory += freeListSize();
  }
  return free_memory;
}


void showMem(void)
{
 char __stack = 0;

 Serial.print("__brkval=");
 Serial.println((unsigned int)__brkval);
 Serial.print("__heap_start=");
 Serial.println((unsigned int)&__heap_start); /* the address of the symbol, not its contents */
 Serial.print("__malloc_heap_start=");
 Serial.println((unsigned int)__malloc_heap_start);
 Serial.print("__malloc_margin=");
 Serial.println((unsigned int)__malloc_margin);
 Serial.print("__flp=");
 Serial.println((unsigned int)__flp);
 Serial.print("__stack=");
 Serial.println((unsigned int)&__stack);
 Serial.print("stack size=");
 Serial.println(RAM_end - (unsigned int)&__stack);
 Serial.print("Heap size=");
 Serial.println((unsigned int)__brkval - RAM_start);
 
 total = 0;
 for (current = __flp; current; current = current->nx)
 {
 total += 2; /* Add two bytes for the memory block's header  */
 total += (int) current->sz;
 Serial.print("mblk: sz=");
 Serial.print((unsigned int)current->sz);
 Serial.print(" nx=");
 Serial.println((unsigned int)current->nx);
 Serial.print("Total: ");
 Serial.println(total);
 }
 Serial.println("\n");
}


void showMemory(const char *s)
{
 char __stack = 0;
 
 Serial.println(s);
 showMem();
}


void showMemory_P(const __FlashStringHelper *s)
{
 char __stack = 0;
 
 Serial.println(s);
 showMem();
}

Regards,
Ray L.

I recommend reading this:

https://www.nongnu.org/avr-libc/user-manual/malloc.html

to anyone who has similar questions or problems. I hope it sheds a little more light on the topic.

// THESE VALUES HARD_WIRED TO NANO - NEEDS TO BE MADE DEVICE-AWARE!!

The RAMSTART and RAMEND macros are defined by the AVR io.h for the specific MCU, so no hard-coded values are needed.
The same goes for the __data_start variable etc.

RAMSTART and RAMEND - those are the macros I was looking for, but was not able to find, not knowing the correct names. Thanks for the pointer!

Here is a corrected MemoryFree.cpp that is device-aware:

#if (ARDUINO >= 100)
#include <Arduino.h>
#else
#include <WProgram.h>
#endif

extern unsigned int __heap_start;
extern void *__brkval;

/*
 * The free list structure as maintained by the
 * avr-libc memory allocation routines.
 */
struct __freelist
{
  size_t sz;
  struct __freelist *nx;
};

/* The head of the free list structure */
extern struct __freelist *__flp;

#include "MemoryFree.h"

static unsigned int max_ram = 0;
static unsigned int stack_top = 0;
static unsigned int stack_bottom = 0;
static unsigned int stack_size = 0;
static unsigned int heap_start = 0;
static unsigned int heap_end = 0;
static unsigned int heap_size = 0;
static unsigned int free_size = 0;

static unsigned int mark_stack = 0;
static unsigned int mark_heap = 0;

struct __freelist* current;
int total = 0;

void initFreeMemory()
{
 char *__stack = 0;

 max_ram = RAMEND - RAMSTART + 1;

 stack_top = max_ram;
 stack_bottom = ((unsigned int)&__stack - RAMSTART) + 1;
 stack_size = stack_top - stack_bottom;
 mark_stack = stack_size;

 heap_start = (unsigned int)__malloc_heap_start - RAMSTART;
 void *__heap = malloc(1);
 heap_end = (unsigned int)__heap - RAMSTART;
 heap_end -= 2;
 free(__heap);
 heap_size = heap_end - heap_start;
 mark_heap = heap_size;

 free_size = (unsigned int)stack_top - (unsigned int)heap_end + freeListSize();

 Serial.print("bss_size = ");
 Serial.println(heap_start);

 Serial.print("\nstack_top = ");
 Serial.println(stack_top);
 Serial.print("stack_bottom = ");
 Serial.println(stack_bottom);
 Serial.print("stack_size = ");
 Serial.println(stack_size);

 Serial.print("\nheap start = ");
 Serial.println(heap_start);
 Serial.print("heap end = ");
 Serial.println(heap_end);
 Serial.print("heap size = ");
 Serial.println(heap_size);

 Serial.print("\nfree_size = ");
 Serial.println(free_size);
 Serial.print("used_size = ");
 Serial.println(max_ram - free_size);

 Serial.print("\n");
}


void setMark(const __FlashStringHelper *s, int line)
{
 char __stack = 0;

 Serial.print(s);
 Serial.print(":");
 Serial.print(line);
 Serial.println(": Mark set");

 mark_stack = ((unsigned int)&__stack - RAMSTART) + 1;
 mark_stack = stack_top - mark_stack;

 mark_heap = 0;
 if(__brkval != 0)
 {
 mark_heap = (unsigned int)__brkval - RAMSTART - heap_start;
 }
}


void showMark(const __FlashStringHelper *s, int line)
{
 char __stack = 0;

 Serial.print(s);
 Serial.print(":");
 Serial.print(line);
 Serial.print(": Mark");

 stack_bottom = ((unsigned int)&__stack - RAMSTART) + 1;
 stack_size = stack_top - stack_bottom;

 if(__brkval != 0)
 {
 heap_end = (unsigned int)__brkval - RAMSTART;
 }
 else
 {
 heap_end = heap_start; /* heap_start is already an offset from RAMSTART */
 }
 heap_size = heap_end - heap_start;

 Serial.print(" stack: ");
 Serial.print((int)stack_size - (int)mark_stack);
 Serial.print("/heap: ");
 Serial.println((int)heap_size - (int)mark_heap);
}


void showSizes(const __FlashStringHelper *s, int line)
{
 char __stack = 0;

 Serial.print(s);
 Serial.print(":");
 Serial.print(line);
 Serial.print(":");

 stack_bottom = (unsigned int)&__stack - RAMSTART + 1;
 stack_size = stack_top - stack_bottom;

 if(__brkval != 0)
 {
 heap_end = (unsigned int)__brkval - RAMSTART;
 }
 else
 {
 heap_end = heap_start; /* heap_start is already an offset from RAMSTART */
 }
 heap_size = heap_end - heap_start;
 free_size = stack_bottom - heap_end + freeListSize();

 Serial.print(" free:");
 Serial.print(free_size);
 Serial.print(" used:");
 Serial.print(max_ram - free_size);
 Serial.print(" (");
 Serial.print(stack_size);
 Serial.print("/");
 Serial.print(heap_size);
 Serial.print(")");
 Serial.print("\n");
}


int freeListSize()
{
  total = 0;

  for (current = __flp; current; current = current->nx)
  {
    total += 2; /* Add two bytes for the memory block's header  */
    total += (int) current->sz;
  }

  return total;
}


int freeMemory()
{
  int free_memory;
  if ((int)__brkval == 0)
  {
    free_memory = ((int)&free_memory) - ((int)&__heap_start);
  }
  else
  {
    free_memory = ((int)&free_memory) - ((int)__brkval);
    free_memory += freeListSize();
  }
  return free_memory;
}


void showMem(void)
{
 char __stack = 0;

 Serial.print("__brkval=");
 Serial.println((unsigned int)__brkval);
 Serial.print("__heap_start=");
 Serial.println((unsigned int)&__heap_start); /* the address of the symbol, not its contents */
 Serial.print("__malloc_heap_start=");
 Serial.println((unsigned int)__malloc_heap_start);
 Serial.print("__malloc_margin=");
 Serial.println((unsigned int)__malloc_margin);
 Serial.print("__flp=");
 Serial.println((unsigned int)__flp);
 Serial.print("__stack=");
 Serial.println((unsigned int)&__stack);
 Serial.print("stack size=");
 Serial.println(RAMEND - (unsigned int)&__stack);
 Serial.print("Heap size=");
 Serial.println((unsigned int)__brkval - RAMSTART);
 
 total = 0;
 for (current = __flp; current; current = current->nx)
 {
 total += 2; /* Add two bytes for the memory block's header  */
 total += (int) current->sz;
 Serial.print("mblk: sz=");
 Serial.print((unsigned int)current->sz);
 Serial.print(" nx=");
 Serial.println((unsigned int)current->nx);
 Serial.print("Total: ");
 Serial.println(total);
 }
 Serial.println("\n");
}


void showMemory(const char *s)
{
 char __stack = 0;
 
 Serial.println(s);
 showMem();
}


void showMemory_P(const __FlashStringHelper *s)
{
 char __stack = 0;
 
 Serial.println(s);
 showMem();
}

Regards,
Ray L.

As I wrote, RAMSTART and RAMEND are defined via io.h (..\hardware\tools\avr\avr\include\avr), which references the appropriate ioMCU_type_name.h for the specific type, e.g. iom328p.h. That header defines everything specific to the MCU, like ports and timers,
and of course the constants for the memories. Here is the 328P's snippet:

/* Constants */
#define SPM_PAGESIZE 128
#define RAMSTART     (0x100)
#define RAMEND       0x8FF     /* Last On-Chip SRAM Location */
#define XRAMSIZE     0
#define XRAMEND      RAMEND
#define E2END        0x3FF
#define E2PAGESIZE   4
#define FLASHEND     0x7FFF

The __brkval and __heap_start are defined in malloc.c and stdlib_private.h.