malloc(), realloc().. the dark side powerful it is

A simple Google search by "arduino malloc" will return all sorts of warnings. I understand that on a uC system where minimal resources are available it's hard/overkill to implement a proper memory management system. So memory fragmentation can (and will occur) if you start running multiple realloc() at every cycle.. for instance (pseudo-code):

uint8_t* p;
uin8_t p_size = 0;

for(;;) {
	if( Wire.available() ) {
		// Increase pointer memory
		uint8_t* np = (uint8_t *) realloc( p, sizeof(uint8_t) * (p_size + wire.available()) );
		if( np != NULL ) { p = np; }

		[ do random stuff with data at *p ]
		[ call realloc() to shrink *p after data has been processed ]
		[ free() what ever needs to be freed ]
	}
}

Run this for a while and SRAM sooner or later will become fragmented and the program will crash or just start doing stuff it isn't supposed to do. As far as I understood the problem is that malloc(), realloc() and such need to reserve a continuous block of memory and if there is none it can return a block already with data on it leading to memory corruption.

My question is what if we do not try to shrink memory ? Increase *p to store more data when needed but never decrease it, instead just clear (NULL or 0x00) the bytes that were processed, reusing the already allocated memory. Will still be bad ?.. If so why ?

fuh:
My question is what if we do not try to shrink memory ? Increase *p to store more data when needed but never decrease it, instead just clear (NULL or 0x00) the bytes that were processed, reusing the already allocated memory. Will still be bad ?.. If so why ?

How do you "increase *p"?

What was the point of clearing (NULL or 0x00) the "bytes that were processed"?

Haven't you noticed this part ?

// Increase pointer memory
uint8_t* np = (uint8_t *) realloc( p, sizeof(uint8_t) * (p_size + wire.available()) );
if( np != NULL ) { p = np; }

PeterH:
What was the point of clearing (NULL or 0x00) the "bytes that were processed"?

OK.. here I could have written "remove x bytes from p_size to reflect the data that was processed/removed/whatever on *p". But the general idea is that the allocated space of *p will be reused to store again data and never shrunk.

fuh:
My question is what if we do not try to shrink memory ? Increase *p to store more data when needed but never decrease it, instead just clear (NULL or 0x00) the bytes that were processed, reusing the already allocated memory. Will still be bad ?.. If so why ?

Well, because say you allocate 5 lots of 20 bytes, and then free the first one. The allocator doesn't try to "shrink" that per se. But it also can't make it larger because the first block will be adjacent to the second one. So to allocate (say) 30 bytes it has no choice but to allocate more to the end.

If you can allocate more memory while running, you need that space available. So rather than fragmenting your ram, just create a buffer as big as you need to begin with. then create/move/modify/destroy objects in that space.

  #define MAX_ITEMS 10

  //Raw memory allocation / deallocation.
  void *operator new( size_t size ){ return malloc( size ); }
  void operator delete( void * ptr ){ free( ptr ); return; }
  
  //Placement new/delete
  void *operator new( size_t size, void * ptr ){ return ptr; }
  void operator delete( void * ptr, void *ptr2 ){ return; }
  
  struct Obj{
    int i_Something;
    bool b_SomethingElse;
  }; 

void setup( void ){
  
  //Allocate data
  void *v_Data = operator new( ( size_t ) ( sizeof( Obj ) * MAX_ITEMS ) );

  //Represent allocation as an arraoy of Obj
  Obj *o_Array = ( Obj* ) v_Data;

  //Construct some objects in the allocation.
  new( o_Array++ ) Obj;
  new( o_Array++ ) Obj;
  new( o_Array ) Obj;
  
  //Reset array.
  o_Array = ( Obj* ) v_Data;

  //Do something with objects.
  if( o_Array[ 1 ].b_SomethingElse ) 
    ++o_Array[ 1 ].i_Something;
    
  //When done with objects
  ( o_Array++ )->~Obj();
  ( o_Array++ )->~Obj();
  ( o_Array )->~Obj();  

  //When done with allocation.
  operator delete( v_Data  );
  return;
}

void loop( void ){ return; }

To clarify what I meant (pseudo-code), is your explanation still valid ?

uint8_t* p;
uint8_t p_size = 0;

[ receive 5 bytes of data: uint8_t* data, uint8_t data_size ]

uint8_t* np = (uint8_t *) realloc( p, sizeof(uint8_t) * p_size );
if( np != NULL) { p = np; }

memcpy( p + p_size, data, data_size );
p_size += data_size;


// Removes 3 bytes from the beginning of *p
memmove( p, p + 3, p_size -3 );
p_size -= 3;

// Shrink
uint8_t* np = (uint8_t *) realloc( p, sizeof(uint8_t) * p_size );
if( np != NULL) { p = np; }

pYro_65:
If you can allocate more memory while running, you need that space available. So rather than fragmenting your ram, just create a buffer as big as you need to begin with. then create/move/modify/destroy objects in that space.

Powerful you have become, the dark side I sense in you. :slight_smile:
Is it a "easy way" to know the number of values stored at o_Array without doing accounting ?

Your lack of faith in the designers of memory allocation is ... disturbing.

The fact is that existing malloc systems are carefully designed to minimize fragmentation, re-use and combine freed block, and generally the best they can.

Your pseudo-code appears to show ideas that already inside the existing system probably are.

Google "malloc" or similar and a lot of work on the concept has been done, you will see.

I already did a lot of research.. I ended up reading Memory Areas and Using malloc which I quote:

A call to realloc() first determines whether the operation is about to grow or shrink the current allocation. When shrinking, the case is easy: the existing chunk is split, and the tail of the region that is no longer to be used is passed to the standard free() function for insertion into the freelist. Checks are first made whether the tail chunk is large enough to hold a chunk of its own at all, otherwise realloc() will simply do nothing, and return the original region.

When growing the region, it is first checked whether the existing allocation can be extended in-place. If so, this is done, and the original pointer is returned without copying any data contents. As a side-effect, this check will also record the size of the largest chunk on the freelist.

If the region cannot be extended in-place, but the old chunk is at the top of heap, and the above freelist walk did not reveal a large enough chunk on the freelist to satisfy the new request, an attempt is made to quickly extend this topmost chunk (and thus the heap), so no need arises to copy over the existing data. If there's no more space available in the heap (same check is done as in malloc()), the entire request will fail.

Otherwise, malloc() will be called with the new request size, the existing data will be copied over, and free() will be called on the old region.

The problem is that my real-world experience was not quite similar.. I found out that using grow/shrink procedures often resulted in random halts at runtime. And searching for similar problems a lot of people just say to stay away from dynamic memory allocation inside a uC system thing that I just can't understand. I mitigated my issue by not shrinking allocated memory and just procedural treat it has free with the help of accounting.

At the end of the day I was seeking for feedback from more experienced C/C++ programmers than I am.

What grow/shrink procedures? The ones in malloc or your own ones?

This is all getting a bit hypothetical for me. Are you saying there is a bug in the library, or a bug in your code? Or you aren't sure?

Instead of pseudo-code how about posting actual code that you say fails after a while?

As far as I can tell Arduino still uses avr-libc 1.6.4 (my arduino's libc.a is dated from 2008, and avr-libc-user-manual.pdf states 1.6.4), there is a documented bug related with malloc() and free() fixed at avr-libc 1.7.

OK, so the real point is that there is the bug in malloc. I seem to recall this being discussed before, and you are right, the library appears to be quite old.

The logical thing to do, it would seem, would be to fix the bug, then everything else should flow from that. I'm not sure how to rebuild libc.a, perhaps someone else can comment on that, or where a new version might be found?

Is it a "easy way" to know the number of values stored at o_Array without doing accounting ?

CurrentPointer - StartPointer ( not void pointers, but type cast pointers )

A common problem with dynamic memory allocation is fragmentation, for example when your application starts, the system has say 8K available....you then allocated 1K for your own uses, after your down you free that area of memory. Unfortunately you don't get back the 8K you start with, but two areas 7K and 1K, both freely available but you can no longer allocate 8K of memory as it isn't available as a contiguous block.

Over time the memory becomes very fragmented, you still have 8K free, but in lots of small blocks that are not allocatable in one large block. This was a common problem on PC's. Not sure about Arduino, haven't tried it.

Years agao I wrote a control system that was used 24/7 and initially it suffered from this problem where eventually it would fail as it ran out of memory. So I removed use of malloc, calloc and realloc and instead implement my own system.

This would allocate a large block of memory at application start-up then divide it up as required using my own versions of malloc and free. The advantage of this was that I could get back entire blocks of memory without the fragmentation.

If fragmentation is a problem on Arduino then it should be quite straight forward to implement this kind of memory management.

That's a question of updating the WinAVR installation, does arduino's core team have a good reason for not doing this ?

But I would say that the issues I was facing could be better explained by the following comment from SPlatten.

SPlatten:
A common problem with dynamic memory allocation is fragmentation, for example when your application starts, the system has say 8K available....you then allocated 1K for your own uses, after your down you free that area of memory. Unfortunately you don't get back the 8K you start with, but two areas 7K and 1K, both freely available but you can no longer allocate 8K of memory as it isn't available as a contiguous block.

SPlatten:
So I removed use of malloc, calloc and realloc and instead implement my own system.

This would allocate a large block of memory at application start-up then divide it up as required using my own versions of malloc and free. The advantage of this was that I could get back entire blocks of memory without the fragmentation.

But what's the reason behind the 8K block not the be freed as a continuous block, is it due to the malloc's internals ? So if it's possible to write a wrapper around malloc() to better handle memory, couldn't this be incorporated into avr-libc malloc() itself ?

I'm not sure why, I always thought it was something odd about Microsofts C compiler on the PC...but I've since encountered the same problem with gcc on Linux and also with C compilers on QNX.

fuh:
But what's the reason behind the 8K block not the be freed as a continuous block, is it due to the malloc's internals ? So if it's possible to write a wrapper around malloc() to better handle memory, couldn't this be incorporated into avr-libc malloc() itself ?

Well I don't know that he is right. Certainly fragmentation can occur depending on the order of allocating/freeing. But in the given example? Looking at the source for avr-libc 1.7.1 which I seem to have downloaded at some stage, I see these comments in free():

        /*
	 * Trivial case first: if there's no freelist yet, our entry
	 * will be the only one on it.  If this is the last entry, we
	 * can reduce __brkval instead.
	 */

So if you allocate 8K and then free it, you should be back to no freelist.

Then:

	/*
	 * Now, find the position where our new entry belongs onto the
	 * freelist.  Try to aggregate the chunk with adjacent chunks
	 * if possible.
	 */

Then it tries to merge the current freed block with earlier ones.

Nick, how could this be tested under the current arduino's avr-libc ?

Here:

Andy Brown wrote a library that debugs memory allocation in detail, by walking the free list.

Great link ! I've adapted the sketch to compile under arduino my results are:

allocating
0000 1584 1588 : used/free/large
0022 1562 1566 : used/free/large
0044 1540 1544 : used/free/large
0066 1518 1522 : used/free/large
0088 1496 1500 : used/free/large
0110 1474 1478 : used/free/large
0132 1452 1456 : used/free/large
0154 1430 1434 : used/free/large
0176 1408 1412 : used/free/large
0198 1386 1390 : used/free/large
0220 1364 1368 : used/free/large
0242 1342 1346 : used/free/large
0264 1320 1324 : used/free/large
0286 1298 1302 : used/free/large
0308 1276 1280 : used/free/large
0330 1254 1258 : used/free/large
0352 1232 1236 : used/free/large
0374 1210 1214 : used/free/large
0396 1188 1192 : used/free/large
0418 1166 1170 : used/free/large
0440 1144 1148 : used/free/large
freeing
0418 1166 1148 : used/free/large
0396 1188 1148 : used/free/large
0374 1210 1148 : used/free/large
0352 1232 1148 : used/free/large
0330 1254 1148 : used/free/large
0308 1276 1148 : used/free/large
0286 1298 1148 : used/free/large
0264 1320 1148 : used/free/large
0242 1342 1148 : used/free/large
0220 1364 1148 : used/free/large
0198 1386 1148 : used/free/large
0176 1408 1148 : used/free/large
0154 1430 1148 : used/free/large
0132 1452 1148 : used/free/large
0110 1474 1148 : used/free/large
0088 1496 1148 : used/free/large
0066 1518 1148 : used/free/large
0044 1540 1148 : used/free/large
0022 1562 1148 : used/free/large
0000 1584 1148 : used/free/large
allocating
0000 1584 1148 : used/free/large
0022 1562 1148 : used/free/large
0044 1540 1148 : used/free/large
0066 1518 1148 : used/free/large
0088 1496 1148 : used/free/large
0110 1474 1148 : used/free/large
0132 1452 1148 : used/free/large
0154 1430 1148 : used/free/large
0176 1408 1148 : used/free/large
0198 1386 1148 : used/free/large
0220 1364 1148 : used/free/large
0242 1342 1148 : used/free/large
0264 1320 1148 : used/free/large
0286 1298 1148 : used/free/large
0308 1276 1148 : used/free/large
0330 1254 1148 : used/free/large
0352 1232 1148 : used/free/large
0374 1210 1148 : used/free/large
0396 1188 1148 : used/free/large
0418 1166 1148 : used/free/large
0440 1144 1148 : used/free/large
freeing
0418 1166 1148 : used/free/large
0396 1188 1148 : used/free/large
0374 1210 1148 : used/free/large
0352 1232 1148 : used/free/large
0330 1254 1148 : used/free/large
0308 1276 1148 : used/free/large
0286 1298 1148 : used/free/large
0264 1320 1148 : used/free/large
0242 1342 1148 : used/free/large
0220 1364 1148 : used/free/large
0198 1386 1148 : used/free/large
0176 1408 1148 : used/free/large
0154 1430 1148 : used/free/large
0132 1452 1148 : used/free/large
0110 1474 1148 : used/free/large
0088 1496 1148 : used/free/large
0066 1518 1148 : used/free/large
0044 1540 1148 : used/free/large
0022 1562 1148 : used/free/large
0000 1584 1148 : used/free/large

Shouldn't I see the large block at the end be the same size as at the beginning ?