New AVR malloc for 1.04/1.5.x will not work with schedulers and RTOSes

I see that a new malloc will be in future releases for AVR boards.

This malloc will not work with schedulers that allocate stack for threads. It assumes the stack pointer is near the end of SRAM:

	cp = STACK_POINTER() - __MALLOC_MARGIN__;
	if (cp <= __brkval)
	  /*
	   * Memory exhausted.
	   */
	  return 0;

Here is a task in ChibiOS with it's stack in a global memory array.

static msg_t Thread2(void *arg) {
  Serial.print("malloc returned: ");
  Serial.println((uint16_t)malloc(10));
  return 0;
}

Whith the current malloc it prints:

malloc returned: 1128

With the new malloc:

malloc returned: 0

I thought an operating system would supply its own version of malloc.

Couldn't you also adjust __malloc_heap_end? avr-libc: Memory Areas and Using malloc()

Yes, I see the problem.

The malloc.c used on Teensy, and soon to become part of Arduino 1.0.4 & 1.5.2, does not have __malloc_heap_end.

Is adding __malloc_heap_end enough to solve this problem? It's hard to know without seeing the code you're using.

Wait.. so the arduino team is providing their own fix to malloc.c rather than just using a newer version of the toolchain? I've been using avr-gcc 4.7.2 for a long time with no problems (and I get to use c++11 features)

Actually, they're using one I provided, which is more-or-less lifted from a newer version of the toolchain. However, I did make some slight modifications... one being the removal of __malloc_heap_end. Normally it's never used. Nobody has needed it for years on Teensy.... until now.

It's not a big deal to add this back in. Other changes probably aren't a big deal either. So please don't freak out. This isn't a huge problem worthy of hysteria and panic.

I've discovered some people get really weird when talking about malloc(), like it's some sort of mystical voodoo. It isn't. It's just ordinary C code. Bugs and missing features can be fixed. All that's needed is level-headed thinking and conversation, and little bit of coding work (which I'm more than happy to do).

Let's just talk about what's needed to solve the problem, ok? I'll modify the code, we can test it, and I'll make a Teensyduino release with it. When it's been tested a bit more, I'm sure the Arduino developers will accept a patch.

Why not just use a newer version of the toolchain? The reason I've heard against updating is that this one works fine... but obviously it doesn't.

WizenedEE:
Why not just use a newer version of the toolchain?

There are at least 2 very good reasons:

1: Many incompatible changes have been added in newer versions. Several important details, especially regarding PROGMEM, have changed. Programs written with the old compiler often fail to compile, because toolchain changes were made without backwards compatibility. A toolchain upgrade will cause a lot of pain, by breaking many libraries and sketches.

2: Somebody would need to go through the work to produce 4 binary packages. Before you point to availability of the newer toolchain from various sources, consider the situation with the old toolchain on Linux. For many months, perhaps even a year or more, the Arduino developers were looking for someone to merely package the old toolchain at a .tgz file. I finally did it (with some certain libraries added and scripting tricks used so it's a self-contained package that runs on all Linux distros). Nobody else even offered. It's not extremely difficult, but someone needs to do the work, yet every time this subject comes up, people who argue for a newer toolchain point to various distros and binary packages that aren't quite the same version (rather than doing the harder job of actually creating 4 identical packages that are known to be the same). It's easy to say "use a newer toolchain", but creating the 4 actual .zip/.tgz files to be used on all Arduino IDEs is actual work.

WizenedEE:
The reason I've heard against updating is that this one works fine... but obviously it doesn't.

One small bug suddenly becomes the toolchain obviously doesn't work!

Until now, or more specifically a few days ago, there were no known problems with the malloc() version I've been using with Teensy. So far, this is the first time anybody has missed the __malloc_heap_end feature (and possibly other stuff... still waiting for Bill to respond specifically on the RTOS needs before I work on the code).

This is really a very minor issue. It'll probably take less time to fix that I've spent just writing these messages here and on the PJRC forum.

A toolchain upgrade is a lot of work. Anyone who really wants a toolchain upgrade can approach the Arduino developers. Maybe they'll go for it? One way that will NOT work is just suggesting they do all the work. Actually creating the 4 packages, testing them carefully across all many version of all 4 platforms, and against every bundled library and perhaps a couple dozen of the widely used 3rd party libraries (and getting them fixed for the new version), and offering to continue supporting those 4 packages on all 4 platforms as incompatibilities arise (eg, Linux libraries change, Apple 10.9 adds new signature requirements, etc) would be much more compelling.

But that is a LOT of work. From what I've seen over the last several years, the odds of anyone stepping up to actually do that work are very slim.

However, fixing this specific bug is not very much work. I plan to do it.... assuming Bill responds with specific info, which I'm sure he will.

The requirement is mainly for malloc to work with other Arduino libraries. I have configured ChibiOS/RT so it never uses malloc/free.

ChibiOS continues the main thread with the stack in high memory. I like to be able to determine the high water mark for all stacks in embedded systems so I wrote a function to fill memory from the heap to near the stack pointer and a function to scan this area to find the high water mark. Therefore I need to know where the heap ends. Other Arduino tools analyze memory use so this feature is important.

With avr libc the expression for this location is:

  extern char *__brkval;
  extern char *__malloc_heap_start;
  char* p = __brkval ?  __brkval : __malloc_heap_start;

__brkval is zero until malloc() is called.
This sketch

void setup() {
  extern char *__brkval;
  extern char *__malloc_heap_start;
  Serial.begin(9600);
  Serial.println((uint16_t)__brkval);
  Serial.println((uint16_t)__malloc_heap_start);  
}
void loop() {}

prints the following with 1.03:

0
461

The Teensy malloc has no __malloc_heap_start so this sketch will cause an attempt to load Paul's malloc() and the libc malloc(). So you need conditional compilation for Teensy. It would be nice if the new malloc() worked like the official avr libc malloc().

For FreeRTOS, the main thread ends so I don't need a tool for stack high water mark. malloc() does need to allocate all memory up to RAMEND.

FreeRTOS allocates memory for tasks and has many choices so malloc() is not required to run FreeRTOS. I chose to use a wrapper for malloc() since Arduino libraries require malloc().

In short, I think __malloc_heap_end would be sufficient. There will be a little pain for various programs that compute memory use if you remove __malloc_heap_start and __malloc_margin which are documented in stdlib.h.

Variable Documentation
char* __malloc_heap_end

malloc() tunable.

char* __malloc_heap_start

malloc() tunable.

size_t __malloc_margin

malloc() tunable.

Edit: I don't think the functionality of __malloc_heap_start is too useful for Arduino with C++ since ctors may have already allocated memory before setup() is called.

This sketch:

String s = "ctor";
void setup() {
  extern char *__brkval;
  extern char *__malloc_heap_start;
  Serial.begin(9600);
  Serial.println((uint16_t)__brkval);
  Serial.println((uint16_t)__malloc_heap_start);  
}
void loop() {}

prints this:

481
474