ESP32 fails to execute when using free() on a pointer with malloc

Hi there!

So I was writing a network connected program for air quality measurements using Arduino and at some point I had to allocate for a string a char*. As I have learned, I tried to free() and then set it to NULL, so I do not run out of dynamic memory or leave dangling pointers (although I am not sure I have to do the latter as the pointer itself is created in a function and is not static, but that is the lesser problem here, still answers are welcome regarding that as well).

I get the following error message:

CORRUPT HEAP: Bad tail at 0x3ffd7a7e. Expected 0xbaad5678 got 0xbaad5600

assert failed: multi_heap_free multi_heap_poisoning.c:253 (head != NULL)

This is the part of my code related to it:

        StaticJsonDocument<128> telemetry_msg;
        telemetry_msg["msgCount"] = telemetry_send_count;  
        serializeJson(telemetry_msg, serialized_telemetry_msg);

        // TODO: Refactor and extract to function.
        /* Telemetry message resizing magic. */
        heap_caps_check_integrity_all(true);
        uint8_t size = 0;

        while (serialized_telemetry_msg[size] != NULL_TERMINATOR)
        {
            size++;
        }

        char* optimized_telemetry_msg = (char*)malloc(sizeof(char) * size);

        for (uint8_t i = 0; i <= size; i++)
        {
            optimized_telemetry_msg[i] = serialized_telemetry_msg[i];
        }

        LogInfo("Optimized msg: " + String(optimized_telemetry_msg) + "\n");

        /* Telemetry message resizing magic. */
        
        result = esp_mqtt_client_publish(
            mqtt_client,
            telemetry_topic,
            optimized_telemetry_msg,
            size,
            CONFIG_MQTT_CLIENT_QOS,
            CONFIG_MQTT_CLIENT_MESSAGE_RETAIN_POLICY);

        // FIXME: CORRUPUT HEAP: assert failed: multi_heap_free multi_heap_poisoning.c:253 (head != NULL)
        free(optimized_telemetry_msg);
        optimized_telemetry_msg = NULL;

When I am not using a the free() call, this CORRUPT HEAP problem doesn't show up.
I am suspecting that the esp_mqtt_client_publish might be the problem here as the argument optimized_telemetry_msg is supposed to be a const char*... But I am not sure.

I am pretty sure that the problem is not that I run out of memory as if I do not call free, the program can run for a pretty long time, actually I do not even think that I have ran into any problems related to that...

Check whether malloc was successful.

Hi wildbill!

Thanks for the tip, I really should actually to be safe...

But the malloc is always successful. The char* optimized_telemetry_message does indeed contain the copied characters. (AFAIK, if malloc failed then the the pointer should remain NULL.)

How do you know? If malloc fails, you can still copy characters to the resulting NULL pointer. It would almost certainly be overwriting other variables of course.

1 Like

Your loop index goes up to size (inclusive). The highest valid index is size-1, not size.

I would strongly urge you not to use malloc, it has no place in C++ code.
In this case, use a std::string, String or std::vector<char> if you need dynamic character arrays. Although I don't see why, in the code you posted, there's no need to make a copy of serialized_telemetry_msg at all.

Similar discussion about why you should never use malloc in C++: How to create an array/list that contains strings, and I can append to? - #6 by PieterP

1 Like

One possible issue that can occur with malloc/free is that the ESP32 has 3 memory stacks for core0 and core1. Core0 has its own memory stack, core1 has its own memory stack and both cores have a shared memory stack. malloc is not aware of the memory stack division of the ESP32.

This is not correct, malloc and free are thread-safe (as required by the standard).

From the ESP-IDF docs:

Thread Safety

Heap functions are thread safe, meaning they can be called from different tasks simultaneously without any limitations.

From cppreference:

The following functions are required to be thread-safe:

Calls to these functions that allocate or deallocate a particular unit of storage occur in a single total order, and each such deallocation call happens-before the next allocation (if any) in this order.

(Although I'm quoting the malloc documentation, I would like to stress once more that it should not be used in C++.)

My source is the ESP32 documentation. API Reference - ESP32 - — ESP-IDF Programming Guide latest documentation

I'm not referring to thread safety of malloc(). I'm referring to actual hardware memory location. malloc() is only aware of a large pool of memory not that the memory is divided into 3 separate stacks. Well there are other stack as well like the RTC and ULP stacks but I'm not referring to those.

Exactly, that's the one I quoted in my previous post.

Which is of no concern to the user, the ESP-IDF's malloc implementation takes care of that internally. The different stacks are not relevant here.

Never, never, EVER call malloc() without CHECKING the return value! This is precisely what make malloc so dangerous! ALWAYS check if a null pointer is returned, and, if so, DO NOT attempt to use the (non-existant) new memory block, or you WILL be scribbling all over memory you do not own.

Proper use is:

    SomeType *p = null;
    ...
    p = (SomeType *)malloc(size);
    if (p)
    {
        // Do whatever using p here...
    }
    ...
    if (p)
    {
        free(p);
        p = null;
    }

Completely irrelevent. malloc() allocates space on the heap, NOT the stack. It neither knows nor cares about the stacks, and will use whatever stack is in use when it is called.

My bad I meant heap instead of stack.

I'll stick with using the ESP32's malloc macros.

The different heaps are not an issue either, malloc handles them transparently.

Sorry if I was unclear, I get the message in the IoT Hub telemetry, thus that is why I think that at least it does not fail, but of course I will update my code to check it. :smile:

Hi!
Thanks for the answer.
Some questions regarding the things you wrote:

  1. How is it that somehow if I use i<size in the loop when I log out the char* via Serial using the String() constructor I get {"msgCount":2}xV���, but the telemetry message is {\"msgCount\":2}, when not calling free()? (Using Azure IoT Hub telemetry.) If I call free() like this the telemetry message is {"msgCount":2}xV���?. In both cases indexing like this, the telemetry always arrives 2 times...

  2. As my 1. point stands, it seems like this indexing thing does solve my the HEAP CORRUPTION but introduces this extremely weird behavior. So why is it that if I index up to i <= size then the logging prints the correct message and the telemetry arrives correctly? If I understand correctly, than the problem is that I try to free up a memory location relatively indexed by i+size thus over-indexing by 1 char, meaning that when I try to free it up, the program tries to free memory not owned by optimized_telemetry_msg.

So kind of like tit for tat, changing the indexing solves the crash but introduces behaviour which is extremely weird.

Also, I have to pass char* and const char* to the functions provided by the JSON and mqtt libs so using vectors is kind of an extra here, no? I am not sure tho.

Because in that case you don't copy the null terminator, so the String constructor doesn't know where the string ends.

Because you're explicitly passing the size to the mqtt send function, so it does know where the data ends.

To fix this, you should use the String::String(const char *cstr, unsigned int length) constructor, not the String::String(const char *cstr) constructor:

LogInfo("Optimized msg: " + String(optimized_telemetry_msg, size) + "\n");

Not sure why that would be.

No, the problem is that you're going out of bounds on the memory returned by malloc.
This is Undefined Behavior, so it could do anything.
In this case, it seems that the ESP-IDF's malloc implementation places some critical housekeeping data at the end of the malloc'ed blocks, and if you overwrite that, malloc's internal datastructure gets corrupted, crashing your program.

Why is that an issue?

Why? Using a vector, you could replace most of your code by a single line, avoiding all of the issues with manual malloc/free calls and raw loops, making it much less error prone and easier to understand.

But like I said, you don't need to create a copy. Simply use the serialized_telemetry_msg directly:

StaticJsonDocument<128> telemetry_msg;
telemetry_msg["msgCount"] = telemetry_send_count;
String serialized_telemetry_msg;
serializeJson(telemetry_msg, serialized_telemetry_msg);
LogInfo("Serialized msg: " + serialized_telemetry_msg + "\n");
result = esp_mqtt_client_publish(
    mqtt_client,
    telemetry_topic,
    serialized_telemetry_msg.c_str(),
    serialized_telemetry_msg.length(),
    CONFIG_MQTT_CLIENT_QOS,
    CONFIG_MQTT_CLIENT_MESSAGE_RETAIN_POLICY);
1 Like

While the c++ spec considers the behavior undefined, it is, in a more practical sense, known. When you write to a location beyond the bounds of a dynamically allocated memory block, you scribble on memory you do not own. This can very easily not only corrupt memory belonging to some other variable(s), but can also VERY easily corrupt the linked list that is used to allocate and track ALL heap memory. THAT is the MOST likely cause of the CORRUPT HEAP message. Once that linked list is corrupted, there is NO recovery, other than a reset. Exceeding the bounds of a dynamically allocated block BY A SINGLE BYTE can be enough to cause this.

This makes it all clear now! Thanks.

Also, the last code you wrote, very neat and clean.

Ah, one more question. Is using the String class worse in performance than using plain char*?

Definitely not, after a couple of passes of the optimizer, all bets are off.

That's the whole point of undefined behavior: the standard defines the assumptions that an optimizer can make without altering the meaning of the code. If you invoke undefined behavior, those assumptions are violated, and the optimizer may (and often will) generate meaningless output.

Possibly. Probably not significantly enough to really care, especially compared to everything else you have going on in your code.

But if you do care, or if you have huge JSON objects, you can always measure. And compare

I just used a String for convenience and because you were already using one for logging.
If you wanted to use a fixed-size or pre-allocated char array, you could.

I just used a String for convenience and because you were already using one for logging.

Yes, but I do conditional compilation, that is actually macro I use, so it only gets translated into locations where I explicitly define that I am in debugging mode.

Fortunately I do not really have huge JSON objects, they comfortably fit into around a char[128] or so. I only asked bc I am running off battery and during wakeup periods I want to minimalize actual clock cycles where possible without going into assembly.

Still, if it is that insignificant, I stay with the more safe String.