Using for or while loops for embedded code

Correct, although it doesn't need to be a full-blown OS, just a queue of coroutine handles that are waiting to be resumed. I don't think you need preemption.
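
To make the "queue of things waiting to be resumed" idea concrete, here is a minimal sketch of a cooperative run queue without preemption. It assumes a coroutine handle can be modelled as a step function plus a context pointer; Resumable, RunQueue and countdown_step are made-up names for illustration, not part of any library.

```cpp
#include <stddef.h>

// Hypothetical stand-in for a coroutine handle: a step function plus its state.
// step() returns true while the task still has work left.
struct Resumable {
    bool (*step)(void *ctx);
    void *ctx;
};

// Fixed-capacity FIFO of tasks waiting to be resumed (no dynamic allocation).
template <size_t Capacity>
class RunQueue {
  public:
    bool push(Resumable task) {
        if (count == Capacity)
            return false; // queue is full
        items[(head + count) % Capacity] = task;
        ++count;
        return true;
    }

    // Resume the task at the front and re-queue it if it is not finished yet.
    // Returns false when there was nothing to run.
    bool run_one() {
        if (count == 0)
            return false;
        Resumable task = items[head];
        head = (head + 1) % Capacity;
        --count;
        if (task.step(task.ctx))
            push(task);
        return true;
    }

    size_t size() const { return count; }

  private:
    Resumable items[Capacity] = {};
    size_t head = 0;
    size_t count = 0;
};

// Example task: counts down by one per turn and reports whether it needs another turn.
bool countdown_step(void *ctx) {
    int &remaining = *static_cast<int *>(ctx);
    if (remaining > 0)
        --remaining;
    return remaining > 0;
}
```

In a sketch like this, loop() would just call run_one() over and over. Each task decides for itself when to give up the CPU, which is why no preemption is needed.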

Here's an interesting Stack Overflow answer about coroutines on real-time embedded systems: c++ - Are stackless C++20 coroutines a problem? - Stack Overflow

The macros work inside any function: CO_BEGIN registers a local variable that holds the state of that function, and co_delay(prd) exits the function at that specific point. When the function is called again, it resumes from that point. When the function is finished, it should reach CO_END, which closes the entire coroutine.
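
For readers who haven't seen this trick, here is a rough sketch of how such macros are commonly built (protothreads-style, using switch and __LINE__). The actual CO_BEGIN / co_delay / CO_END definitions from the post aren't shown in this thread, so everything below is an assumption; CO_YIELD, BlinkTask and blink_step are hypothetical names, and CO_YIELD simply yields without any delay period.

```cpp
// Hypothetical macro set; NOT the definitions used in the original post.
#define CO_BEGIN(state) switch (state) { case 0:
#define CO_YIELD(state) \
    do { state = __LINE__; return false; case __LINE__:; } while (0)
#define CO_END(state) } state = 0; return true

// Local variables are lost on every return, so anything that must survive a
// yield has to live outside the function body.
struct BlinkTask {
    int state; // resume point, 0 means "start from the top"
    int count; // data that must persist between calls
};

// Returns false while the task has yielded, true once it has run to the end.
bool blink_step(BlinkTask &t) {
    CO_BEGIN(t.state);
    t.count += 1;      // first part
    CO_YIELD(t.state); // leave here, resume here on the next call
    t.count += 10;     // second part
    CO_YIELD(t.state);
    t.count += 100;    // third part
    CO_END(t.state);
}
```

Each CO_YIELD must sit on its own source line, because the line number doubles as the case label. Also note that a switch statement inside the task body would clash with the hidden one.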

Thanks for the link, seems interesting.

Is there a better way to dereference pointers to functions together with a pointer to their arguments?

Here I'm switching between the cases based on the number of arguments, and of course dereferencing the same function pointer each time.

switch((tsk+thrd->tsk_cntr)->args_cnts){				// run task by args_cnts
	case 0:	// dereferencing 0 args
		((void(*)())(tsk+thrd->tsk_cntr)->tsk_fptr)();
	break;

	case 1: // dereferencing 1 args
		((void(*)(uint8_t*))(tsk+thrd->tsk_cntr)->tsk_fptr)(
		(uint8_t*)(tsk+thrd->tsk_cntr)->tsk_args+0);
	break;

	case 2: // dereferencing 2 args
		((void(*)(uint8_t*,uint8_t*))(tsk+thrd->tsk_cntr)->tsk_fptr)(
		(uint8_t*)(tsk+thrd->tsk_cntr)->tsk_args+0,
		(uint8_t*)(tsk+thrd->tsk_cntr)->tsk_args+1);
	break;
	
	case 3: // dereferencing 3 args
		((void(*)(uint8_t*,uint8_t*,uint8_t*))
		(tsk+thrd->tsk_cntr)->tsk_fptr)(
		(uint8_t*)(tsk+thrd->tsk_cntr)->tsk_args+0,
		(uint8_t*)(tsk+thrd->tsk_cntr)->tsk_args+1,
		(uint8_t*)(tsk+thrd->tsk_cntr)->tsk_args+2);
	break;

	case 4: // dereferencing 4 args
		((void(*)(uint8_t*,uint8_t*,uint8_t*,uint8_t*))
		(tsk+thrd->tsk_cntr)->tsk_fptr)(
		(uint8_t*)(tsk+thrd->tsk_cntr)->tsk_args+0,
		(uint8_t*)(tsk+thrd->tsk_cntr)->tsk_args+1,
		(uint8_t*)(tsk+thrd->tsk_cntr)->tsk_args+2,
		(uint8_t*)(tsk+thrd->tsk_cntr)->tsk_args+3);
	break;
}

I don't think there's a clean way to do this. The only way that allows any number (or type) of arguments would be using variable argument lists. You could probably store them behind a void pointer, e.g. Best Way to Store a va_list for Later Use in C/C++ - Stack Overflow, and then write a function to invoke it. This will hopefully be simpler than a switch with a case for each number/type of argument like in your current code.
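
As a rough sketch of the "arguments behind a void pointer" idea: one common C-style pattern is to give every task the same signature, taking a single void * that points to a task-specific argument struct. This is not the original poster's code; Task, AddArgs, ToggleArgs and run_tasks are hypothetical names, and the cast inside each task is only safe if the right struct was stored.

```cpp
#include <stddef.h>
#include <stdint.h>

// Every task has the same signature, so the scheduler needs no switch.
struct Task {
    void (*fn)(void *args); // function to call
    void *args;             // points to a struct only that function understands
};

// Argument bundles for two unrelated tasks (illustrative only).
struct AddArgs {
    uint8_t a;
    uint8_t b;
    uint16_t sum;
};

struct ToggleArgs {
    bool level;
};

void add_task(void *p) {
    AddArgs *args = static_cast<AddArgs *>(p); // caller must pass an AddArgs
    args->sum = static_cast<uint16_t>(args->a + args->b);
}

void toggle_task(void *p) {
    ToggleArgs *args = static_cast<ToggleArgs *>(p);
    args->level = !args->level;
}

// Runs each task once, whatever number of arguments it logically has.
void run_tasks(const Task *tasks, size_t count) {
    for (size_t i = 0; i < count; ++i)
        tasks[i].fn(tasks[i].args);
}
```

The downside is that the compiler can't check that the struct matches the function, which is the kind of mistake the C++ approach further down avoids.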

I'm afraid C lacks the generics you need to pull this off, so I'd highly recommend doing this in C++, where you can implement this quite nicely.
At the risk of scaring you off (you need some less simple C++ features), here's a basic example.

Before we start, some terminology, I'll keep it brief but don't hesitate to ask for clarification:

  • Function template: a function that can accept different types of arguments; we only need it for the next point:
  • Variadic function: a function with any number of arguments, indicated using ellipsis (...)
  • Closure: a function that captures and saves some variables from the scope around it (this is impossible in C)
  • Lambda function: fancy name for a simple thing, it's just an inline function definition, useful because lambdas can be closures. The syntax for a lambda function is [](arguments) { function body }.
    For example:
// This is a normal function that adds two integers:
int add_normal(int a, int b) {
  return a + b;
}

// This is a lambda function that adds two integers:
auto add_lambda = [](int a, int b) {
  return a + b;
};
// (auto is a data type, like int, but asks the compiler to deduce the 
// type because you don't want to write it out, or can't write it out,
// because a lambda is an anonymous function)

// This is a lambda function that captures the variable x by value,
// as indicated by the `[=]`, so it is a closure:
int x = 5;
auto add_x = [=](int a) {
  return a + x;
};
// The function saves and remembers x for later, when you call it.
// Note that this is not possible using normal functions
// (unless x is a global variable).
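
The list above has no example for the variadic function template, so here is a small sketch. call_with and add3 are names made up for this illustration.

```cpp
// Variadic function template: accepts a callable plus any number of
// arguments of any type, and passes them on in a single call.
template <class Fun, class... Args>
auto call_with(Fun function, Args... args) -> decltype(function(args...)) {
    return function(args...); // `args...` expands to arg1, arg2, arg3, ...
}

// An ordinary function with three parameters to try it on.
int add3(int a, int b, int c) {
    return a + b + c;
}
```

call_with(add3, 1, 2, 3) calls add3(1, 2, 3), and the same template also accepts lambdas and closures with a different number of arguments.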

Here's the implementation I came up with:

// Class that saves a function and its arguments, and allows calling the function with those
// saved arguments.
class FunctionWithArgs {
  public:
    // This constructor takes a function to call, and a variadic list of arguments
    template <class Fun, class... Args>
    FunctionWithArgs(Fun function, Args... args) { // Passing everything by value for simplicity
        // The following lambda function captures the function and all its arguments by value and
        // saves them. Calling the lambda calls the function with the saved arguments.
        auto closure_fun = [=]() { function(args...); };
        // Give the type of the closure a name for readability.
        using closure_t = decltype(closure_fun);
        // Dynamically allocate such a closure, type erase it, and store a pointer to it
        context = new closure_t{closure_fun};
        // This is a function that undoes the type erasure of the closure and then calls it
        vtable.caller = [](const void *ctx) {
            auto &closure = *reinterpret_cast<const closure_t *>(ctx);
            closure(); // call the function with its saved arguments
        };
        // This is a function that undoes the type erasure and then deallocates the closure
        vtable.deleter = [](void *ctx) {
            auto *closure = reinterpret_cast<closure_t *>(ctx);
            delete closure; // deallocate the function and its arguments
        };
    }

    // Disallow copying and moving for simplicity (or laziness?)
    FunctionWithArgs(const FunctionWithArgs &other) = delete;
    FunctionWithArgs(FunctionWithArgs &&other) = delete;
    FunctionWithArgs &operator=(const FunctionWithArgs &other) = delete;
    FunctionWithArgs &operator=(FunctionWithArgs &&other) = delete;

    // Destructor should deallocate the closure
    ~FunctionWithArgs() {
        if (context)
            vtable.deleter(context);
    }

    // Call operator should call the wrapped function with the saved arguments
    void operator()() const {
        if (context)
            vtable.caller(context);
    }

  private:
    struct { // These are just ordinary function pointers (like in C):
        void (*caller)(const void *) = nullptr;
        void (*deleter)(void *) = nullptr;
    } vtable;
    void *context = nullptr;
};

The constructor contains quite a bit of complexity, but it's all nicely abstracted away from the user and the rest of the code.

You would use it as follows:

void noisy_task(const char *task_name) {
    Serial.print(task_name);
    Serial.println(" is running ..."); // placeholder for actual tasks
}

void function_with_many_arguments(int i, float f, const char *s) {
    Serial.print("i: "); Serial.print(i); Serial.print(", f: ");
    Serial.print(f); Serial.print(", s: "); Serial.println(s);
}

void setup() {
    Serial.begin(115200);

    // Create an array of tasks, i.e. functions with their arguments that
    // have to be called later:
    FunctionWithArgs tasks[] {
        {noisy_task, "task A"},
        {noisy_task, "task B"},
        {noisy_task, "task C"},
        {function_with_many_arguments, 42, 3.14, "abc"},
    };

    // Execute all tasks in the array:
    for (auto &task : tasks) {
        task(); // execute the task (without any void * ugliness)
    }
}

void loop() {}

Output:

task A is running ...
task B is running ...
task C is running ...
i: 42, f: 3.14, s: abc

If anything is unclear, or if you don't understand the syntax, just let me know!
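
For comparison, on a desktop compiler the standard library's std::function does the same kind of type erasure as the hand-written class. This is only a sketch for experimenting on a PC: I'm assuming <functional> is available, which may not be the case on AVR-based Arduino boards. append_name, append_sum and make_tasks are made-up names.

```cpp
#include <functional>
#include <string>
#include <vector>

// Two functions with different argument lists (illustrative only).
void append_name(std::string &log, const char *name) {
    log += name;
    log += ';';
}

void append_sum(std::string &log, int a, int b) {
    log += std::to_string(a + b);
    log += ';';
}

// Each lambda captures its arguments; std::function<void()> erases the
// closure type so that all tasks fit in one container.
// The caller must keep `log` alive for as long as the tasks exist.
std::vector<std::function<void()>> make_tasks(std::string &log) {
    std::vector<std::function<void()>> tasks;
    tasks.push_back([&log] { append_name(log, "A"); });
    tasks.push_back([&log] { append_name(log, "B"); });
    tasks.push_back([&log] { append_sum(log, 40, 2); });
    return tasks;
}
```

Calling every element of the returned vector runs the three tasks in order, just like the loop over the tasks array in setup().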

You'll need a C++20-capable compiler like GCC 11. Unfortunately, Arduino is stuck on C++11, so you would have to replace the compiler yourself to get it to work.

For now, I would start experimenting on your computer, not on the Arduino. That'll be easier to learn, debug and play with the new features.

Either way, I think going the standard C++ coroutines route is a much better idea than trying to implement it manually, which involves ugly macros and will always have rough edges, because C and C++11 simply do not allow you to store local variables for later resumption.
C++20 automagically does that for you, which is a huge help.
Note that the language itself just provides the bare necessities, the actual user-friendly API is provided by libraries such as GitHub - lewissbaker/cppcoro: A library of C++ coroutine abstractions for the coroutines TS.

Coroutines are a very old idea, but a relatively new feature in C++, I haven't found beginner-level tutorials, but there are many blog posts that are quite good, although pretty advanced and technical.

What features are you looking for exactly? I might be able to recommend a different technique.

This whole endeavor seems like an enormous amount of unnecessary work and wheel-reinventing. Why not just get an ESP32 that comes with FreeRTOS - a real multi-tasking OS? Then you can start working not only on multi-task but multi-core applications.

I have several ESP32 boards that I haven't started working on yet, and I'm really excited to do some projects with them.

I tried to work with FreeRTOS but I didn't like it; I want to do something much simpler myself.
I want to improve my understanding of concepts like thread queues, context switching, etc. I want to develop these programming facilities myself so I have a deep understanding of the code I'm working with.

I'm a teacher and I have to understand many things about developing code. I've been in an embarrassing situation where I stood up knowing almost nothing, so after that I decided to understand whatever I can learn.

Thank you so much for this answer! I really have to study it thoroughly.

Then I'll connect the ideas. I also want to do this project in both C and C++. It should be good practice for me to get a grasp of some important programming skills which I really need right now. Then I might write some module drivers. After that I want to post an example project with the required libraries and get members' feedback.

But it's not that easy, I usually work relatively slowly and need a lot of time to understand something :slight_smile:

why?

right now?

it sounds like you're eager to understand and find a use for a multi-tasking application rather than focusing on good programming techniques

when you're the sole developer of an application, you might have several function calls within some loop, where each function completes execution within a reasonable amount of time.

for processors such as an arduino, the expense in lost performance from an OS is unnecessary

the last few projects in optics and software radio used vxworks and embedded linux and required a different mindset that was more message based.

i'm trying to imagine the use case for a "coroutine" that has one or more "yield" points within it rather than just completing all processing within a "reasonable" amount of time and returning to the calling function (e.g. loop())

there may be good reasons or particular applications to use "coroutines", but they don't seem very common, especially on arduino.

Stroustrup warned developers not to use C++ features unnecessarily. in my experiences, when a need pops up, you figure out how to use the feature and get back to developing code. the program is not driven by that feature.

reminds me of the joke from college where someone brags about writing a 15k line program. the retort is they couldn't do it in 5k lines.

I'd argue the opposite. Short tasks that happen in sequence are really common on microcontrollers. Consider:

task<void> my_coroutine() {
  // (hypothetical syntax, Serial doesn't currently support this) 
  String msg = co_await Serial.readString();
  if (msg == "read adc") {
    startADCConversion();
    int adc_value = co_await readADCResult();
    // do some calculations
    do {
      sendResponse(adc_value);
    } while (!co_await responseAcknowledged());
  }
}

You could of course come up with some C code with the same behavior, but this would require you to implement some kind of state machine yourself, having to store local data explicitly, manually causing state transitions, and the code for one conceptual task being scattered in different functions or parts of the state machine.
Coroutines make this much more concise and easier to read, you can just follow the flow of the program.

Another advantage is that this code doesn't block: while waiting for serial data, or while the ADC is doing the conversion, the CPU carries on with other tasks. Implementing this in plain C would require some interesting acrobatics.
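
For comparison, this is roughly what the hand-written state machine version of the coroutine above might look like. It's a simplified sketch: the Events struct is a hypothetical stand-in for Serial and the ADC, the names are made up, and the "message is not read adc" case is left out.

```cpp
enum class State { WaitForMessage, WaitForAdc, WaitForAck, Done };

// Everything that was a local variable in the coroutine now has to be
// stored explicitly, together with the current position in the task.
struct AdcRequestTask {
    State state = State::WaitForMessage;
    int adc_value = 0;
    int responses_sent = 0;
};

// Hypothetical snapshot of what happened since the last poll.
struct Events {
    bool got_read_adc_message; // "read adc" arrived over serial
    bool adc_done;             // conversion finished
    int adc_result;            // only valid if adc_done
    bool reply_received;       // the other side answered our response
    bool reply_is_ack;         // only valid if reply_received
};

// Called repeatedly from the main loop; never blocks.
void poll(AdcRequestTask &task, const Events &ev) {
    switch (task.state) {
        case State::WaitForMessage:
            if (ev.got_read_adc_message)
                task.state = State::WaitForAdc; // startADCConversion() goes here
            break;
        case State::WaitForAdc:
            if (ev.adc_done) {
                task.adc_value = ev.adc_result;
                ++task.responses_sent;          // sendResponse(adc_value)
                task.state = State::WaitForAck;
            }
            break;
        case State::WaitForAck:
            if (ev.reply_received) {
                if (ev.reply_is_ack)
                    task.state = State::Done;
                else
                    ++task.responses_sent;      // resend and keep waiting
            }
            break;
        case State::Done:
            break;
    }
}
```

It behaves the same, but the do/while loop and the order of the steps are no longer visible at a glance; you have to reconstruct them from the state transitions.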

While I agree with this quote, I think you have to be careful not to use it as an excuse not to try anything beyond ordinary imperative programming like C.
While some features are easy to overuse (e.g. people learning OOP using inheritance for every relationship), the majority of features in modern programming languages have very good use cases, so dismissing a certain feature you're unfamiliar with, just to keep the number of used features down is a bad idea as well.

The advantage of languages like C++ is that they are well-suited for designing libraries that abstract away the complicated features. For example, if I just handed you the FunctionWithArgs class above as a library, you wouldn't have to worry at all that there's a lambda function being used in the constructor, it doesn't add any complexity to the usage of the class, quite the contrary, storing a FunctionWithArgs results in much simpler code than passing around function pointers and void pointers to unknown arguments.

When it comes to the “simplicity” of code, I think a large part of it comes down to the complexity and readability of the code. Certain features can significantly reduce the complexity, and should be preferred. The added complexity by using a new feature can be outweighed by the fact that the code becomes much less convoluted.

As an example: should every function be a function template? Obviously not, but if you need a dynamic array container that can hold any type of elements, then templates are definitely the right choice, and using a void * to an array of unknown type would be a bad idea. It doesn't matter that the “feature count” goes up by one if it's the right feature to use. And sometimes your opinion on which is the “right feature” changes over time, as you become more familiar with more features, but just dismissing any new or unfamiliar features on the basis of “not using features unnecessarily” is not a good idea in my opinion.

I want to test things out using if statements, because an if is conditional and doesn't block.

I've done some projects for the college, because I'm a trainer. These days I mostly teach the fundamentals of microcontrollers with PIC chips.

But I also supervised a projects course, and I hadn't had much experience supervising groups of students in their final graduation projects. I got stuck multiple times at some programming stages, and the students don't have much of an idea about programming, so I had to program everything myself.

So when testing projects, I notice that the code isn't very responsive and there are some delays, because a single function takes up all of the time.

For example, if I want to upload an image to a TFT display while there are important sensors whose data has to be collected in the middle of that upload, the system isn't very reliable.

Even with a much faster board like the Maple Mini or ESP32, which should upload the image to the TFT faster, the problem remains: if the image takes 500 ms to upload and important sensors are active, I'd prefer not to be blocked for the whole 500 ms, and rather upload the pixels in between processing other functions. That's my whole point with coroutines and multitasking.

So using a while or for loop to upload an image to a TFT makes the code less responsive and therefore not very reliable.
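
One simple way to avoid the blocking loop, even without coroutines, is to upload a small chunk of pixels per call and remember the position between calls. This is a sketch under assumptions: ImageUpload, upload_step and count_pixel are made-up names, and the pixel sink stands in for whatever function actually writes a pixel to the TFT.

```cpp
#include <stddef.h>
#include <stdint.h>

// Progress of one image upload; must persist between calls.
struct ImageUpload {
    const uint16_t *pixels; // image data
    size_t count;           // total number of pixels
    size_t next;            // index of the next pixel to send
};

// Hypothetical pixel output function, e.g. a wrapper around the TFT driver.
using PixelSink = void (*)(uint16_t color);

// Sends at most `chunk` pixels and returns immediately.
// Returns true while there are pixels left to send.
bool upload_step(ImageUpload &job, size_t chunk, PixelSink sink) {
    size_t end = job.next + chunk;
    if (end > job.count)
        end = job.count;
    for (; job.next < end; ++job.next)
        sink(job.pixels[job.next]);
    return job.next < job.count;
}

// Stand-in sink for trying this out on a PC: it just counts the pixels.
static size_t pixels_sent = 0;
void count_pixel(uint16_t) { ++pixels_sent; }
```

In loop() you would call upload_step once per iteration and read the sensors in between. The chunk size sets the trade-off between upload speed and how long the sensors have to wait.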

thanks both of you for trying to explain.

ok. i see that it tries to make better use of the time needed to perform an ADC capture. when real-time performance is critical, we've just triggered a capture after the read so that it's ready the next cycle. (real-time systems make this possible)

i gotta wonder if the capture doesn't take much more time than the multiple context switches.

it's not a question of quantity, but of necessity (or unnecessary)

i think C++'s growing number of features is better suited for large applications (e.g. CAD, GUIs). i think it's difficult to explain/understand its proper use in real-time embedded applications, especially tiny ones such as arduino.

sounds like my college professor who taught microprocessors ('81) because he knew nothing about them.

i'm of the opinion that it's good to first let a novice try writing a program themselves. Someone told me seeing a program is like seeing into someone's mind. I think showing them a better way to do things afterwards is a better learning experience

in Elements of Programming Style the authors critique textbook(!) examples of programs, describing the flaws, how to correct them and how to improve the programs. at the very least it shows how to do things in different ways and the pro/cons. C++ Programming Style is similar

there are interrupts (and non-interruptible critical sections)

maybe not sufficiently responsive, but certainly reliable. just doesn't meet requirements. (proper perspective helps)

How do you judge necessity?
To give an extreme example, you don't need functions or variables, you can just write one large block of code with jumps and use all of memory as a big scratch space of bytes.
While unnecessary, these features greatly improve code maintainability.

Similarly, you don't need template functions, you can often achieve similar results using “generic” functions that take their arguments by void * or unions and a manual type tag. In other cases, you might create multiple ordinary functions, e.g. abs, fabs, fabsf, fabsl, etc. where the user has to call the right one depending on the argument type.
However, templates result in more general, more efficient, and less error prone code.
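
As a small illustration of the abs / fabs / fabsf point: one template can cover all of these types. absolute is a made-up name to avoid clashing with the standard functions.

```cpp
// One definition instead of abs, fabs, fabsf, fabsl, ...
// The compiler generates a version for each type it is called with.
template <class T>
constexpr T absolute(T x) {
    return x < T(0) ? -x : x;
}

// Checked by the compiler, no run-time cost:
static_assert(absolute(-3) == 3, "works for int");
static_assert(absolute(-2.5) == 2.5, "works for double");
```

The caller no longer has to pick the right function name for the argument type, and passing a float can't silently go through a double version.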

Please correct me if I'm constructing a strawman here, but going on the previous contexts where you brought up the same Stroustrup quote, I think you might see casting back and forth to void * as a base feature that's given from the beginning, while templates are an “extra” feature that you have to justify adding to a project.
I can understand where you're coming from, given your decades of C experience, but I'd argue that using void * over templates would require justification as well, and I believe that it's harder to justify in most scenarios, for the reasons I mentioned above.

If you're used to working with a hammer for long enough, everything starts to look like a nail, although some problems are best solved with a screwdriver.

It depends, I'd argue that features like deterministic destruction, compile-time code execution and templates are a godsend to optimize embedded code. You can avoid dynamic allocations by making array sizes a template argument, you can avoid run-time polymorphism using templates, you can easily build lookup tables at compile time, you can check pin numbers and capabilities at compile time, you can prevent memory or other resource leaks, etc.
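
Two of these ideas in a short sketch, assuming a C++14 or newer compiler: a lookup table built at compile time, and a pin capability check that fails the build instead of misbehaving at run time. The pin numbers are invented for a hypothetical board, and SquareTable, make_squares, is_pwm_pin and PwmOutput are illustrative names.

```cpp
#include <stdint.h>

// Lookup table computed by the compiler; nothing is calculated at run time.
struct SquareTable {
    uint16_t values[16];
};

constexpr SquareTable make_squares() {
    SquareTable table{};
    for (int i = 0; i < 16; ++i)
        table.values[i] = static_cast<uint16_t>(i * i);
    return table;
}

constexpr SquareTable squares = make_squares();
static_assert(squares.values[5] == 25, "table is built at compile time");

// Hypothetical board where only pins 3, 5 and 6 support PWM.
constexpr bool is_pwm_pin(int pin) {
    return pin == 3 || pin == 5 || pin == 6;
}

// The pin is a template argument, so a wrong pin is a compile error.
template <int Pin>
struct PwmOutput {
    static_assert(is_pwm_pin(Pin), "this pin does not support PWM");
    static constexpr int pin = Pin;
};
```

PwmOutput<5> compiles, while PwmOutput<4> stops the build with the message from the static_assert.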

Many features that make C++ suited for maintaining large projects and libraries also help for embedded applications.

I agree that too much abstraction makes code hard to reason about, especially in embedded contexts where you sometimes have to go low-level, but just enough abstraction prevents bugs, makes code more readable and more maintainable.

The main problem I have with C is that it mixes high-level code with low-level code: when I'm reasoning about the logic of my web server, I don't want to have to worry about what pointers should be free'd in each code path, I just want to focus on what web page is served to the user.

Many abstractions in C++ help with this separation of high-level and low-level code. Now that we're citing coding advice, one of the practices taught by Robert C. Martin is to have one level of abstraction per function. I don't believe this is possible in C: you often have one line that deals with the high-level logic of an application, and then the next line has to deal with making sure the data structure from the previous line is deallocated correctly. This adds unnecessary cognitive load.

This is not possible if you have to switch the ADC channel before reading, e.g. if the message contains the pin to read from.

Suspending coroutines is rather cheap, and ADC conversions often take tens or even hundreds of microseconds. A modern ARM microcontroller executes tens of thousands of instructions in that time.
You don't need a full context switch, no OS, no syscalls, usually no synchronization, etc. coroutines are really light-weight.

Don't get too caught up in the details, it was just an example, the point is that coroutines are faster than launching threads and make asynchronous code much easier to read and write.
They are used extensively in modern languages such as Python and JavaScript, where you don't want to block your entire web server while waiting for a file to be read from disk.

There are similar examples in many microcontroller applications, e.g. waiting for an SPI transfer to a TFT display. You can of course do most of this using a custom state machine and your own interrupt handlers, but coroutines can make this pretty painless.

Coroutines have been in use since the late 1950's, in much more constrained environments than the microcontrollers of today. I'm not saying that all embedded programs should be converted to use coroutines, but they definitely have their use cases, such as replacing state machines and simplifying asynchronous tasks.

again, i appreciate your thoughts (they help)

in general, why use a feature that solves a more specific problem when a conventional approach can be used (i.e. KISS, but no simpler -- that guy)

one consideration is using features familiar to your reviewers. i understand expanding the envelope, but now the reviewer needs to come up to speed

perhaps another example is creating a class for which there is just a single instance. we often just created a file containing a few interface functions with several statically defined variables and helper functions which encapsulated the design compactly.

when i started at bell labs in 1985, there was lots of discussion about c++. i went to a talk, where within the first 3 sentences the speaker said you can do OOD/OOP in just about any language, including assembler, not Basic or Fortran.

i know i'm biased, because i've never seen a well written piece of C++ code, but lots of poorly written c++ code (see Cargill). and i am convinced there is a right way to use c++ in higher performance embedded applications

i agree that C++ is well suited for dynamic applications such as web servers. i believe it is the right tool for that job. Kernighan said C doesn't have the "guard rails" needed for much larger applications provided by C++.

for many years, i worked on relatively small but real-time applications. i had studied operating systems (unix, xinu, minix) but never had a chance to work with one until i worked on an optical maintenance communication feature. it required debugging the interface between a vxworks application and a network processor that required judiciously adding semaphores.

i mention this because a good understanding of OS features and real-time firmware techniques does not depend on the language.

i don't believe c++ was well suited when i worked on ethernet drivers under vxworks or android, nor for the low level radio hardware i developed along with the real-time coordination with external DSPs.

c++ is better suited and was used by the developers of the higher level 4G LTE stack. i'm sure many of the classes used in the stack were reused considerably to support the many connections, similar to a web server.


but this is an arduino forum.

i think the trick is educating people who have aspirations for working on autonomous vehicles to learn good programming skills using OOD that support the use of c++ in more sophisticated embedded applications

with respect to the OPs goals, i think it's hard to demonstrate the use of techniques well suited for more sophisticated applications such as web servers on an arduino (maybe a network of arduinos)

I think this is mainly a difference in background. I would expect reviewers of C++ code to be familiar with templates, for example, but I wouldn't expect that of someone who's working in C.
C is more of an exception in that regard, almost all modern languages (Java, Rust, Go, ...) support template-like generics, and they are used quite commonly. For example, you cannot create a List in Java without generics, so they are taught early on, and don't cause issues among beginners, at least that was my experience back in my first Java class. Generic types and functions were introduced as a natural extension to normal types/functions.

Coroutines are rather new, so I wouldn't expect many people to be well-versed in them. On the other hand, if they are suitable for a certain application, I would expect reviewers to make an effort to learn how to use them (or at least learn about them enough so we can have an informed discussion about whether they are applicable for a certain job).

This forum is a completely different story, of course, I do realize that many posters on here don't have the necessary background to understand things like that fully, many even struggle with simple functions.
That being said, I think exposure to new concepts is a good thing, especially if said concepts solve the problem more elegantly.

While that's of course true, certain languages are more suitable than others. Having the compiler generate the vtables etc. for you removes a lot of the boilerplate. You can of course do the same in C, but you lose the convenience and the correctness you get when the compiler generates it for you. It often comes down to choosing the right tool for the job.
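
To illustrate the boilerplate difference, here is the same tiny "sensor" interface written twice: once with a hand-made table of function pointers, once with virtual functions. All names are invented for this sketch.

```cpp
// --- C style: the "vtable" is written and wired up by hand ---------------
struct SensorOps {
    int (*read)(void *self);
};

struct CSensor {
    const SensorOps *ops; // must be set correctly by whoever creates it
    void *self;           // must point to the matching data type
};

struct Thermometer {
    int celsius;
};

int thermometer_read(void *self) {
    return static_cast<Thermometer *>(self)->celsius;
}

const SensorOps thermometer_ops = {thermometer_read};

int read_c(const CSensor &sensor) {
    return sensor.ops->read(sensor.self);
}

// --- C++ style: the compiler generates and fills in the vtable -----------
struct Sensor {
    virtual int read() const = 0;
    virtual ~Sensor() = default;
};

struct ThermometerSensor : Sensor {
    explicit ThermometerSensor(int c) : celsius(c) {}
    int read() const override { return celsius; }
    int celsius;
};

int read_cpp(const Sensor &sensor) {
    return sensor.read();
}
```

Both versions do the same thing, but in the first one nothing stops you from pairing thermometer_ops with a pointer to the wrong type.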

On the topic of OOD/OOP, I think it's good to make a clear distinction between the different paradigms. While C++ might have evolved from "C with classes", and although it does of course support inheritance and virtual functions, it is a multi-paradigm language: coroutines, templates, etc. have nothing to do with OOP per se, and most of the standard library follows a functional or generic programming style.

I agree, the computer architecture and operating systems courses I took were taught in C and assembly, which I think is the right choice, because you want to learn how things work under the hood.

This I find interesting, what could you do in C that you could not do in C++?

I first started programming in C, but quickly moved to C++ for larger projects, and (with some care), most of the files could just be renamed from .c to .cpp. Many of the low-level IO functions etc. remained C-like code, but the higher-level logic in the main program gradually became more C++-ish, which improved both the performance and readability (and also unearthed some latent bugs in the original code).

I think this is one of the strengths C++ had (has): if you have a C code base where you think you could benefit from some C++ feature in a certain piece of code, you can just rename that one file from .c to .cpp without changing any other code, and use that one feature to improve the code.

That's true, but given that OP is a teacher who might teach future autonomous vehicle engineers, and given that Arduino is kind of a "gateway drug" for people to get into electrical engineering, I think it's good to promote good practices on this forum as well.

In my experience, this really helps prevent bugs, e.g. by using range-based for loops instead of manual indices, template arguments for array sizes instead of manually passing the (wrong) size, using C++ containers instead of raw calls to malloc to prevent memory leaks. These are all relatively simple things that make it harder to make mistakes, this is useful regardless of whether it's a beginner struggling with his LED flashing program crashing, or an experienced developer working on autonomous vehicles.
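
A quick sketch of the first two points. sum_c and sum_cpp are made-up names: the C-style version relies on the caller passing the right length, while the template version gets the length from the array type itself.

```cpp
#include <stddef.h>

// C style: nothing checks that `length` matches the actual array.
int sum_c(const int *values, size_t length) {
    int total = 0;
    for (size_t i = 0; i < length; ++i)
        total += values[i];
    return total;
}

// C++ style: N is deduced from the array, and the range-based for loop
// cannot run past the end.
template <size_t N>
int sum_cpp(const int (&values)[N]) {
    int total = 0;
    for (int value : values)
        total += value;
    return total;
}
```

With int readings[] = {1, 2, 3, 4}; a call like sum_c(readings, 5) compiles and reads out of bounds, whereas sum_cpp(readings) has no length argument to get wrong.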

Things like coroutines are quite new, and probably overkill for the large majority of Arduino projects, but it might still be useful for a handful of people who come across threads like this, since there are problems where coroutines really are the best solution.

I'm not so sure, the web server was just an example, it's more widely applicable than that. And even if you just look at web servers, the ESP8266 and ESP32 boards are quite popular, having to deal with blocking network sockets is a real problem there. While there are some asynchronous libraries with callbacks available, this leads to spaghetti code and so-called callback hell, so I think this is an area where coroutines might really improve code quality.

Pieter
i think it would help the discussion if you described your background
... and when you went to university

BTW, i'm an EE, not CS

i would be curious to see a well written ethernet driver using c++ features

all our files at Qualcomm were .cpp and i only recently realized that some things i took for granted were actually unique to c++.

i think over 20+ years ago, embedded systems magazine forecasted that C was a dying language. here are two recent comparisons: one for languages in general and then for embedded applications

thanks

perspectives

I'm not advocating for using C++ features for the sake of it, for many purposes, normal functions work just fine.

No, I don't see C going away any time soon either.

understood. but i think seeing it done using C++ features would clarify this discussion immensely. (is it just OOD)?