Does Arduino need a real-time scheduler?

So what is your opinion about Arduino multitasking and schedulers?

As a pure hobbyist with mostly a hardware background I have no idea if any of my typical or future projects would benefit from having access to a multitasking and scheduler environment to work inside of.

Most beginners' questions about needing or wanting multitasking seem to me to come from a mental block on their part: they don't yet understand basic C/C++ program structure, and that one normally just needs to avoid blocking functions so that the main loop function can handle all the independent tasks in the sketch in a timely manner. Add the pretty simple to understand user and pin-change interrupt capabilities, and I just haven't seen the need to have or learn a more complex environment to work with.
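For readers who have not seen the non-blocking style described here, a minimal sketch follows. It is plain C++ rather than a full Arduino sketch, and the `PeriodicTask` helper is a hypothetical name invented for illustration, not part of any Arduino library.

```cpp
#include <cstdint>

// Hypothetical helper: decides when a periodic job is due without blocking.
struct PeriodicTask {
  uint32_t intervalMs;  // how often the job should run
  uint32_t lastRunMs;   // time of the most recent run

  // Returns true (and re-arms) once at least intervalMs has elapsed.
  // Unsigned subtraction keeps the comparison correct across counter rollover.
  bool due(uint32_t nowMs) {
    if (nowMs - lastRunMs >= intervalMs) {
      lastRunMs = nowMs;
      return true;
    }
    return false;
  }
};
```

On an Arduino, loop() would call `due(millis())` once per task on every pass and run the job only when it returns true, so no task ever sits in delay().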

So I guess I would certainly be interested from a learning perspective but I suspect if it gets too complex to understand or actually implement, I would probably never get around to actually trying it, not unlike the gazillion Arduino libraries I've downloaded, looked at quickly and said "nope, it's over my head". :wink:

Lefty

retrolefty,

I suspect you're in the majority and few will use true RTOS features.

Reality is a good thing. I often expect Arduino users to do what young students do here in introductory embedded systems courses. Edward A. Lee recently modernized the introductory EECS 149 course.

Here are some example projects. This video starts with a face tracking project. A camera on a quad-copter tracks the student's face, stays at the same level about four feet away.

The lectures for this course are on YouTube. Slides are here: http://chess.eecs.berkeley.edu/eecs149/lectures/index.html

The book for this course is somewhat theoretical but is a good reference and free http://leeseshia.org/releases/LeeSeshia_DigitalV1_07.pdf.

Here is another set of student project videos:

So what is the requirement for an Arduino scheduler? Only Arduino users can answer this.

I doubt that. I would guess that at least 80% of professional software engineers working in embedded systems would do a poor job in specifying "requirements for a scheduler" for their platform. 50% probably confuse "real time" with "fast." Arduino users are nearly by definition less aware, and aren't interested in being aware of those issues.

Their answer is probably "I want to be able to do several things at one time." That's probably what their answer should be. That's the way most desktop users think of things as well.
If you can get some quantitative answers for the value of "several", and the limitations (timewise) of "at once", you'd be doing really well. Consider:

task foo1 {at noon digitalWrite(1, HIGH);}
task foo2 {at noon digitalWrite(2, HIGH);}
  :
task fooN {at noon digitalWrite(N, HIGH);}

What's an appropriate value for N, and how many milliseconds after noon do all the N pins need to be high?
Is it more important for that number of milliseconds to be small, or to always be the same?
THIS kind of question you might be able to get answers for.
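The two questions above separate worst-case latency (how long after noon the last pin goes high) from jitter (how much that delay varies). A minimal sketch of how one might summarize measured delays, with `TimingStats` and `summarize` being hypothetical names for illustration:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct TimingStats {
  uint32_t worstLatencyUs;  // largest observed delay after the deadline
  uint32_t jitterUs;        // spread between largest and smallest delay
};

// delaysUs: measured delay of each pin write after "noon", in microseconds.
// The vector must not be empty.
TimingStats summarize(const std::vector<uint32_t>& delaysUs) {
  auto mm = std::minmax_element(delaysUs.begin(), delaysUs.end());
  return TimingStats{*mm.second, *mm.second - *mm.first};
}
```

A system can have a small worst-case latency with large jitter, or a larger but perfectly repeatable latency; which one matters is exactly the requirement being asked for.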

a preemptive RTOS will only require between 1 and 4% of CPU time in exchange for valuable services.

Will require, or could require? I find that number hard to believe, depending on what it is measuring. Although... CPU time is probably not the bottleneck resource, so it probably doesn't matter.

westfw,

Their answer is probably "I want to be able to do several things at one time." That's probably what their answer should be. That's the way most desktop users think of things as well.
If you can get some quantitative answers for the value of "several", and the limitations (timewise) of "at once", you'd be doing really well. Consider:

You're almost certainly right.

I have experience with two types of users: research physicists and EECS students from UC Berkeley. Both groups are comfortable using an RTOS. Older EEs not so much.

I am a PhD physicist and I have done architecture and design of control systems for many large experiments. We started using RTOSs about 40 years ago.

I did some work on LHC, the big experiment at CERN looking for the Higgs Boson. CERN has used LynxOS in control systems for over twenty years.

Some of my colleagues left the lab to develop VxWorks. NASA JPL uses VxWorks in all Mars rovers.

I find it hard to understand the resistance to use of RTOSs.

Maybe a tutorial with a more realistic example application would help Arduino users understand the value of RTOSs. Cortex M clearly was designed for use of an RTOS.

On the other hand, it is hard to see the value unless you understand things like this: reading an ADC at relatively slow rates, like 100 Hz, requires time jitter on the order of one microsecond if you want high SNR in the signal. The SNR of an ideal 10-bit ADC is about 62 dB. At 100 Hz, 4 microseconds of jitter in the reading time reduces the SNR to about 52 dB. A coop scheduler just won't schedule a thread with low jitter, and reading a sensor in an OS thread is easier than setting up a timer-driven ISR.
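The figures in this paragraph can be reproduced with two standard textbook approximations: 6.02 N + 1.76 dB for an ideal N-bit converter, and -20 log10(2 pi f t_j) for the limit set by RMS sampling jitter t_j. The sketch below assumes the 100 Hz figure refers to a full-scale sine wave at that frequency; that is my reading, not something the post states.

```cpp
#include <cmath>

// Quantization-limited SNR of an ideal N-bit ADC with a full-scale sine input.
double idealAdcSnrDb(int bits) {
  return 6.02 * bits + 1.76;
}

// Jitter-limited SNR for a full-scale sine at signalHz,
// sampled with RMS timing jitter of jitterSec seconds.
double jitterSnrDb(double signalHz, double jitterSec) {
  const double pi = 3.14159265358979323846;
  return -20.0 * std::log10(2.0 * pi * signalHz * jitterSec);
}
```

With these formulas, idealAdcSnrDb(10) gives about 62 dB and jitterSnrDb(100.0, 4e-6) gives about 52 dB. At 1 microsecond of jitter the limit rises to about 64 dB, just above the converter's own limit, which is why jitter on the order of a microsecond is the target.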

Will require, or could require? I find that number hard to believe, depending on what it is measuring. Although... CPU time is probably not the bottleneck resource, so it probably doesn't matter.

The measure generally means CPU time. An RTOS has no extra overhead unless you call an OS function or an event causes a context switch. You don't do a context switch for every interrupt, and many fast interrupts can be handled just like the bare metal approach without any OS overhead. For example, a serial driver puts bytes in a queue just like the Arduino drivers.

On a chip like a 72 MHz STM32, a context switch with ChibiOS costs just over one microsecond. A well designed application should not have more than a few thousand context switches per second so the overhead will be a few percent.
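As a rough check of that estimate, the context-switch share of CPU time is just switches per second times cost per switch. The function name below is hypothetical, and the 3000 switches per second in the note is an assumed value for "a few thousand".

```cpp
// Share of CPU time spent in context switches, as a percentage.
double contextSwitchOverheadPercent(double switchesPerSecond,
                                    double microsecondsPerSwitch) {
  // switches/s * us/switch gives microseconds of overhead per second;
  // 1e-6 converts that to a fraction and 100 converts it to a percentage.
  return switchesPerSecond * microsecondsPerSwitch * 1e-6 * 100.0;
}
```

For 3000 switches per second at 1.2 microseconds each this comes to 0.36%, so switching alone stays well inside a few-percent budget; other kernel calls add to it.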

A preemptive RTOS responds to an important high priority event with the handler thread running in one microsecond but with a coop scheduler, who knows.

I find it hard to understand the resistance to use of RTOSs.

That is not about the "resistance" but the need. Most arduino users do not work for LHC or NASA and their designs work just fine with a superloop :wink:
With more demanding applications they will certainly consider an rtos..

That is not about the "resistance" but the need. Most arduino users do not work for LHC or NASA and their designs work just fine with a superloop

pito,

Given a choice of a supperloop or two simple tasks, scientists who write device code choose the simpler more reliable task model. A RTOS provides better partitioning of an application even in simple cases.

Old EEs make the funny spaghetti supperloop that mixes timing and code for two distinct operations.

I think experimental physics groups adopt things like RTOSs more readily because the learning overhead is shared. Members help each other, the first person to learn the system helps the next person.

A RTOS is not like learning a new programming language. It requires a different architecture for embedded systems. That's why modern embedded systems text books don't emphasize details of the OS. These books cover things like why preemption is required for rate monotonic scheduling and why this is important for reliable systems.

I find it hard to understand the resistance to use of RTOSs.

Well, there's one set of people who don't understand what an RTOS is, what it would give them, or how they'd choose between multiple options. They have enough problems figuring out how to divide a program into functions, much less into concurrent "tasks."

There's another set that understands, but is worried about the complexities that you know or suspect come with it. Perhaps they've been burnt by a negative experience with an existing RTOS. Or they're worried that they don't want to increase latency to get certainty. Or they're just comfortable, given the size of Arduino, that they can get along without it. Case in point:

CERN has used LynxOS in control systems for over twenty years.
Some of my colleagues left the lab to develop VxWorks.

So given 20 years of experience with an RTOS, your coworkers were so frustrated with it that they went to work on a different RTOS ? :slight_smile:

A RTOS is not like learning a new programming language.

The hell it isn't. Especially if the product is already intentionally blurring the lines between "language", "library", and "run time environment."

westfw,

So given 20 years of experience with an RTOS, your coworkers were so frustrated with it that they went to work on a different RTOS ? :slight_smile:

Wrong! They didn't work on a different system, they commercialized the open Berkeley system as VxWorks and NASA started using it. CERN picked LynxOS in Europe at about the same time. CERN and LBNL are happy with their systems.

VxWorks now has over a billion copies in products. Several other commercial RTOSs also claim over a billion copies in products.

RTOSs aren't that different in basic functions. Almost all commercial RTOSs have a fixed priority preemptive scheduler. They have similar synchronization/communication primitives. I find it easy, almost mechanical to convert a program from one to another.

You can't choose an RTOS because you don't understand the associated theory. That's what engineers learn in courses like UC Berkeley's EECS 149.

Your comments are valuable. I get the message that you will never use a RTOS.

At least you are not like the old engineer I knew who programmed a ROM for one of the first micro-controllers by filling in squares on an engineering pad and then entering them in switches in his homemade programmer. I never got him to use an assembler. That was around 1971 or 1972 and it might have been a 4-bit 4004. We quickly moved to the 8008 but left the old EE with his pad behind.

What kind of rtos did you use with 8008?

Given a choice of a supperloop or two simple tasks

I know which a Hobbit would choose.

PaulS,

Sorry, I depend too much on seeing red for spelling errors but it doesn't work for superloop/supperloop.

pito,

We did some prototyping on the 8008 and decided it wasn't flexible enough. We were in contact with MOS Technology which was formed in 1972 so we moved to the 6502.

We built a little kernel and attached lots of RAM since there was no flash then. We didn't name the system.

Of course Apple used the same chip for the Apple I in 1976 and Apple II in 1977.

We attached these systems to a serial port on a terminal server and downloaded programs from a CDC 6600 supercomputer. We programmed in PL/M, not assembler. The cross compiler ran on the 6600.

One reason I find Arduino interesting is that its technology is so much like what I was doing 40 years ago.

By 1980 commercial kernels like VRTX were available for chips like the 68000 and we moved beyond Arduino style systems.

I'm probably not a typical Arduino enthusiast, but I almost immediately looked for a scheduler for my first project. Maybe out of preference because I know I could do my project with a super loop, and even an RTOS, but all I really needed was a scheduler with inter-process communications. I say "need" because that is my preference for this particular project.

In my retirement I'm building a smart toy for my granddaughter that will have up to 8 I/O's that need to be handled asynchronously in "real time". I could not find a simple scheduler (and I did ask on the forum as well) so I adapted the Quantum Leaps code and built what I call an asynchronous (non pre-emptive) device framework on top of it. It is more than adequate for my project needs, and is scalable for future projects. I admit that I had fun building the framework, but I would have easily used something already available.

Because I feel strongly that a scheduler is a good tool for solving specific problems, and I was unable to find one, I wrote about how I adapted QF to make it easier for others to do the same:

There is a wide range of user skills and project complexities in this forum. The hobbyist that needs to blink some LEDs isn't going to need a full blown scheduler or an RTOS, and someone writing code for particle collisions probably doesn't want one; but there are some of us here in the middle that would benefit from such a scheduler, whether a kid toy project or something more demanding. Absolutely, yes.

ddmcf,

I agree, the Quantum Leaps state machine framework is an excellent type of scheduler for applications like your project.

What kind of feedback have you received? Is your tutorial sufficient to get people started?

I am curious how Arduino users react to state machines.

Students are introduced to finite state machines early in introductory embedded systems courses.
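For anyone who has not met the idea, here is a minimal sketch of an event-driven finite state machine in plain C++. It is not the Quantum Leaps API; the states and events are made up for a hypothetical toy that plays, pauses, and times out.

```cpp
// Hypothetical states and events for a simple toy.
enum class State { Idle, Playing, Paused };
enum class Event { Button, Timeout };

// Run-to-completion step: consume one event and return the next state.
State step(State s, Event e) {
  switch (s) {
    case State::Idle:
      return e == Event::Button ? State::Playing : State::Idle;
    case State::Playing:
      return e == Event::Button ? State::Paused : State::Idle;  // timeout stops play
    case State::Paused:
      return e == Event::Button ? State::Playing : State::Idle;  // timeout gives up
  }
  return s;  // not reached; keeps compilers quiet
}
```

A framework in this style typically adds event queues and a scheduler around handlers like this, so each input is handled as a short, non-blocking step.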

I get the message that you will never use a RTOS.

I wouldn't go that far. I can't see filling up the memory of a MEGA or DUE without more structure (in the form of SOME sort of OS.) I doubt whether I'd ever need the "real time" aspects, but that seems to be what is available; I can't see writing my own, especially when the existing rtoses are getting pretty favorable reviews.

Most of my professional career was spent programming under a proprietary, non-preemptive, not real-time OS. OTOH, that company's experiments with real RT kernels were less than spectacularly successful. When your uart ISR starts sending messages instead of just reading the chip, something has "jumped the shark."

fat16lib,

I've gotten some good feedback in the form of "thanks for writing this up", but no specific additional questions on the QF hack. I've been pleased with the traffic to the specific blog post on state machines, it seems to be gaining momentum over the last month, perhaps in light of recent talk about schedulers like this.

Thanks for this post, I believe it builds awareness of some of the tools that are available for solving various control problems.

To answer the question:
yes it doesn't.

Depends what you're doing with the thing.

Would be nice if a simple RTOS was available.

As for the comment a while back about the good old 8008:
the work on the Apollo computer used an RTOS / interrupt / scheduler system.
It's what saved the Apollo 11 mission when Buzz left the return-to-orbit radar on as well as the landing radar.

http://ed-thelen.org/comp-hist/vs-mit-apollo-guidance.html

have fun

Would an extremely small preemptive RTOS appeal to Arduino users?

I have been playing with an experimental RTOS written by Giovanni Di Sirio, the author of ChibiOS/RT.

His goal is to build the smallest possible kernel for tiny chips. Giovanni calls the system Nil RTOS since the goal is a zero size kernel.

Nil RTOS has only the most fundamental functionality.

A preemptive fixed priority scheduler.

Counting semaphores that can signal from a thread or ISR.

Sleep until a specified time and sleep for a specified period.

Here is an example sketch with a total size under 2KB on an Uno:

// Connect a scope to pin 13.
// Measure difference in time between first pulse with no context switch
// and second pulse started in thread 2 and ended in thread 1.
// Difference should be about 10 usec on a 16 MHz 328 Arduino.
#include <NilRTOS.h>

const uint8_t LED_PIN = 13;

// Semaphore used to trigger a context switch.
Semaphore sem = {0};
//------------------------------------------------------------------------------
/*
 * Thread 1 - high priority thread to set pin low.
 */
NIL_WORKING_AREA(waThread1, 128);
NIL_THREAD(Thread1, arg) {

  while (TRUE) {
    // wait for semaphore signal
    nilSemWait(&sem);
    // set pin low
    digitalWrite(LED_PIN, LOW);
  }
}
//------------------------------------------------------------------------------
/*
 * Thread 2 - lower priority thread to toggle LED and trigger thread 1.
 */
NIL_WORKING_AREA(waThread2, 128);
NIL_THREAD(Thread2, arg) {

  pinMode(LED_PIN, OUTPUT);
  while (TRUE) {
    // first pulse to get time with no context switch
    digitalWrite(LED_PIN, HIGH);
    digitalWrite(LED_PIN, LOW);
    // start second pulse
    digitalWrite(LED_PIN, HIGH);
    // trigger context switch for task that ends pulse
    nilSemSignal(&sem);
    // sleep until next tick (1024 microseconds tick on Arduino)
    nilThdSleep(1);
  }
}
//------------------------------------------------------------------------------
/*
 * Threads static table, one entry per thread. Thread priority is determined
 * by position in table.
 */
NIL_THREADS_TABLE_BEGIN()
NIL_THREADS_TABLE_ENTRY("thread1", Thread1, NULL, waThread1, sizeof(waThread1))
NIL_THREADS_TABLE_ENTRY("thread2", Thread2, NULL, waThread2, sizeof(waThread2))
NIL_THREADS_TABLE_END()
//------------------------------------------------------------------------------
void setup() {
  // Start nil.
  nilBegin();
}
//------------------------------------------------------------------------------
void loop() {
  // Not used.
}

I wrote this sketch to determine the performance of Nil RTOS. I was amazed to find how fast it is. The time to signal a semaphore, do a context switch and take the semaphore is only about 12 microseconds on an Uno.

I have read all of this thread and found it very interesting. I worked on powertrain controllers as a consultant for brands x and y. They both had a mechanical engineering mindset in the early days and put any old EE on the coding for the controllers. Glad to say that is no longer the case. Some went kicking and screaming from absolute assembly to relocatable assembly, then the same with going to "C", and again for an RTOS. At this time they are doing model-based control algorithms and auto code generation with an RTOS. There were safety concerns about allowing higher-level interrupts to interrupt lower-level interrupts. It was believed that circumstances might arise that could not be reliably predicted. I disagreed, but I was probably wrong.

A friend working at Wind River was involved in the code for the Mars rovers. You may recall the first rover froze up. The software locked up. The problem was task scheduling in the very complex multitasking RTOS. Luckily, or cleverly, they had some code in the system that recognized how much trouble it was in and went into a mode to accept new code by telemetry. So, by the time the second rover landed, they had the fix.

I am an old engineer, 65+, and age is not the problem; mindset is. You are either open to new ideas, or not.

I have a hobby farm and I am retired, so my current project is to use a Raspberry Pi (RPI)... (google) and an attached I/O board with an ATmega328p on it. The ATmega is much more powerful than the first chips we ran the engine with (1K RAM, 16K ROM, and 2 MHz). We used a 10 msec interrupt to read the tone wheels for RPM and schedule background tasks. It was a rudimentary O/S for scheduling and accurate sensor reading. The building security/monitoring/controlling I will do for three outbuildings on the property will use a similar rudimentary O/S. The ATmega is a slave to the RPI on a 115K serial UART channel.

The RPI is a powerful processor with 512 MB RAM and SD flash ROM (8 Gig) with a network connector and two USB connectors. I will run WiFi on one of the USB ports. The RPI runs Debian Linux and can do an Apache server, if you wish. The cost is very low. The RPI is $35 and the Gert I/O board is $48.

I was working on computerized test equipment at Bell Labs when the Intel 4004 and 8008 came out. We were doing 148-pin circuit board testers and wanted to go to a processor per pin, but the 8008 and 4004 were not fast enough or powerful enough. We stayed with DEC and Data General minicomputers.

Anyway, this thread was very interesting and I think that the power of the Arduino boards does warrant the use of a periodic interrupt and simple task scheduler for many applications. I have used the 'MicroC/OS-II' real-time kernel by Jean J. Labrosse, which has been ported to many uPs. It does a fine job and allows the user to pick features and leave out features to arrive at the proper size and power. Worked well for me on my greenhouse controller, which reports over the internet. I have seen another similarly featured RTOS called 'freeRTOS'...(google) that costs nothing. I think it would be useful for some to look at these. Have fun :D.

Much was made of the Mars rover bug but it was just that, a bug. Like all bugs, it shouldn't have happened since the proper design for avoiding "priority inversion" was well known since the early 1970s. The Mars rover problem happened in 1997.

When created, a VxWorks mutex object accepts a boolean parameter that indicates whether priority inheritance should be performed by the mutex. The mutex in question had been initialized with the parameter off; had it been on, the low-priority meteorological thread would have inherited the priority of the high-priority data bus thread blocked on it while it held the mutex, causing it to be scheduled with higher priority than the medium-priority communications task, thus preventing the priority inversion. Once diagnosed, it was clear to the JPL engineers that using priority inheritance would prevent the resets they were seeing.
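The effect of that parameter can be illustrated with a toy model of a fixed-priority scheduler. This is not VxWorks code; `Task`, `effectivePriority`, and `pickNext` are hypothetical names, and the model only captures which ready task gets picked.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Task {
  int basePriority;     // larger number = more urgent
  bool ready;           // false while the task is blocked
  int waiterPriority;   // highest priority blocked on a mutex this task holds, or -1
};

// With inheritance on, a mutex holder runs at the priority of its top waiter.
int effectivePriority(const Task& t, bool inheritance) {
  return inheritance ? std::max(t.basePriority, t.waiterPriority)
                     : t.basePriority;
}

// Index of the ready task with the highest effective priority, or -1 if none.
int pickNext(const std::vector<Task>& tasks, bool inheritance) {
  int best = -1;
  for (std::size_t i = 0; i < tasks.size(); ++i) {
    if (!tasks[i].ready) continue;
    if (best < 0 || effectivePriority(tasks[i], inheritance) >
                        effectivePriority(tasks[best], inheritance)) {
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

Model the three tasks as {1, true, 3} for the low-priority thread holding the mutex, {2, true, -1} for the medium-priority task, and {3, false, -1} for the blocked high-priority thread. Without inheritance the medium task is picked and the high-priority thread stays blocked; with inheritance the mutex holder is picked, releases the mutex sooner, and the inversion is avoided.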

It did become part of the fear factor of using an RTOS, like this kind of misinformation: "an RTOS shouldn't be used on an Uno since the overhead is too high".

Here is a simple case study for an example I am developing. The problem is to read data from analog pins at regular intervals and write it to an SD card.

The simple solution is a loop like this:

  1. Wait till start of period.

  2. Read data.

  3. Write data to SD.

  4. Repeat.

A problem occurs when the period between points is less than about 100 milliseconds. SD cards can have occasional latencies of over 100 milliseconds so data overruns occur.

A possible solution is to use an RTOS with two threads. Can the Uno support the extra overhead?

The answer is that the RTOS solution is far more efficient than the above loop. Here's why.

The RTOS solution has two threads.

The analog read thread runs at high priority and is a loop like this:

  1. Wait till start of period.

  2. Read data.

  3. Write data to a FIFO buffer.

  4. Repeat.

The SD write thread is a loop that runs at lower priority.

  1. Wait for data in the FIFO.

  2. Write data to SD

  3. Repeat.

The two thread solution is more efficient than the first single loop solution. CPU time is recovered when the SD is busy and the higher priority thread is scheduled.
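The FIFO between the two threads can be as simple as a fixed-size ring buffer. The sketch below is a minimal plain C++ version with a hypothetical `SampleFifo` name; it deliberately leaves out synchronization, which under a preemptive RTOS would have to be added (for example with counting semaphores for free and used slots).

```cpp
#include <cstddef>
#include <cstdint>

// Fixed-size FIFO of ADC samples: one producer (ADC thread), one consumer (SD thread).
// No locking here; this is an assumption-laden sketch, not production code.
template <std::size_t N>
class SampleFifo {
 public:
  // Returns false when full, which corresponds to a data overrun.
  bool push(uint16_t sample) {
    if (count_ == N) return false;
    buf_[head_] = sample;
    head_ = (head_ + 1) % N;
    ++count_;
    return true;
  }

  // Returns false when there is nothing to write to the SD card.
  bool pop(uint16_t& sample) {
    if (count_ == 0) return false;
    sample = buf_[tail_];
    tail_ = (tail_ + 1) % N;
    --count_;
    return true;
  }

  std::size_t size() const { return count_; }

 private:
  uint16_t buf_[N] = {};
  std::size_t head_ = 0;
  std::size_t tail_ = 0;
  std::size_t count_ = 0;
};
```

The depth has to cover the worst SD stall: with a 100 millisecond latency and a 10 millisecond sample period, the FIFO needs room for at least 10 samples.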

Now comes the real payoff. The Arduino analogRead() takes about 115 microseconds. Almost all of this is in a busy loop waiting for the ADC conversion.

I wrote an RTOS based replacement for analogRead() that is transparent to users but sleeps during the ADC conversion. This saves over 90 microseconds of CPU time per read after factoring in a context switch.

The result is that the RTOS version can log more than twice as fast and doesn't suffer data overruns. The simple loop version has the high overhead of busy loops that the RTOS avoids.

I have ported three RTOSs to Arduino; the code is in the Google Code Archive.

My favorite for Uno is NilRTOS. Its author, Giovanni Di Sirio, says it's "Smaller than ChibiOS/RT, so small it's almost nil."

Interesting thread. I have been working on commercial hard- and soft- real-time control systems for a number of years. Since those terms are commonly misused and misunderstood, here is an example of a hard real-time system we did a few years ago: a mud-pump controller driving two pistons with 2-meter stroke, controlled with about 500 HP of hydraulic pumps (at 10,000 psi, the manifold pipe is something like 200 mm in dia), through a servovalve for each piston. Each piston moves by a polynomial equation and their outputs are summed through a check valve so that the flow is constant, and adjustable over a wide range. LVDTs with 2-meter stroke monitor each piston. Smooth startup and shutdown and several considerations of fail-safe behavior were used.

We achieved this on an 8051 with external ADCs and DACs, and interrupt handlers (two levels, pre-emptive) with re-entrant libraries. We almost ran out of code space. In our case the time interval was not tiny, but we had hard completion deadlines. The piston profile came out of a lookup table (actually 1/4 of the entire profile, mirrored and phase-shifted as needed), the actual piston position was read, and that was all fed into a firmware PID routine which then calculated the next value for the servovalves. That had to be completed before the timer tick to update the valves. That timer got faster as the flow rate was increased.

Then in the background was interrupt-driven serial I/O for the machine interface to control the mud pump and to monitor critical performance values (such as the current error value vs target position). This I/O had to be safely pre-empted by the piston routines with due consideration of atomicity of variables which might be in the midst of being updated by the control interface.

In the end it all worked well, we delivered complete documentation and source code, and I have not heard of any problems. I went to Japan to help install and tune the system and then our part of the project was complete.

Where execution deadlines got tight I would set and clear some spare I/O bits and watch them on an oscilloscope or logic analyzer, something like High when active in a critical routine and Low when in safe extra time margin. As the flow rate ramped up you could see the bit transitions coming closer together. If they ever collided that would be potentially catastrophic since the control loop could not correctly function. We set the scope to trigger on the smallest safe interval and left it running overnight (without driving the actual pumps - simulated input). There was also simple instrumentation in the code to log deadline violations.
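The "simple instrumentation" mentioned here could look something like the sketch below. It is a guess at the general shape in plain C++, not the code from that project; `DeadlineMonitor` is a hypothetical name, and the timestamps are assumed to come from a free-running microsecond counter.

```cpp
#include <cstdint>

// Records how long a critical routine took and counts deadline misses.
struct DeadlineMonitor {
  uint32_t budgetUs;    // time allowed for the critical routine
  uint32_t worstUs;     // longest run observed so far
  uint32_t violations;  // number of runs that exceeded the budget

  explicit DeadlineMonitor(uint32_t budget)
      : budgetUs(budget), worstUs(0), violations(0) {}

  // startUs and endUs are readings of a free-running microsecond counter.
  void record(uint32_t startUs, uint32_t endUs) {
    uint32_t elapsed = endUs - startUs;  // unsigned math survives rollover
    if (elapsed > worstUs) worstUs = elapsed;
    if (elapsed > budgetUs) ++violations;
  }
};
```

Calling record() at the end of the critical routine gives a worst-case figure and a miss count that can be read out after an overnight run, alongside the scope trigger.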

The project engineer was a delight to work with: a very experienced, practical guy. One example: we had to monitor the end-of-travel limit clearance of the pistons (which weighed over 1000 kg as I recall) since we didn't ever want to drive them into their stops. No one knew what would happen if that occurred. Safe clearance was something like 5-10 mm, and it could not be observed easily by eye while running. His brilliant idea was to use an empty aluminum soft drink can in the gap and measure the crushed thickness. Worked great and no risk of harm to the pistons.

In this project a simple RTOS might have been a huge timesaver. But we also had to have timer interrupts and serial I/O interrupts and I am not aware of how easily RTOSes can fit with those. If it is possible to weave RTOS features into your own needed I/O hardware support, that could be helpful but also could get a bit complex.

I am using Teensy++2 on a project now and just got the ARM Teensy 3, and there is enough code and data space on these new Arduino devices. My concern would be: is the RTOS granular so we only need to use what fits our case, and can it co-exist with I/O device interrupt handlers?

Thanks