NilRTOS - A Fast Tiny Preemptive RTOS

And the same with a 2048 usec period (Time is *2 msec). You can see how the free FIFO count dropped to 69 during the highest latency hit (40 msec).

type any character to begin
type any character to end
FIFO record count: 90
Minimum free record count: 69
Maximum SD write latency: 40292 usec
Unused Stack: 51 151


I wrapped the FIFO code in a C++ template so it is easy to use. Here is an example:

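#include <NilRTOS.h>
#include <NilFIFO.h>

// FIFO with ten ints
NilFIFO<int, 10> fifo;

// Use tiny unbuffered NilRTOS NilSerial library.
#include <NilSerial.h>
#define Serial NilSerial

NIL_WORKING_AREA(waThread1, 64);

NIL_THREAD(Thread1, arg) {
  int n = 0;
  while (TRUE) {
    // Sleep so the idle thread (loop) can run; the period is an assumption.
    nilThdSleep(100);
    int* p = fifo.waitFree(TIME_IMMEDIATE);
    // continue if no free space
    if (p == 0) continue;
    *p = n++;
    // signal that a record is ready (call assumed from the NilFIFO API)
    fifo.signalData();
  }
}

NIL_THREADS_TABLE_BEGIN()
NIL_THREADS_TABLE_ENTRY(NULL, Thread1, NULL, waThread1, sizeof(waThread1))
NIL_THREADS_TABLE_END()

void setup() {
  Serial.begin(9600);
  // start kernel
  nilSysBegin();
}

void loop() {
  int* p = fifo.waitData(TIME_IMMEDIATE);
  // return if no data
  if (p == 0) return;
  int n = *p;
  // release the slot (call assumed from the NilFIFO API)
  fifo.signalFree();
  Serial.println(n);

  if (n == 20) {
    // The statistics printout is not shown in this post.
  }
}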

Here are the last lines of output:

FIFO record count: 10
Minimum free count: 9

I attached the template as NilFIFO.h. It’s not documented yet.
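Since the template is not documented yet, here is a rough host-side sketch of the idea behind it: a fixed-size ring of records where the producer asks for a free slot and then signals data, and the consumer asks for data and then signals free. This is not the attached NilFIFO.h. The class name SketchFifo and the methods tryFree(), tryData() and minimumFree() are made up for illustration, and the sketch leaves out the kernel synchronization and timeouts that a real RTOS FIFO needs.

```cpp
#include <cstddef>

// Hypothetical, single-threaded stand-in for a record FIFO.
template <typename T, size_t N>
class SketchFifo {
 public:
  // Pointer to the next free slot, or nullptr if the FIFO is full.
  T* tryFree() { return count_ < N ? &data_[head_] : nullptr; }

  // Commit the slot handed out by tryFree() and track the low-water mark.
  void signalData() {
    head_ = (head_ + 1) % N;
    count_++;
    if (N - count_ < minFree_) minFree_ = N - count_;
  }

  // Pointer to the oldest record, or nullptr if the FIFO is empty.
  T* tryData() { return count_ > 0 ? &data_[tail_] : nullptr; }

  // Release the slot handed out by tryData().
  void signalFree() {
    tail_ = (tail_ + 1) % N;
    count_--;
  }

  // Smallest number of free slots seen so far.
  size_t minimumFree() const { return minFree_; }

 private:
  T data_[N];
  size_t head_ = 0;
  size_t tail_ = 0;
  size_t count_ = 0;
  size_t minFree_ = N;
};
```

If the consumer always drains a record before the next one arrives, at most one slot is ever in use, so a ten-record FIFO reports a minimum free count of 9, like the output above.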

I downloaded your new diagrams for a future update of the documentation.

The SD performance is interesting. SD controllers are impossible to understand since their internals are a closely guarded trade secret.
Users could benefit from your plots. I should write an article to include with SdFat. My to-do list is getting so long.

NilFIFO.h (1.3 KB)

As a very first step, I would strongly recommend kindly asking the maintainers to open a new section on this forum, e.g. "RTOS" under Topics or Development, as it is clear the topic covers a very important concept that all Arduino users could easily benefit from. PS: and maybe move all the RTOS-related topics there.

I've tried it with a 1284p (16 MHz) and a 1024 usec period. FIFO size = 10 KB, 2 minutes logging time. 5 ADC channels:

FIFO record count: 833
Minimum free record count: 0
Maximum SD write latency: 48860 usec
Unused Stack: 51 5341

** overrun errors **
Maximum overrun count: 44

4 ADC channels:

FIFO record count: 1000
Minimum free record count: 0
Maximum SD write latency: 55124 usec
Unused Stack: 51 5354

** overrun errors **
Maximum overrun count: 11

3 ADC channels:

FIFO record count: 1250
Minimum free record count: 1115
Maximum SD write latency: 49640 usec
Unused Stack: 51 5354

We need faster ADCs and SD cards :)

8 ADC channels and a 2048 usec period:

FIFO record count: 555
Minimum free record count: 499
Maximum SD write latency: 49764 usec
Unused Stack: 51 5364

First, I just posted a new NilRTOS. Look at the new nilSdLogger.ino example; it uses the FIFO template and NilTimer1.

The problem is CPU power to format numbers for these examples.

My new SdFat printField() is about three times faster than Arduino Print but is still the bottleneck when you have lots of RAM.

FIFO record count: 833
Minimum free record count: 0
Maximum SD write latency: 48860 usec
Unused Stack: 51 5341

The 833 FIFO records at 1024 usec provide about 850 milliseconds of buffering. The max latency was under 50 milliseconds, so text formatting is the problem.
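The buffering figure is just record count times sample period. A small host-side check of that arithmetic, with a function name I made up for the purpose:

```cpp
#include <cstdint>

// Time the FIFO can absorb before it overruns, in microseconds:
// number of records times the sample period.
constexpr uint32_t bufferTimeUsec(uint32_t records, uint32_t periodUsec) {
  return records * periodUsec;
}

// True if a single SD write stall of latencyUsec fits inside the buffer.
constexpr bool stallFits(uint32_t records, uint32_t periodUsec,
                         uint32_t latencyUsec) {
  return latencyUsec < bufferTimeUsec(records, periodUsec);
}
```

With 833 records and 1024 usec this gives 852992 usec, so the 48860 usec worst-case write latency fits with a wide margin. The FIFO still ran empty, which points at sustained CPU load rather than a single SD stall.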

With the latest version: 6 ADC channels, 1000 usec period:

Max Write Latency: 47624 usec
Unused Stack: 53 5354
FIFO record count: 833
Minimum free count: 724

Is it faster or am I doing something wrong? :) Yes, I was: almost all the ADC data were "0", so the print to file was fast enough.


Here is a simple Arduino sketch that logs ADCs as fast as possible with no regard for overruns.

// Dummy Arduino Logger
#include <SdFat.h>
SdFat sd;

SdFile file;

const uint16_t NREC = 4000;
const uint8_t NADC = 2;
const uint8_t sdChipSelect = SS;
void setup() {
  Serial.begin(9600);  // baud rate is an assumption
  Serial.println(F("type any character to begin"));
  while (Serial.read() < 0);
  // Initialize SD and create or open and truncate the data file.
  if (!sd.begin(sdChipSelect)
    || !file.open("DATA.CSV", O_CREAT | O_WRITE | O_TRUNC)) {
    Serial.println(F("SD problem"));
    while (1);
  }

  uint32_t t = micros();
  for (uint16_t r = 0; r < NREC; r++) {
    for (int i = 0; i < NADC; i++) {
      // One decimal field per ADC channel (print calls are an assumption).
      file.print(analogRead(i));
      file.write(',');
    }
    // Fake overrun field.
    file.println(0);
  }
  file.close();
  t = micros() - t;
  Serial.print("NADC: ");
  Serial.println(NADC);
  Serial.print("Average interval: ");
  Serial.print(t / NREC);
  Serial.println(" usec");
}
void loop() {}

Here is the output for two ADC channels.

type any character to begin
Average interval: 1219 usec

It ignores overruns but can’t log two ADC channels at 1024 usec per record. Of course the rate is dependent on the ADC values and I tied channel zero to 5V.

With nilSdLogger and four channels at 1000 usec:

type any character to begin
type any character to end
Max Write Latency: 60940 usec
Unused Stack: 53 108
FIFO record count: 118
Minimum free count: 56


I can log the six Uno analog pins at 1000 Hz. I wrote a printHexField() function that runs much faster than the decimal version.

You can't go faster than this because of the ADC conversion time. Also the amount of buffering on Uno is marginal for six ADCs.
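To put a number on the conversion time, here is a back-of-the-envelope sketch. It assumes the stock Arduino ADC setup on a 16 MHz AVR: prescaler 128 (a 125 kHz ADC clock) and 13 ADC clocks per normal conversion. The function name is made up, and the loop and call overhead around analogRead() is ignored.

```cpp
#include <cstdint>

// Microseconds for one ADC conversion, given the CPU clock, the ADC
// prescaler and the number of ADC clocks per conversion.
constexpr uint32_t adcConversionUsec(uint32_t cpuHz, uint32_t prescaler,
                                     uint32_t cyclesPerConversion = 13) {
  return static_cast<uint32_t>(
      (static_cast<uint64_t>(cyclesPerConversion) * prescaler * 1000000ULL) /
      cpuHz);
}

// Microseconds spent converting all channels of one record.
constexpr uint32_t recordConversionUsec(uint32_t channels, uint32_t cpuHz,
                                        uint32_t prescaler) {
  return channels * adcConversionUsec(cpuHz, prescaler);
}
```

Under those assumptions one conversion takes 104 usec, so six channels use 624 usec of each 1000 usec period before any formatting or SD work is done.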

I may try a faster ADC clock on a Mega.

Here is the finish message:

Done!
Max Write Latency: 47748 usec
Unused Stack: 53 110
FIFO record count: 79
Minimum free count: 30

Here is the Hex data file:

PERIOD_USEC,1000
ADC0,ADC1,ADC2,ADC3,ADC4,ADC5,Overruns
3FF,2DA,238,208,3FF,3FF,0
3FF,30E,2AC,267,3FF,3FF,0
3FF,31B,2E6,2AB,3FF,3FF,0
3FF,32F,30D,2DE,3FF,3FF,0
3FF,369,342,314,3FF,3FF,0
3FF,3A6,37C,34B,3FE,3FF,0
3FF,3D6,3B1,381,3FF,3FF,0
3FF,3FF,3F0,3BB,3FF,3FF,0
3FF,3FF,3FF,3E0,3FF,3FF,0
3FF,3FF,3FF,3F3,3FF,3FF,0
3FF,3FF,3FF,3FC,3FF,3FF,0
3FF,3FF,3FF,3FF,3FF,3FF,0

Pin zero is tied to 5V, pins 1,2,3 are floating and pins 4,5 are connected to a DS1307 with pull ups.

You could use simple data compression: record only the changes per channel, and write "0" when there is no change. :) :)

I don't think compression would help. The problem is CPU for formatting, not data size, so simple is better.

The printHexField() is six times faster than Arduino Print. 1.51 seconds vs 9.63 seconds for the same data. The Hex file is a little smaller but that is not a big factor.

Test of println(uint16_t)
Time 9.63 sec
File size 128.89 KB
Write 13.38 KB/sec
Maximum latency: 47552 usec, Minimum Latency: 176 usec, Avg Latency: 474 usec

Test of printHexField(uint16_t, char)
Time 1.51 sec
File size 115.63 KB
Write 76.63 KB/sec
Maximum latency: 24132 usec, Minimum Latency: 44 usec, Avg Latency: 68 usec

A real speedup would happen with a properly designed binary logger. That would require a fast external ADC.

I have a really fast library for MCP300x and MCP320x ADCs. Conversion happens in parallel with readout. It takes about 10 microseconds to read a 12 bit ADC value with my fast bit-bang driver.

Yes, hex is faster, but a bit laborious when working with the .csv afterwards. For example, 2E6 in your hex file above would be a pain to process in Excel :)

I am thinking about how to proceed with I2C. I have a 10DOF IMU (I2C) and want to log data from the accelerometer, gyro, magnetometer, barometer and a few ADC channels, say at a 10 ms period, with timestamps synced by an RTC (I2C), writing files with a maximum length of 65k lines, and not losing a single data record ;) Do we need an I2C driver for NilRTOS?

Hex is a pain. I have logged in binary and converted the file to text with a second pass. works this way. It can log 8-bit ADC samples on an Uno at up to 100 ksps. You get about 7 effective bits.
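The two-pass idea can be sketched on a PC like this. It is only an illustration, not the code of any existing logger: the Record layout, the channel count and the function names are assumptions, and the "file" is just a byte vector. The first pass stores raw records with no formatting cost, and the second pass turns them into decimal CSV afterwards, when time no longer matters.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

const int kChannels = 2;  // hypothetical channel count

// Hypothetical fixed-size binary record.
struct Record {
  uint16_t adc[kChannels];
  uint8_t overruns;
};

// First pass: append the raw bytes of one record, as a logger would
// write them to the SD card.
void appendBinary(std::vector<uint8_t>& out, const Record& r) {
  const uint8_t* p = reinterpret_cast<const uint8_t*>(&r);
  out.insert(out.end(), p, p + sizeof(Record));
}

// Second pass: convert the binary image to CSV text.
// Must run with the same struct layout that wrote the data.
std::string binaryToCsv(const std::vector<uint8_t>& in) {
  std::string csv;
  for (size_t off = 0; off + sizeof(Record) <= in.size();
       off += sizeof(Record)) {
    Record r;
    std::memcpy(&r, in.data() + off, sizeof(Record));
    for (int i = 0; i < kChannels; i++) {
      csv += std::to_string(r.adc[i]);
      csv += ',';
    }
    csv += std::to_string(r.overruns);
    csv += '\n';
  }
  return csv;
}
```

A converter for files written on the AVR would have to match that compiler's struct layout and byte order, which this host-only sketch sidesteps by writing and reading on the same machine.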

I have a very fast small AVR I2C driver that I plan to include with Nil. It is master only and much faster than Wire. It doesn't use interrupts and would work well in a high priority thread.

It doesn't use interrupts and would work well in a high priority thread.

Getting data from one sensor may take ~1.2 msec (at 100 kHz). Would such a blocking driver work properly with Nil?

PS: I've tried to log the BMP085 barometric sensor with the current Wire library. Reading the pressure:

p->barpress = (uint32_t)bmp.readPressure();

takes 34.5 ms at 16 MHz (including a lot of integer math). So log periods >37 ms work without an overrun. I think I may need a thread that reads all the I2C sensors in a loop, low-pass smoothing the data and updating it as fast as possible. The measuring thread would then just copy the current I2C sensor data into the Record_t.
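For the low-pass smoothing part, a first-order filter in integer math is probably enough. Below is a minimal sketch, assuming non-negative readings and a power-of-two smoothing factor so that no division is needed. The LowPass name and its interface are made up for illustration. Sharing the filtered value between the sensor thread and the measuring thread would still need some protection, which is not shown.

```cpp
#include <cstdint>

// First-order low-pass filter (exponential smoothing) in integer math.
// state holds the filtered value scaled by 2^shift, so the update
// y += (x - y) / 2^shift needs only additions and shifts.
struct LowPass {
  int32_t state;
  uint8_t shift;

  LowPass(int32_t initial, uint8_t s)
      : state(initial * (int32_t(1) << s)), shift(s) {}

  // Feed one new sample and return the current filtered value.
  int32_t update(int32_t sample) {
    state += sample - (state >> shift);
    return state >> shift;
  }
};
```

A larger shift smooths more but reacts more slowly: with shift 2, a step from 0 to 100 reads 25, 43, 58 on the first three updates.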

I think an interrupt based driver would help at 100 kHz.

Writing an interrupt-based I2C driver that sleeps may not be too hard.


I started looking at a general framework for I/O drivers on Nil. It needs to accommodate device sharing.

It's likely more than one thread will want to use I2C or ADC channels.

Not sure if I ought to be starting a new thread, rather than continue with this old one - but here goes!

NilRTOS looks promising for my project. I've got it more or less working with SCoopME, but want to see if NilRTOS is better suited, and more compact.

I've noted in all the examples the warning:

"Loop is the idle thread. The idle thread must not invoke any kernel primitive able to change its state to not runnable"

I'm not sure what this means and its implications for my project! My existing code has functions that access an LCD, and functions accessing a few digital sensors.

What sort of functions do I need to look out for, please?


How do I allocate memory on the heap within a task? malloc and new always return null unless you delete malloc.c in the Arduino folder.

How come, and is that a solution?


"..Static architecture, everything is statically allocated at compile time.."


Yes, so? It just says that Nil is statically allocated.

a) I am having some rapid success with NilRTOS.

b) But I am not seeing any evidence of a preemptive ability -- i.e. one thread interrupting another. It seems to be cooperative, which is fine, but very different. By this, I mean, you must surrender control (eg sleep) from Task B before a higher-priority Task A can run.

Maybe I am wrong, and the documentation jungle is fooling me, but does anyone else have an opinion on this?

c) The documentation is difficult because it is vague. There are rather many functions or macros that seem to do the same thing, with no reason why they exist separately. Not even a single letter of difference in their descriptions. All in all, the documentation is the greatest obstacle.

d) I would currently say that I am restricting my code to perhaps 5% of the apparent command set where the examples definitely indicate appropriate usage.


Interrupt-driven preemptive multitasking would be nice, but cooperative multitasking can be easier for newbies to grok.

As far as documentation goes, it's an open-source project. I'm planning to peek into it soon, and my plans for the immediate future are to try documenting the code with examples. Having a functional "RTOS" for Arduino is a wonderful gift to the community, and as the old saying goes, I'm not going to look this gift horse in the mouth :D