Does Arduino need a real-time scheduler?

Would an extremely small preemptive RTOS appeal to Arduino users?

I have been playing with an experimental RTOS written by Giovanni Di Sirio, the author of ChibiOS/RT.

His goal is to build the smallest possible kernel for tiny chips. Giovanni calls the system Nil RTOS since the goal is a zero size kernel.

Nil RTOS has only the most fundamental functionality.

A preemptive fixed priority scheduler.

Counting semaphores that can signal from a thread or ISR.

Sleep until a specified time and sleep for a specified period.

Here is an example sketch with a total size under 2KB on an Uno:

// Connect a scope to pin 13.
// Measure difference in time between first pulse with no context switch
// and second pulse started in thread 2 and ended in thread 1.
// Difference should be about 10 usec on a 16 MHz 328 Arduino.
#include <NilRTOS.h>

const uint8_t LED_PIN = 13;

// Semaphore used to trigger a context switch.
Semaphore sem = {0};
//------------------------------------------------------------------------------
/*
 * Thread 1 - high priority thread to set pin low.
 */
NIL_WORKING_AREA(waThread1, 128);
NIL_THREAD(Thread1, arg) {

  while (TRUE) {
    // wait for semaphore signal
    nilSemWait(&sem);
    // set pin low
    digitalWrite(LED_PIN, LOW);
  }
}
//------------------------------------------------------------------------------
/*
 * Thread 2 - lower priority thread to toggle LED and trigger thread 1.
 */
NIL_WORKING_AREA(waThread2, 128);
NIL_THREAD(Thread2, arg) {

  pinMode(LED_PIN, OUTPUT);
  while (TRUE) {
    // first pulse to get time with no context switch
    digitalWrite(LED_PIN, HIGH);
    digitalWrite(LED_PIN, LOW);
    // start second pulse
    digitalWrite(LED_PIN, HIGH);
    // trigger context switch for task that ends pulse
    nilSemSignal(&sem);
    // sleep until next tick (1024 microseconds tick on Arduino)
    nilThdSleep(1);
  }
}
//------------------------------------------------------------------------------
/*
 * Threads static table, one entry per thread. Thread priority is determined
 * by position in table.
 */
NIL_THREADS_TABLE_BEGIN()
NIL_THREADS_TABLE_ENTRY("thread1", Thread1, NULL, waThread1, sizeof(waThread1))
NIL_THREADS_TABLE_ENTRY("thread2", Thread2, NULL, waThread2, sizeof(waThread2))
NIL_THREADS_TABLE_END()
//------------------------------------------------------------------------------
void setup() {
  // Start nil.
  nilBegin();
}
//------------------------------------------------------------------------------
void loop() {
  // Not used.
}

I wrote this sketch to determine the performance of Nil RTOS. I was amazed to find how fast it is. The time to signal a semaphore, do a contex switch and take the semaphore is only about 12 microseconds on an Uno.