runningMedian class on playground

A question in this forum - http://arduino.cc/forum/index.php/board,3.0.html - triggered me to implement a small runningMedian class. This can be found on - http://arduino.cc/playground/Main/RunningMedian. A sample sketch shows how one can use this class.

The median is defined as the middle value of an array of sorted values. The two main advantages of using the median above an average is that it is not influenced by a single outlier and it allways represent a real measurement. On the other hand by averaging one could create extra precision. As all things in the world this library is eternally under construction :slight_smile:

Please post comments and improvements in this thread,

thanks,
Rob

Please post comments and improvements in this thread,

You asked for it. :wink:

First... Thank you for the contribution.

void RunningMedian::add(long value)
{
...
	_idx %= _size;

...modulus is expensive. Use if instead...

void RunningMedian::add(long value)
{
...
	if ( _idx >= _size )
		_idx = 0;
protected:
	int _size;
	int _cnt;
	int _idx;

It's very unlikely the size, count, and index need to be a full integer. An 8-bit data-type is more appropriate. You may have to continue using a signed data-type.

Given the fact that _size is a fixed size, it should be turned into a static const.

I don't think it's necessary to perform a complete sort. I believe a heap would get the result you want.

I believe 5 is a very good choice for _size. Given the fact that _size is fixed at 5, you may want to "unroll" the loops. I suspect the total code size will be reduced and the execution time will obviously improve.

Thanks for the remarks:

  • % removed
  • _cnt, _idx => uint8_t
  • _size => static const uint8_t

I know you are right about the sorting, it is enough as half+1 of the array is sorted.

I don't unroll the loops as it would be take extra maintenance if the internal array becomes 7 or 9 or ...
maybe I better turn _size into a #define

food for thought, thanx

This is a great benefit by dealing with sensor data. Especially due to the fact that the median is really robust against outlier!

How can I move the hardcoded 5 items value for computing the median from RunningMedian.h file to the scetch? Im am unfamiliar with modifying or overrice values from a library, so be patient with me. :wink: I can change the line "#define MEDIAN_SIZE 5" to "#define MEDIAN_SIZE 11" but it would be more convenient to do this not in the library directly but in the sketch. Can I use

{

#include <RunningMedian.h>
#define MEDIAN_SIZE 11

RunningMedian samples = RunningMedian();

void setup()
...

It would be great if I can not set MEDIAN_SIZE globally--for all calculated median the same array lenght--but also different MEDIAN_SIZE if I deal with different and more than one sensor input.

Thanks for your interest, The median is indeed robust against outliers, and you can still do a running average to smooth the values additionally.

The advantage of the #define construct is that the intenal data structure is allocated compile time. If you use a parameter you want to allocate this runtime. Or allocate a fixed - and too large - array compile time and use only a part of it. Something like this:

#include <RunningMedian.h>

RunningMedian samples1 = RunningMedian(11);
RunningMedian samples2 = RunningMedian(5);

void setup()
...

Arduino and dynamic memory are not the bests friend imho (it works with a but or 2) Unfortunately I'm quite busy at the moment to write (and test) a parameterized version of the lib....

Updated the RunningMedian library to 0.1.02 on the playground - http://arduino.cc/playground/Main/RunningMedian - today:

  • constructor with size which should be between 1 and 19, if no param is given it defaults to size =5;
    (fized size arrays internally)

  • default type to float (can be made long quite easily)

  • extra functions getLowest() and getHighest()

  • extra function getAverage() of the buffer

  • a test if no values are added, the functions will return NAN to indicate error.

The new functions can easily be commented out (in .h and .CPP) to minimize the footprint of the class.

as allways remarks are welcome

Hi Rob, that is great, thanks a lot!

Updated the RunningMedian library to 0.2.00 on the playground - http://arduino.cc/playground/Main/RunningMedian - today:

Thanks go to Ronny for his work on the new templated version!

changes:

  • added a template version so it can be used for any datatype (to be tested more thoroughly :slight_smile:
  • added int getSize() to see the buffer size.
  • added int getCount() to see how far the internal buffer is filled.

There are differences between the versions, breaking backwards compatibility.

The 0.1.02 version (non template) is still available.

as allways remarks are welcome

Hello,

I did a new implementation using the same ideas, less features but with improved performance.

//
// Released to the public domain
//
// Remarks:
// This is a lean but fast version. 
// Initially, the buffer is filled with a "default_value". To get real median values
// you have to fill the object with N values, where N is the size of the sliding window.
// For example: for(int i=0; i < 32; i++) myMedian.addValue(readSensor());
// 
// Constructor: 
// FastRunningMedian<datatype_of_content, size_of_sliding_window, default_value> 
// maximim size_of_sliding_window is 255
// Methods:
// addValue(val) adds a new value to the buffers (and kicks the oldest)
// getMedian() returns the current median value
//
//
// Usage: 
// #include "FastRunningMedian.h"
// FastRunningMedian<unsigned int,32, 0> myMedian;
// ....
// myMedian.addValue(value); // adds a value
// m = myMedian.getMedian(); // retieves the median
//

#include <inttypes.h>

template <typename T, uint8_t N, T default_value> class FastRunningMedian {

public:
	FastRunningMedian() {
		_buffer_ptr = N;
		_window_size = N;
		_median_ptr = N/2;

		// Init buffers
		uint8_t i = _window_size;
		while( i > 0 ) {
			i--;
			_inbuffer[i] = default_value;
			_sortbuffer[i] = default_value;
		}
	};

	T getMedian() {
		// buffers are always sorted.
		return _sortbuffer[_median_ptr];
	}

	
	void addValue(T new_value) {
		// comparision with 0 is fast, so we decrement _buffer_ptr
		if (_buffer_ptr == 0)
			_buffer_ptr = _window_size;
		
		_buffer_ptr--;
		
		T old_value = _inbuffer[_buffer_ptr]; // retrieve the old value to be replaced
		if (new_value == old_value) 		  // if the value is unchanged, do nothing
			return;
		
		_inbuffer[_buffer_ptr] = new_value;  // fill the new value in the cyclic buffer
			
		// search the old_value in the sorted buffer
		uint8_t i = _window_size;
		while(i > 0) {
			i--;
			if (old_value == _sortbuffer[i])
				break;
		}
		
		// i is the index of the old_value in the sorted buffer
		_sortbuffer[i] = new_value; // replace the value 

		// the sortbuffer is always sorted, except the [i]-element..
		if (new_value > old_value) {
			//  if the new value is bigger than the old one, make a bubble sort upwards
			for(uint8_t p=i, q=i+1; q < _window_size; p++, q++) {
				// bubble sort step
				if (_sortbuffer[p] > _sortbuffer[q]) {
					T tmp = _sortbuffer[p];
					_sortbuffer[p] = _sortbuffer[q];
					_sortbuffer[q] = tmp;
				} else {
					// done ! - found the right place
					return;
				}
			}
		} else {
			// else new_value is smaller than the old one, bubble downwards
			for(int p=i-1, q=i; q > 0; p--, q--) {
				if (_sortbuffer[p] > _sortbuffer[q]) {
					T tmp = _sortbuffer[p];
					_sortbuffer[p] = _sortbuffer[q];
					_sortbuffer[q] = tmp;
				} else {
					// done !
					return;
				}
			}
		}
	}
	
private:
	// Pointer to the last added element in _inbuffer
	uint8_t _buffer_ptr;
	// sliding window size
	uint8_t _window_size;
	// position of the median value in _sortbuffer
	uint8_t _median_ptr;

	// cyclic buffer for incoming values
	T _inbuffer[N];
	// sorted buffer
	T _sortbuffer[N];
};

#endif
// --- END OF FILE ---

test code:

#include "FastRunningMedian.h"
#include "RunningMedianStatic.h"

unsigned int value = 0;

FastRunningMedian<unsigned int,32, 0> newMedian;
RunningMedianStatic staticMedian = RunningMedianStatic(32);

void setup ()
{
  while (!Serial) ;
  Serial.begin(9600);
  Serial.println("Starting....");
};

void loop() 
{
  unsigned int xmedian;
  unsigned int cmedian;

  // us measurement overhead is 4 us.
  value = random(0, 255);
  unsigned long loopstart = micros();
  newMedian.addValue(value);
  xmedian = newMedian.getMedian();
  unsigned long loopstop = micros();

  unsigned long fastdelay = loopstop - loopstart;
  
  loopstart = micros();
  staticMedian.add(value);
  cmedian = staticMedian.getMedian();
  loopstop = micros();
  
  unsigned long slowdelay = loopstop - loopstart;
   
  Serial.print("Loop FastTime(us)=");
  Serial.print(fastdelay);
  Serial.print(" SlowTime(us)=");
  Serial.print(slowdelay);
  Serial.print("  value=");
  Serial.print(value);
  Serial.print(" median=");
  Serial.print(xmedian);
  Serial.print(" excactmedian=");
  Serial.print(cmedian);
  Serial.println();

  delay(500);
};

Enjoy !

thanks for sharing,

I do not have time to test it , 3 questions:

Can you post the output ?
how about the footprint? smaller / larger same?

Rob
Thanks for sharing. I think I will use this one one day.

robtillaart:
I do not have time to test it , 3 questions:
Can you post the output ?
how about the footprint? smaller / larger same?

A sample output is:

Loop FastTime(us)=28 SlowTime(us)=64  value=197 median=0 excactmedian=175
Loop FastTime(us)=24 SlowTime(us)=92  value=114 median=0 excactmedian=158
Loop FastTime(us)=24 SlowTime(us)=104  value=68 median=0 excactmedian=158
Loop FastTime(us)=36 SlowTime(us)=124  value=188 median=0 excactmedian=158
Loop FastTime(us)=28 SlowTime(us)=144  value=109 median=0 excactmedian=158
Loop FastTime(us)=32 SlowTime(us)=168  value=120 median=0 excactmedian=120
Loop FastTime(us)=32 SlowTime(us)=196  value=80 median=0 excactmedian=120
Loop FastTime(us)=32 SlowTime(us)=232  value=102 median=0 excactmedian=114
Loop FastTime(us)=32 SlowTime(us)=256  value=47 median=0 excactmedian=114
Loop FastTime(us)=36 SlowTime(us)=288  value=102 median=0 excactmedian=109
Loop FastTime(us)=52 SlowTime(us)=340  value=143 median=19 excactmedian=114
Loop FastTime(us)=52 SlowTime(us)=364  value=142 median=38 excactmedian=114
Loop FastTime(us)=36 SlowTime(us)=404  value=79 median=47 excactmedian=114
Loop FastTime(us)=64 SlowTime(us)=444  value=160 median=68 excactmedian=114
Loop FastTime(us)=40 SlowTime(us)=492  value=52 median=68 excactmedian=114
Loop FastTime(us)=32 SlowTime(us)=524  value=3 median=68 excactmedian=109
Loop FastTime(us)=68 SlowTime(us)=576  value=124 median=79 excactmedian=114
Loop FastTime(us)=60 SlowTime(us)=624  value=114 median=80 excactmedian=114
Loop FastTime(us)=36 SlowTime(us)=676  value=32 median=80 excactmedian=114
Loop FastTime(us)=52 SlowTime(us)=724  value=70 median=80 excactmedian=109
Loop FastTime(us)=40 SlowTime(us)=784  value=18 median=80 excactmedian=109
Loop FastTime(us)=96 SlowTime(us)=856  value=189 median=102 excactmedian=109
Loop FastTime(us)=88 SlowTime(us)=900  value=123 median=102 excactmedian=114
Loop FastTime(us)=76 SlowTime(us)=972  value=116 median=109 excactmedian=114
Loop FastTime(us)=108 SlowTime(us)=1036  value=190 median=114 excactmedian=114
Loop FastTime(us)=116 SlowTime(us)=1104  value=247 median=114 excactmedian=114
Loop FastTime(us)=60 SlowTime(us)=1164  value=56 median=114 excactmedian=114
Loop FastTime(us)=80 SlowTime(us)=1164  value=17 median=114 excactmedian=114
Loop FastTime(us)=96 SlowTime(us)=1168  value=157 median=114 excactmedian=114
Loop FastTime(us)=32 SlowTime(us)=1168  value=230 median=114 excactmedian=114
Loop FastTime(us)=52 SlowTime(us)=1164  value=3 median=114 excactmedian=114
Loop FastTime(us)=24 SlowTime(us)=1164  value=139 median=114 excactmedian=114
Loop FastTime(us)=56 SlowTime(us)=1164  value=79 median=114 excactmedian=114
Loop FastTime(us)=56 SlowTime(us)=1172  value=204 median=114 excactmedian=114
Loop FastTime(us)=36 SlowTime(us)=1168  value=66 median=114 excactmedian=114
Loop FastTime(us)=68 SlowTime(us)=1172  value=22 median=109 excactmedian=109
Loop FastTime(us)=60 SlowTime(us)=1164  value=167 median=114 excactmedian=114
Loop FastTime(us)=52 SlowTime(us)=1164  value=208 median=114 excactmedian=114
Loop FastTime(us)=48 SlowTime(us)=1172  value=141 median=116 excactmedian=116
Loop FastTime(us)=48 SlowTime(us)=1172  value=155 median=123 excactmedian=123
Loop FastTime(us)=64 SlowTime(us)=1172  value=125 median=124 excactmedian=124
Loop FastTime(us)=60 SlowTime(us)=1164  value=158 median=125 excactmedian=125

for the first n rounds the result values are incorrect, but afterwards they are identical. Your code takes around 1170us, mine below 100us in most cases. Your timing is more stable :slight_smile:

Size of code is almost identical, my code a tiny bit smaller. I am not sure if this depends on the data type.

Static memory and stack usage is also more or less the same.

Your add-method is fast, the median calculation is slow.
My add-method does the whole work, my median returning is fast.

If you add a lot of values and retrieve the median only seldom there is maybe a speed advantage with your method.

Thank you for your feedback - rk

Very decent analysis,
It shows a bug in my code :blush: as when there is one value or more there should also be a median (iso zero)
Thanks for pointing out.

Can you do a test with a small buffer e.g size 7 ?
I like to know how the size of internal array affects the timing of both algorithms.

robtillaart:
Can you do a test with a small buffer e.g size 7 ?
I like to know how the size of internal array affects the timing of both algorithms.

This is with buffer size 7:

Loop FastTime(us)=16 SlowTime(us)=80  value=116 median=114 excactmedian=114
Loop FastTime(us)=20 SlowTime(us)=92  value=190 median=116 excactmedian=116
Loop FastTime(us)=32 SlowTime(us)=88  value=247 median=123 excactmedian=123
Loop FastTime(us)=16 SlowTime(us)=84  value=56 median=123 excactmedian=123
Loop FastTime(us)=16 SlowTime(us)=88  value=17 median=123 excactmedian=123
Loop FastTime(us)=12 SlowTime(us)=92  value=157 median=123 excactmedian=123
Loop FastTime(us)=20 SlowTime(us)=88  value=230 median=157 excactmedian=157
Loop FastTime(us)=20 SlowTime(us)=80  value=3 median=157 excactmedian=157
Loop FastTime(us)=16 SlowTime(us)=80  value=139 median=139 excactmedian=139
Loop FastTime(us)=20 SlowTime(us)=80  value=79 median=79 excactmedian=79
Loop FastTime(us)=20 SlowTime(us)=80  value=204 median=139 excactmedian=139
Loop FastTime(us)=16 SlowTime(us)=80  value=66 median=139 excactmedian=139
Loop FastTime(us)=20 SlowTime(us)=80  value=22 median=79 excactmedian=79
Loop FastTime(us)=12 SlowTime(us)=84  value=167 median=79 excactmedian=79

and this with buffer size 5:

Loop FastTime(us)=20 SlowTime(us)=60  value=142 median=102 excactmedian=102
Loop FastTime(us)=16 SlowTime(us)=52  value=79 median=102 excactmedian=102
Loop FastTime(us)=24 SlowTime(us)=52  value=160 median=142 excactmedian=142
Loop FastTime(us)=16 SlowTime(us)=48  value=52 median=142 excactmedian=142
Loop FastTime(us)=16 SlowTime(us)=48  value=3 median=79 excactmedian=79
Loop FastTime(us)=12 SlowTime(us)=48  value=124 median=79 excactmedian=79
Loop FastTime(us)=12 SlowTime(us)=48  value=114 median=114 excactmedian=114
Loop FastTime(us)=20 SlowTime(us)=48  value=32 median=52 excactmedian=52
Loop FastTime(us)=12 SlowTime(us)=52  value=70 median=70 excactmedian=70
Loop FastTime(us)=16 SlowTime(us)=52  value=18 median=70 excactmedian=70

this with buffer size 3:

Loop FastTime(us)=12 SlowTime(us)=28  value=102 median=102 excactmedian=102
Loop FastTime(us)=16 SlowTime(us)=28  value=47 median=80 excactmedian=80
Loop FastTime(us)=12 SlowTime(us)=28  value=102 median=102 excactmedian=102
Loop FastTime(us)=8 SlowTime(us)=28  value=143 median=102 excactmedian=102
Loop FastTime(us)=16 SlowTime(us)=28  value=142 median=142 excactmedian=142
Loop FastTime(us)=12 SlowTime(us)=28  value=79 median=142 excactmedian=142
Loop FastTime(us)=12 SlowTime(us)=28  value=160 median=142 excactmedian=142

even with buffer size 1 there is a slight advantage: :-}}

Loop FastTime(us)=8 SlowTime(us)=12  value=143 median=143 excactmedian=143
Loop FastTime(us)=12 SlowTime(us)=16  value=142 median=142 excactmedian=142
Loop FastTime(us)=8 SlowTime(us)=16  value=79 median=79 excactmedian=79
Loop FastTime(us)=12 SlowTime(us)=16  value=160 median=160 excactmedian=160
Loop FastTime(us)=8 SlowTime(us)=12  value=52 median=52 excactmedian=52

the point is, that your algorithm takes O(n²) with full filled buffer (for add(); getMedian()), if i understand it right. My algorithm needs O(n) (ahem, anything between O(1) and O(2*n) in comparison)

for smaller buffers the difference is academic, but for my project I use several instances with n=33, where taking more than 1ms per median puzzled me a little bit :-))

-rk

Thanks for the analysis,
Indeed your algorithm is O(n) mine is O(n^2).

I wonder if it is possible to get to O(log(n)) ?

  • linked list
  • skiplist
  • tree (balanced)
  • heap

...which is fairly easy to implement with an array.

I have used the code of rkail and it works great :slight_smile:
Is there any reason why it is not available on the playground?
Best regards
Jantje

The playground article points to this thread, I will add a note (+link) that there is a substantial faster version.

update: done

I must have missed the link to this thread :slight_smile:
Thanks for the update
Jantje

I was looking for a way to average degrees for a project I'm working on and didn't want to use vector addition. So, I've implemented a median function in the RunningMedian library for compass readings to compensate for hassles of degree wraps at 359-0. This is based a previous version, not the current version of the library. Feel free to incorporate it into your library for others to use, if you wish.

/* function by WethaGuy to allow determination of median of a set of degrees, accounting for the 359-0 wrap */
int RunningMedian::getMedianDegrees()
{
	if (_cnt > 0){
        int index = 0;
		sort();

	/*check for valid compass value 0-359*/
        if (_as[_cnt-1] > 359 || _as[0] < 0){return NAN;}

	/*check 359-0 wrap*/
	if (_as[_cnt-1]-_as[0] > 180){

             /*find the index value of wrap*/
            while(_as[index+1]-_as[index] < 180){
                index++;
            }

            /* index value is less than the center of the list, the median is the sum of the index and the center of the list plus one (to account for odd number total list size) */
            if (index < (_cnt/2)){
            	return _as[(_cnt/2) + index +1];

            /*index value is not less than center of list, the median is the difference between center of the list and the index */
            }else{
            	return _as[index -(_cnt/2)];
            }
         }
	     /* no 360-0 wrapping needed */
	     return _as[_cnt/2];
	}
	return NAN;
}

edited code to correct error, 8/20/13.