NewPing Library: HC-SR04, SRF05, SRF06, DYP-ME007, Parallax PING))) - v1.7

Okay, I believe I've got the new digital filtering method down to as lean as it's going to get with all the kinks worked out. For the coders out there, here's the method:

unsigned int NewPing::ping_ave(uint8_t it) {
	int uS[it], last, median;
	uint8_t i, j, cnt = 0;
	unsigned long total = 0;
	uS[0] = last = ping();   // Send first ping (done here to assist insertion sort).
	for (i = 1; i < it; i++) {                                                    // Ping iterations.
		if (last != NO_ECHO) cnt++;                                               // Array median tracking.
		delay(PING_AVE_DELAY - ((last == NO_ECHO ? _maxEchoTime : last) / 1000)); // Delay between pings.
		last = ping();                                                            // Send ping.
		for (j = i; j > 0 && uS[j - 1] < last; j--) uS[j] = uS[j - 1];            // Insertion sort (big to small).
		uS[j] = last;                                                             // Add last ping to array in correct order.
	}
	if (last != NO_ECHO) cnt++; // Array median tracking.
	median = uS[cnt >> 1];      // Find median.
	cnt = 0;                    // Reset counter.
	for (i = 0; i < it; i++) {                           // Loop through results.
		if (abs(median - uS[i]) < US_ROUNDTRIP_CM * 2) { // Exclude values outside +/-2cm range (digital noise).
			total += uS[i];                              // Building truncated mean.
			cnt++;                                       // Increment counter.
		}
	}
	return (total / cnt); // Return the ping distance mean minus digital noise (truncated mean).
}

To actually implement it you would need to define PING_AVE_DELAY to 29 and add the method to the class (I set the iterations [it] to default to 5). I know not everyone is adept at doing this, but this is a preview for those that are. Also, if there's any statistical gurus out there that want to poke holes in my code, feel free (statistics is not my thing).

The method uses two passes to calculate a truncated mean with outliner pings removed (digital noise). The first pass collects the pings, keeps track of successful pings, and sorts on-line using the very simple and fast insertion sort. Before the second pass it then calculates the median ping (with the unsuccessful pings filtered out).

Once the median value is found, it then does another pass to calculate the truncated mean. Outliner pings are those outside +/-2cm (these are not calculated as part of the mean). Because insertion sort is done on-line instead of in another pass (like a typical bubble sort), even the max of 255 ping iterations can be done without any perceived slow-down (although 255 is a little excessive, 3-9 is typically plenty). Insertion sort is very fast with a small quantity of values to sort. So fast that highly complex sort algorithms like quicksort resort to insertion sort when there's only a few values to sort (like in this case). Also, insertion sort has a very small stack footprint (only one variable) and extremely small code size (because it's so simple, 2 lines of code). It's really the only sort to use on Arduino.

In laymen terms, the new method does a user-specified number of pings as fast as it can (29ms per ping) ignores out of range pings, sorts the ping values, gets the middle distance ping value, filters out pings outside +/-2cm of the middle distance ping, and returns the average of these pings in microseconds.

unsigned int us = sonar.ping_ave(); // Defaults to 5 iterations, returns the truncated mean in microseconds.
unsigned int us = sonar.ping_ave(3); // Do 3 iterations, returns the truncated mean in microseconds.
unsigned int cm = sonar.convert_cm(sonar.ping_ave(9)); // Do 9 iterations and get the truncated mean in cm.

A few explanations about the method specific to the programming. First, some things may not appear as clean as they could be. In many cases this is done to make the code smaller once compiled. By making these changes around 100 bytes was saved on the final compiled code. I believe this is a fair compromise with the limited memory in the ATmega. Secondly, there are some items that may not be statistically perfect. An example is the median when there's an even number of values. The code only picks one median value, even if there technically should be the average of two. This again was done to reduce compiled code size and it also doesn't make much difference as there's a lot of fuzzy logic in a truncated mean anyway.

Basically, the function could be re-worked to be more clear, logical, and maybe slightly more statistically accurate. But, that's technically how the function started. I tweaked it over the course of a few days to try and squeeze as many bytes as I could out of it.

With that said, if there's something wrong, a suggestion, or another performance tweak that doesn't make the compiled code larger, please let me know (that's one of the reasons I'm posting the method before it's released).

Tim