Moving averages for two separate integers once an hour: how to conceive the prog

What would be the best programming approach for the following situation:

  1. once an hour a controller (an Arduino Uno, a Pro Mini or an ESP8266) will receive the following information over the hardware serial port:
    a. integer A (value A)
    b. integer B (current time, in minutes)
    c. integer C (the current time, in hours)
    d. a letter designating the source (either X or Y)
  2. the moving average needs to be calculated for value A, with a separate average kept for each source (X or Y)
  3. the resulting moving averages need to be made available in 2 new variables, one per source (X and Y). These will be used for further action (irrigation system).
  4. an algorithm is in place to indicate the arrival of a new value; this indicator is reset when the new value is read. There will be no conflict between the two sources because their values arrive about half an hour apart.

Example:
Source X sends the values 100, 110 and 115 consecutively, one per hour, in the first 3 hours of operation. The first moving average must be 100; at the second hour it is the average of 100 and 110 = 105; after the third hour it is the average of 105 and 115 = 110.
In the fourth hour the value 120 is sent, average = 115; the next hour the value 125 is sent, average = 120.

Exception: if the difference between the new incoming value and the previous average is higher than a certain percentage (suppose for the example's sake 10%) then a flag must be set until the next value comes in that deviates less than this percentage from the last calculated moving average.

Summary of project: this serves to determine the irrigation needs for a system using remote soil moisture sensors (this is a private project for my gardening needs).

Are you attempting something along the lines of:

void setup() {
  // put your setup code here, to run once:
  Serial.begin(9600);
  Serial.setTimeout(10000000ul);
}

void loop() {
  static int average1 = 0;        // 0 means "no average yet"
  int new1 = Serial.parseInt();   // waits for an integer or until the timeout expires
  if (average1 == 0) {
    average1 = new1;              // first value becomes the first average
  } else if (abs(new1 - average1) <= average1 / 10) {
    // ignore the value if it differs by more than 10%, otherwise average it in
    average1 = (average1 + new1) / 2;
  }
  Serial.print("average 1 = ");
  Serial.println(average1);
}

If I give it the data "100 110 115 167 145 120 125" it prints:

average 1 = 100
average 1 = 105
average 1 = 110
average 1 = 110
average 1 = 110
average 1 = 115
average 1 = 120

@Horace, what does Serial.parseInt() return when there's no integer available on Serial? The original spec said once per hour, so there's a few thousand times that your program will be wrong.

@Brice, that's actually a pretty good specification. It sounds more like homework than a personal project.

You've specified an exponential filter, not a moving average. But it's easy enough to do, as Horace showed.

The 10% exception will get you into trouble. If it rains during an hour then the measured moisture may move by more than the threshold. From that point on, the device will no longer accept any updates until several days later when the actual moisture gets back to the 10% threshold. You need to consider a way out of this trap. Maybe if you get two consecutive values which are within 10% of each other, then you start using the new values. Personally, I would use a real moving average which actually stores a little history and then apply some discard rule, like average 10 readings out of the last 12 by throwing away the highest and lowest values.
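The discard rule mentioned above (average 10 readings out of the last 12 by throwing away the highest and lowest) could be sketched roughly as follows. The function name `trimmedAverage` and the choice to return 0 for fewer than 3 readings are assumptions made for this illustration, not something from the original posts.

```cpp
#include <stddef.h>

// Hypothetical helper: average `count` readings after discarding the single
// highest and the single lowest value (for example 12 readings in, 10 averaged).
// Needs at least 3 readings; returns 0 otherwise (an arbitrary choice).
long trimmedAverage(const int *readings, size_t count) {
  if (count < 3) return 0;
  long sum = 0;  // long, so the total cannot overflow a 16-bit int on an Uno
  int lowest = readings[0];
  int highest = readings[0];
  for (size_t i = 0; i < count; i++) {
    sum += readings[i];
    if (readings[i] < lowest) lowest = readings[i];
    if (readings[i] > highest) highest = readings[i];
  }
  // Remove one copy of each extreme, then average what is left.
  return (sum - lowest - highest) / (long)(count - 2);
}
```

A single rain spike or a single dropout in the 12 stored readings then has no influence on the result, and the filter cannot get stuck the way the 10% rule can.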

I suggest you start with Serial Input Basics Updated. Once you get the serial input working then the rest should be easy.

Note the setup() function, where I set a long timeout so that parseInt() keeps waiting for the next value.
Otherwise, if parseInt() returns 0 (which it does on timeout), ignore the input. That is a problem if some of the data could be 0.
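One way around the "0 means no data" ambiguity is to parse a complete line and report success separately from the value. This is only a sketch: the line format ("A B C S", for example "512 30 14 X"), the `Reading` struct and `parseReading` are all assumptions, since the thread does not say how the four fields are framed on the wire.

```cpp
#include <stdio.h>

// Hypothetical container for one hourly message.
struct Reading {
  int value;    // integer A
  int minutes;  // integer B
  int hours;    // integer C
  char source;  // 'X' or 'Y'
};

// Returns true only if all four fields were found and the source is X or Y.
// A value of 0 is then legitimate data rather than "nothing received".
bool parseReading(const char *line, Reading *out) {
  Reading r;
  int fields = sscanf(line, "%d %d %d %c",
                      &r.value, &r.minutes, &r.hours, &r.source);
  if (fields != 4) return false;
  if (r.source != 'X' && r.source != 'Y') return false;
  *out = r;
  return true;
}
```

The line itself would be collected first, for example with the techniques from Serial Input Basics, and only handed to the parser once it is complete.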

Hi MorganS, thank you for your reply, I had not realised this is not a moving average calculation, I will redo my homework and respond back here with a more appropriate proposal for moving average calculation.

Just a query: should I use an array, or do you have a better way to store the last 4, 5, 10 or whatever number of values, or a different way to calculate this?

About Serial Basics from RobinS, I already did my homework there using his truly fantastic piece of work. I have a working piece of software using his input to communicate wirelessly (HC12) between Arduinos.

About the spec: it is indeed for private use, but I learned that the better the spec and the more concise the question posed, the better the answers are.

To do a proper moving average, you must store the last N elements. An array is the best way to do this.

The specific form of array is called a "circular buffer". You want to avoid moving array elements once they are saved. So once you have filled all N elements and need to add the next item, instead of moving them all down and throwing away element zero, you just wrap your index back to zero and overwrite that element. The last N elements are then stored in 1 to (N-1) and 0. (Google it for a better explanation.)

There is a mathematical trick to calculating the moving average. You store the total of all elements (in a variable large enough to store a number this big) and then each element added is added to the total. Before you store that in the array, you take the element of the array which is going to be discarded and subtract that from the total. That way the total always contains the sum of all the elements stored in the array and you don't have to cycle over the array to add it up each time your code actually reads the average.
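A minimal sketch of the circular buffer with the running total might look like this. The window size of 4 and the names `MovingAverage`, `maInit` and `maAdd` are assumptions chosen for illustration. While the buffer is still filling, this version averages over the samples received so far, which is one possible choice among several.

```cpp
#include <stddef.h>

const size_t WINDOW = 4;  // assumed window size N

struct MovingAverage {
  int samples[WINDOW];
  size_t next;   // slot that will be overwritten by the next sample
  size_t count;  // number of valid samples so far (at most WINDOW)
  long total;    // running sum of the valid samples
};

void maInit(MovingAverage *ma) {
  ma->next = 0;
  ma->count = 0;
  ma->total = 0;
}

// Stores one sample and returns the average of the last (up to) WINDOW samples.
int maAdd(MovingAverage *ma, int value) {
  if (ma->count == WINDOW) {
    ma->total -= ma->samples[ma->next];  // subtract the sample being discarded
  } else {
    ma->count++;                         // buffer is still filling up
  }
  ma->samples[ma->next] = value;
  ma->total += value;
  ma->next = (ma->next + 1) % WINDOW;    // wrap around instead of shifting
  return (int)(ma->total / (long)ma->count);
}
```

For the two sources you would keep one `MovingAverage` for X and one for Y and call `maAdd` on whichever matches the incoming letter.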

Hi MorganS, first of all I will not use the above "10% rule", you are right in that it will upset the results and program flow.
The circular buffer (FIFO) uses the buffer size to give a weighting to each (new) value. While doing some thinking about this subject I remembered a post (in French: LOCODUINO - Réalisation de centrales DCC avec le logiciel libre DCC++ (1)) about model railway enthusiasts who developed an Arduino-based DCC train control system. In that program the current consumption is measured (thousands of times a second) and a moving average is calculated using a weighting value for each new measurement.
The actual formula used is:

A = coeff x newValue + (A-1) x (1 - coeff)

where
A-1 = the previously calculated value
newValue = the newly incoming value
A = newly calculated value
coeff = a value between 0.00000001 (something very low) and 1

This in effect creates a low-pass filter. For thousands of readings per second a low value would be suitable; for my case with one value per hour I would rather use a value of 0.5 (to dampen the impact of fast changes, i.e. a rain downpour).
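Written out with indices, the formula with coeff = 0.5 reduces to the halving rule from the example in the first post. The notation here is my own: c stands for coeff, x_n for newValue, A_{n-1} for "A-1", and the first value is assumed to be taken over unchanged (A_1 = x_1).

```latex
\begin{align*}
A_n &= c\,x_n + (1-c)\,A_{n-1} \\
c = 0.5:\quad A_n &= \tfrac{1}{2}\,\bigl(x_n + A_{n-1}\bigr) \\
A_1 &= 100,\quad
A_2 = \tfrac{1}{2}(110 + 100) = 105,\quad
A_3 = \tfrac{1}{2}(115 + 105) = 110 \\
A_n &= c \sum_{k=0}^{n-2} (1-c)^{k}\,x_{n-k} \;+\; (1-c)^{\,n-1}\,x_1
\end{align*}
```

The last line shows why it is called exponentially weighted: each older value still contributes, but with a weight that shrinks by a factor (1 - c) per step.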

I realise this is not the same as using a circular buffer but the effect I think would not be very different?

As always, this depends on what you mean by "very different". How you calculate the moving average, and whether or not it is a simple moving average, a weighted moving average, or an exponentially weighted moving average is a matter of the data and your choice of approach.

As said before, the approach you suggest is an exponentially weighted moving average.

Clearly you can use a circular buffer to generate a simple moving average with equal weights, or a weighted moving average.

The formula you provided is an array-free method of calculating the exponential moving average.

You may want to model some data streams to see which approach produces the result which is closest to your intent.

Thanks for the links cattledog, I have some research reading to do.

At this moment I have no idea what approach would be best, since this is a completely experimental project. I do know that changes in the moisture readings will be very slow, yet also very repeatable and accurate (this is the sensor part), but this is all in my lab here with some flowerpots; no testing in real-life garden or vineyard conditions (southern France) yet. I will report back as the project progresses.

Thanks too for differentiating between the calculation methods as you described.

I would have called that a single-pole IIR filter. The moving average is an example of an FIR filter.

For more than you ever wanted to know, read The Scientist and Engineer's Guide to Digital Signal Processing By Steven W. Smith, Ph.D. The whole book is online there. If you don't want to read the whole thing, start with chapter 15 then skip to chapter 19 to see examples of these two kinds of filters.

For a quick single-pole IIR, I just slap this code into anywhere I need it:

float filter(float inputVal) {
  //implement the IIR filter
  //reference: The Scientist and Engineer's Guide to Digital Signal Processing By Steven W. Smith, Ph.D.
  //  http://www.dspguide.com/ch19/2.htm
  //This is a "single pole" filter. 

  //x is the decay between adjacent samples. Choose any number between 0 and 1 (except 1).
  const float x = 0.86; 
  static float outputVal = 0; //make it static so it's stored for the function to use again the next time it's called
  
  outputVal = outputVal * x + inputVal * (1-x);
  return outputVal;
}

The function has a single static variable which stores the (A-1) value. If you want to filter two things then you need two copies of the function, with different names.
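An alternative to copying the function is to move the stored value out of the function into a small struct, so each source gets its own state. This is a sketch only: `SinglePoleFilter`, `filterUpdate`, `filterBySource` and the two instances are made-up names, and seeding the filter with the first sample (instead of starting from 0 as the function above does) is my own assumption.

```cpp
struct SinglePoleFilter {
  float decay;      // same role as x above: 0 <= decay < 1
  float outputVal;  // previous output, kept between calls
  bool primed;      // false until the first sample has been seen
};

float filterUpdate(SinglePoleFilter *f, float inputVal) {
  if (!f->primed) {
    f->outputVal = inputVal;  // assumption: start from the first reading
    f->primed = true;
  } else {
    f->outputVal = f->outputVal * f->decay + inputVal * (1.0f - f->decay);
  }
  return f->outputVal;
}

// One independent filter per source.
SinglePoleFilter filterX = {0.86f, 0.0f, false};
SinglePoleFilter filterY = {0.86f, 0.0f, false};

// Routes a new reading to the filter that belongs to its source letter.
float filterBySource(char source, float inputVal) {
  return filterUpdate(source == 'X' ? &filterX : &filterY, inputVal);
}
```

Readings from X then never disturb the filtered value for Y, and the decay can be tuned per source if the two sensors behave differently.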

A value of x=0.9 is close but not exactly equivalent to a 10-element moving average.
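To get a feel for "close but not exactly equivalent", one can compare how both filters react when the input jumps from 0 to 1 and stays there. The two helper functions below are hypothetical and only model that one test input; they assume both filters start from 0.

```cpp
// Output of the single-pole filter after `n` samples of a 0 -> 1 step.
float iirStep(float x, int n) {
  float out = 0.0f;
  for (int i = 0; i < n; i++) {
    out = out * x + 1.0f * (1.0f - x);
  }
  return out;
}

// Output of a `window`-element moving average after `n` samples of the same
// step, with the buffer initially full of zeros.
float movingAverageStep(int window, int n) {
  return n >= window ? 1.0f : (float)n / (float)window;
}
```

After 10 samples the 10-element moving average has fully reached the new level, while the filter with x = 0.9 has covered roughly 65% of the jump and keeps creeping towards it afterwards.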

For more advanced filtering where I care about the frequency response, then I use this free online filter designer which allows you to specify arbitrary cutoff frequencies and really control the filter output in detail.

Cool, makes me feel 35 years younger :sunglasses: Very good read!

MorganS:
For more advanced filtering where I care about the frequency response, then I use [this free online filter designer](http://t-filter.engineerjs.com/) which allows you to specify arbitrary cutoff frequencies and really control the filter output in detail.

Nice, all this, together with cattledog's input, gives me at least several weeks of good reading, thanks MorganS and cattledog!