How to detect a peak and a low in noisy data

Hi,

I have input from a sensor that is periodically sampled.
The data is fuzzy or noisy; the noise could be as much as 5% of the evaluation range.
In some cases the noise could be 20% of the evaluation range.
The evaluation range is from the upper blue line to the purple low.
Please see the picture.
The time shown is just an example; it could be anywhere from 10 seconds to 20 minutes.
After initiation the data value will climb to a peak, and then slowly drop a small amount.
After a time it will rise again. The times are not known.
I am trying to find the value at points A and B, and then accurately stop at C.
I am asking if anyone knows a way to "smooth" or remove the noise.
I can take many samples per second if that helps.

I posted this here because it is not a program yet, but maybe this should be in the software section.

Thanks for the thoughts.

Taking a few samples (5 to 1000) with analogRead() and using the average will filter out high frequency electronic noise.
That is the first thing to do.
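
As a rough sketch of that first step, the function below averages a burst of readings into one value. The names averagedRead and SampleReader are made up for illustration, and the 16-bit sample type assumes something like analogRead(); a 24-bit sensor would need wider types.

```cpp
#include <stdint.h>

// Hypothetical reader type: any function that returns one raw sample,
// for example a small wrapper around analogRead(A0) on an Arduino.
typedef uint16_t (*SampleReader)();

// Average `count` consecutive readings into one value, rounded to nearest.
// A 32-bit sum cannot overflow with 16-bit samples and a 16-bit count.
uint16_t averagedRead(SampleReader readSample, uint16_t count) {
  if (count == 0) return 0;  // nothing to average
  uint32_t sum = 0;
  for (uint16_t i = 0; i < count; i++) {
    sum += readSample();
  }
  return (uint16_t)((sum + count / 2) / count);
}
```

On an Arduino the reader could be a one-line function that just returns analogRead(A0).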

Then you can use a simple low-pass filter. It is very easy with float variables.
Or a more mathematical approach with a walking average filter. There is a nice way to do that with integers.
There are of course hundreds or thousands of ways to filter data.
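
A minimal sketch of the simple float low-pass filter (exponential smoothing). LowPass and alpha are illustrative names, not from any library, and the right alpha is an assumption that has to be tuned against real data.

```cpp
// First-order low-pass filter (exponential smoothing).
struct LowPass {
  float alpha;   // 0 < alpha <= 1; smaller means smoother but slower
  float value;   // current filtered value
  bool primed;   // false until the first sample has been seen

  explicit LowPass(float a) : alpha(a), value(0.0f), primed(false) {}

  float update(float sample) {
    if (!primed) {
      value = sample;  // start at the first sample instead of at zero
      primed = true;
    } else {
      value += alpha * (sample - value);  // move a fraction toward the sample
    }
    return value;
  }
};
```

With alpha = 0.25 each new sample contributes a quarter of its difference from the current filtered value.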

Finding first the maximum 'A' and after that the minimum 'B' means you need a state variable that tells which phase the sketch is in at that moment.
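
That variable could be kept in a small state machine, sketched below. Phase, PeakLowTracker and margin are hypothetical names; margin is an assumed reversal threshold that should be larger than whatever noise is left after filtering, so a single noisy sample is not taken for the turn.

```cpp
// Which part of the curve the sketch is currently in.
enum Phase { SEEK_PEAK, SEEK_LOW, DONE };

struct PeakLowTracker {
  Phase phase;
  float peak;     // highest value seen so far (value at A once confirmed)
  float low;      // lowest value seen after A (value at B once confirmed)
  float margin;   // reversal needed before a peak or low is accepted
  bool started;

  explicit PeakLowTracker(float m)
    : phase(SEEK_PEAK), peak(0.0f), low(0.0f), margin(m), started(false) {}

  // Feed one (already filtered) sample.
  void update(float v) {
    if (!started) { peak = v; started = true; return; }
    switch (phase) {
      case SEEK_PEAK:
        if (v > peak) {
          peak = v;                      // still climbing
        } else if (peak - v >= margin) {
          phase = SEEK_LOW;              // dropped far enough: A is confirmed
          low = v;
        }
        break;
      case SEEK_LOW:
        if (v < low) {
          low = v;                       // still falling
        } else if (v - low >= margin) {
          phase = DONE;                  // rose far enough: B is confirmed
        }
        break;
      case DONE:
        break;
    }
  }
};
```

Once phase is DONE, peak and low hold the values at A and B. Note that the confirmation arrives late by however long the signal takes to move by margin.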

Please tell us more: Is the data the height in meters? Is that data a 'float' or a 'long'? How is it sampled? With analogRead(), or do you get the data via a Serial connection? Which Arduino board do you use? What sensor is it?

Nice drawing! But from your comments, I gather this is a rather ideal example of the range of behaviors.

You will probably need to carefully examine many, many examples, and try several different smoothing or curve fitting algorithms in order to find a pattern in the data that can reliably allow such key data points to be extracted.

The algorithm you choose should associate a reliability factor or error bar with each data point. For example, in the above example, the local minimum at B will have a wide error bar, say located at "B" +/- 5 seconds. That error bar will grow as the data become noisier.

Google "peak search algorithm" or "minimum search" to see many approaches.

To find a minimum you have to specify a range. How do you establish when that begins and ends?

I have input from a sensor that is periodically sampled.

Can you tell us what the sensor is, and how often it is sampled?

Can you provide actual sample data so we can try some noise reduction techniques?

I am trying to find the value at points A and B,

The x or y value?

An old saying is crap in = crap out.

The resolution, quality and stability of the sensor are key.
That is why they make yard/meter sticks and micrometers.

I have seen people buy a thing with a full span of 0 to 1,000, then try to read 0-2.

Ever try to measure the thickness of a sheet of paper with a meter stick?

===============

Another idea is to use software designed for data plotting.

For my raw data sets, I typically use a spreadsheet and try some of the simple options, such as a running average of the last 5 readings. That allows you to make one column each for the last 5, last 20 and last 100 readings, and then graph them to see what the graphs look like.

aarg:
To find a minimum you have to specify a range. How do you establish when that begins and ends?

This is a dynamic, running process, so the times vary. After point A and before C, the sample input will be at its minimum for this process.

jremington:
Nice drawing! But from your comments, I gather this is a rather ideal example of the range of behaviors.

[Snip]
The algorithm you choose should associate a reliability factor or error bar with each data point. For example, in the above example, the local minimum at B will have a wide error bar, say located at "B" +/- 5 seconds. That error bar will grow as the data become noisier.

Google "peak search algorithm" or "minimum search" to see many approaches.

Thanks, I will do some searching. The time of the key points is not as important as finding the magnitude at points A and B. Given that these points may be somewhat close in magnitude, if I just detect any single maximum/minimum, noise could introduce a significant error.

johnerrington:
Can you tell us what the sensor is, and how often it is sampled?

Can you provide actual sample data so we can try some noise reduction techniques?
the x or y value?

I don't have real data from the sensor. The sensor is a low-pressure differential sensor with a serial output. It is 24 bit, shifted to the approximate range shown in the picture. The ultimate application is leak detection using low pressures. Thermal effects of charging the test volume, ambient temperature, etc. cause pressure changes. This sample curve may not actually represent a real test, but it essentially encompasses the data analysis need.

I think I would try using a second-order filter algorithm. This would smooth out the noise in the data, and give not only an estimate of the current value but also its rate of change, i.e. the slope of the line. Your points A and B seem to be "turning points", where the slope changes from positive to negative (A) and from negative back to positive (B).
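
One common second-order tracker is the alpha-beta filter, sketched below. It is an assumption that this is the kind of filter meant here. The gains alpha and beta and the sample interval dt are illustrative and would need tuning on real data.

```cpp
// Alpha-beta filter: tracks a smoothed value and its rate of change.
struct AlphaBeta {
  float alpha;   // gain applied to the value correction (0..1)
  float beta;    // gain applied to the slope correction (small, e.g. 0.1)
  float dt;      // time between samples, in seconds
  float x;       // smoothed value
  float v;       // estimated slope, in units per second
  bool primed;

  AlphaBeta(float a, float b, float dtSeconds)
    : alpha(a), beta(b), dt(dtSeconds), x(0.0f), v(0.0f), primed(false) {}

  // Feed one sample taken dt seconds after the previous one.
  void update(float z) {
    if (!primed) { x = z; v = 0.0f; primed = true; return; }
    float predicted = x + v * dt;     // where the value should be by now
    float residual = z - predicted;   // how far off the prediction was
    x = predicted + alpha * residual; // correct the value estimate
    v = v + (beta / dt) * residual;   // correct the slope estimate
  }
};
```

With the filter fed at a fixed rate, A would be where v goes from positive to negative and B where it goes back to positive. Noise will make v wobble around zero near the turning points, so some threshold on v is probably needed before calling it a turn.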

Koepel:
....
Or a more mathematical approach with a walking average filter. There is a nice way to do that with integers.

That sounds interesting. What is a nice way with integers?

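unsigned int smooth(unsigned int newVal) {
  static unsigned int oldVal = 0;   // previous output, kept between calls
  unsigned long sum;
  // shift-and-add form of: sum = (oldVal * 3 + newVal) / 4;
  // the cast keeps the arithmetic from overflowing a 16-bit int
  sum = ( ((unsigned long)oldVal << 1) + oldVal + newVal ) >> 2;
  oldVal = sum;
  return oldVal;
}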
...
  // example use
  smoothedValue = smooth( analogRead(A0) );

...

But this filter takes some time to "settle" on the average value. So the first output value(s) could be mistaken for a minimum. For your application, you might "prime" it by somehow setting oldVal to the value of the very first sample.
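
One possible way to do that priming, as a sketch: keep a flag and let the very first sample seed the filter. smoothPrimed is a hypothetical name, and the 16-bit types assume analogRead()-sized values.

```cpp
#include <stdint.h>

// Same 3:1 weighting as smooth() above, but the first sample seeds the
// filter so the output does not have to climb up from zero.
uint16_t smoothPrimed(uint16_t newVal) {
  static bool primed = false;
  static uint16_t oldVal = 0;
  if (!primed) {
    oldVal = newVal;   // first call: start at the first reading
    primed = true;
  } else {
    // (oldVal * 3 + newVal) / 4, computed in 32 bits to avoid overflow
    oldVal = (uint16_t)(((uint32_t)oldVal * 3 + newVal) >> 2);
  }
  return oldVal;
}
```

If the measurement is run more than once per power-up, the flag would have to be reset at the start of each run.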

You can't really design a filter without knowing the power spectral density of the noise signal, and the same for the signal you wish to retain. It LOOKS as if the "noise" is at a higher frequency than the signal, but until you have some actual data we can't be sure.

I don't have real data from the sensor. The sensor is a low-pressure differential sensor with a serial output. It is 24 bit, shifted to the approximate range shown in the picture. The ultimate application is leak detection using low pressures. Thermal effects of charging the test volume, ambient temperature, etc. cause pressure changes.

I think you need to run a sample test and get some real data - preferably woth and withouta signal so the noise can be sepearted out.

Yes, time to collect real data. A lot of it.

Then you can test and decide from the hundreds of approaches how to make sense out of it.

aarg:

[Snip]

But this filter takes some time to "settle" on the average value. So the first output value(s) could be mistaken for a minimum. For your application, you might "prime" it by somehow setting oldVal to the value of the very first sample.

Another way of filtering out the noise could be a sliding window.

uint16_t window[8];
uint8_t windowPosition = 0;

void acquireSample() {
	// this function should be called at a regular interval.
	window[(windowPosition++ & 0b111)] = analogRead(thePin);
}

To get a reading, the average of the window is calculated. This also allows for calculating an uncertainty value.
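
A sketch of that calculation for an 8-entry window like the one above. WindowStats and windowStats are illustrative names, and using the sample standard deviation as the "uncertainty value" is an assumption; other measures (such as max minus min) would also work.

```cpp
#include <stdint.h>
#include <math.h>

const uint8_t WINDOW_SIZE = 8;   // must match the size of the window array

struct WindowStats {
  float mean;     // average of the window
  float stdDev;   // sample standard deviation, a rough noise estimate
};

// Compute mean and spread of a full window of WINDOW_SIZE samples.
WindowStats windowStats(const uint16_t *window) {
  uint32_t sum = 0;
  for (uint8_t i = 0; i < WINDOW_SIZE; i++) {
    sum += window[i];
  }
  float mean = (float)sum / WINDOW_SIZE;

  float squares = 0.0f;
  for (uint8_t i = 0; i < WINDOW_SIZE; i++) {
    float d = (float)window[i] - mean;
    squares += d * d;   // accumulate squared deviations from the mean
  }
  WindowStats s = { mean, sqrtf(squares / (WINDOW_SIZE - 1)) };
  return s;
}
```

The result is only meaningful once the window has been filled, i.e. after at least 8 calls to acquireSample().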

Qsilverrdc:
I don't have real data from the sensor. The sensor is a low-pressure differential sensor with a serial output. It is 24 bit, shifted to the approximate range shown in the picture. The ultimate application is leak detection using low pressures. Thermal effects of charging the test volume, ambient temperature, etc. cause pressure changes. This sample curve may not actually represent a real test, but it essentially encompasses the data analysis need.

Hmmm. If you don't have real data, how did you determine it is noisy?
Paul

Paul_KD7HB:
Hmmm. If you don't have real data, how did you determine it is noisy?
Paul

I have experimented with the sensor and its outputs at various pressures. It has a little noise.
I now read the device 20 times to get an average point. But this is static. In practice the data will be changing as it is averaged. I was hoping that I could improve on it.

johnerrington:
I think you need to run a sample test and get some real data - preferably woth and withouta signal so the noise can be sepearted out.

my speal cheaker is down for the moment, I was hoping to have a go at sepearting out some of the noise.

Qsilverrdc:
I have experimented with the sensor and its outputs at various pressures. It has a little noise.
I now read the device 20 times to get an average point. But this is static. In practice the data will be changing as it is averaged. I was hoping that I could improve on it.

Just remember, when you low-pass filter a signal, you are removing information. Try not to remove too much. :)

Qsilverrdc:
I have experimented with the sensor and its outputs at various pressures. It has a little noise.
I now read the device 20 times to get an average point. But this is static. In practice the data will be changing as it is averaged. I was hoping that I could improve on it.

So, in conclusion, you don't really know if there is "noise" or what form it takes, nor if it is momentary, continuous or cyclical. Let's just assume it is ALL valid data until we know different.
Paul