Sorting data before averaging

Hi all!

I'm new here and trying to figure out the most efficient way to sort data in my sketch,
any help would be appreciated.

Before giving a small back story just take note that I'm not a professional programmer by any means.

So I'm building a heart rate sensor and trying to sort the data to get the most accurate BPM reading.
The code I'm running is a slightly modified example code from the MAX3010 sensor.
Every 2 seconds I get a new average reading (int AVG),
At the moment I'm taking 30 of these readings adding them together and then dividing them to figure out the mean.

Sum = AVG + Sum ;

Sum/30 = "mean"

The issue I'm having is that over the minute reading period i can get a few high values that are false readings,
I'm trying to collect the top 5 readings and the lowest 5 readings leaving me with 20 middle values and then getting the mean of those 20 values.

I'm not experienced with arrays but am looking for the "best practice" method for sorting this data.

Thanks in advance!

I can't quite put my finger on it, but taking the average of a series of averages sounds wrong

Would appreciate an explanation,

What i have observed about the readings is that if the sensor bounces around or slips slightly then the reading is significantly lower or higher than accurate reading, I'm trying to throw these values away and unfortunately i cant just add a cap such as

if AVG < (140BPM) {
Sum = AVG + Sum;
}

A Google search returns many results such as arithmetic - Is the average of the averages equal to the average of all the numbers originally averaged? - Mathematics Stack Exchange

Instead of bunching everything together on a rolling average, also store and evaluate individual readings. This allows you to ignore individual readings that are too high or too low. You could implement a form of circular queue for this, which allows you to evaluate the most recent series of n measurements in whatever way you desire.

In regards to the average of the average i wouldnt calculate the average twice only once, but by thworing away the high readings that average would be more accurate??

example if the BPM set looked like this

15,15,34,65,65,66,67,69,64,115,150,135

The idea is to delete the 3 or more values at high and low end of scale so that the average is more accurate the mean of the above set is 71

But if i delete the bottom 3 and top 3 the new calculated mean is 66 which is more accurate??

rather than collect some number of samples, calculate average and display them, i suggest using leaky integration (low pass filter) to update an average after each sample

avg += (samp - avg) * K;

where K is 1 / N.

for a step input, avg will reach ~95% of the input after 3N iterations.

avg should be initialized to the first sample value

you can use median filtering to potentially filter out "false"readings.

median filtering requires soring the last N samples and picking the middle (median) value. perhaps this is what you're thinking of by sorting.

median filtering is often used to remove salt & pepper noise from images.

median followed by a short low-pass filter can have a significant effect

Sounds like a good solution,

Any idea example code I could try ?

code below runs on my laptop. it demonstrates median and linear (LP) filtering. test data is random values imposed on a constant value of 70.

the plot show the test data (gray), median filtered test data (red), linear filtered test data (yellow) and combined avg(median) (green)

below are values

main: samps
       87,    65,    84,    85,    91,    55,    62,    83,    59,    73,
       69,    76,    63,    71,    93,    91,    77,    81,    52,    75,
       46,    57,    52,    85,    53,    65,    51,    50,    95,    56,
       71,    87,    76,    60,    77,    71,    70,    94,    60,    84,
       71,    83,    65,    90,    59,    63,    85,    91,    48,    92,
       71,    49,    55,    78,    90,    62,    48,    46,    68,    48,
       57,    94,    90,    88,    58,    72,    64,    83,    71,    78,
       72,    47,    67,    92,    92,    81,    59,    82,    77,    63,
       79,    53,    67,    89,    86,    62,    56,    90,    63,    79,
       93,    74,    78,    88,    67,    91,    65,    86,    79,    91,
filter: median
        0,    65,    84,    84,    85,    85,    62,    62,    62,    73,
       69,    73,    69,    71,    71,    91,    91,    81,    77,    75,
       52,    57,    52,    57,    53,    65,    53,    51,    51,    56,
       71,    71,    76,    76,    76,    71,    71,    71,    70,    84,
       71,    83,    71,    83,    65,    63,    63,    85,    85,    91,
       71,    71,    55,    55,    78,    78,    62,    48,    48,    48,
       57,    57,    90,    90,    88,    72,    64,    72,    71,    78,
       72,    72,    67,    67,    92,    92,    81,    81,    77,    77,
       77,    63,    67,    67,    86,    86,    62,    62,    63,    79,
       79,    79,    78,    78,    78,    88,    67,    86,    79,    86,
filter: avg
       87,    81,    82,    82,    84,    77,    73,    75,    71,    72,
       71,    72,    70,    70,    75,    79,    79,    79,    72,    73,
       66,    64,    61,    67,    63,    63,    60,    58,    67,    64,
       66,    71,    72,    69,    71,    71,    70,    76,    72,    75,
       74,    76,    73,    77,    73,    70,    74,    78,    70,    76,
       74,    68,    65,    68,    73,    70,    65,    60,    62,    58,
       58,    67,    72,    76,    72,    72,    70,    73,    72,    74,
       73,    66,    66,    73,    77,    78,    73,    75,    76,    72,
       74,    69,    68,    73,    76,    73,    68,    74,    71,    73,
       78,    77,    77,    80,    76,    80,    76,    78,    78,    81,
filter: avg (median)
       83,    84,    84,    84,    84,    84,    78,    74,    71,    71,
       71,    71,    70,    70,    70,    75,    79,    80,    79,    78,
       71,    67,    63,    62,    59,    61,    59,    57,    55,    55,
       59,    62,    65,    68,    70,    70,    70,    70,    70,    73,
       73,    75,    74,    76,    73,    71,    69,    73,    76,    79,
       77,    75,    70,    66,    69,    71,    69,    63,    59,    56,
       56,    56,    65,    71,    75,    74,    72,    72,    71,    73,
       72,    72,    71,    70,    75,    79,    80,    80,    79,    78,
       78,    74,    72,    71,    74,    77,    73,    70,    68,    71,
       73,    74,    75,    76,    76,    79,    76,    78,    78,    80,

i could never come up with a simpler approach to sorting the data for a median filter. inevitably, you want to try larger filters (5, 7, )

#include <stdio.h>
#include <stdlib.h>

void
sort (
    int *vec,
    int  N )
{
    for (unsigned j = N-1; j > 0; j--) {
        for (unsigned i = 0; i < j; i++)  {
            if (vec [i] > vec [i+1])  {
                int t     = vec [i];
                vec [i]   = vec [i+1];
                vec [i+1] = t;
            }
        }
    }
}

int
median (
    int samp )
{
#define M   3
    static int buf [M];
    static int vec [M];

    for (unsigned m = 1; M > m; m++)
        vec [m-1] = buf [m-1] = buf [m];

    vec [M-1] = buf [M-1] = samp;

    sort (vec, M);
    return vec [M / 2];
}

int
avg (
    int samp )
{
    static float _avg = 0;

    if (0 == _avg)  {
        _avg = samp;
        return _avg;
    }

    const float K = 1.0 / 4;
    return _avg += (samp - _avg) * K;
}


// -----------------------------------------------------------------------------
void
gen (void)
{
    printf ("int samps [] = {");
    for (unsigned m = 0; 100 > m; m++)  {
        if (! (m % 8))
            printf ("\n   ");
        printf (" %5.0f,", 70 + (50.0 * random () / RAND_MAX) - 25);
    }
    printf ("\n};\n");

    printf ("#define N_SAMP     (sizeof(samps)/sizeof(int))\n");
}

// -----------------------------------------------------------------------------
int samps [] = {
       87,    65,    84,    85,    91,    55,    62,    83,
       59,    73,    69,    76,    63,    71,    93,    91,
       77,    81,    52,    75,    46,    57,    52,    85,
       53,    65,    51,    50,    95,    56,    71,    87,
       76,    60,    77,    71,    70,    94,    60,    84,
       71,    83,    65,    90,    59,    63,    85,    91,
       48,    92,    71,    49,    55,    78,    90,    62,
       48,    46,    68,    48,    57,    94,    90,    88,
       58,    72,    64,    83,    71,    78,    72,    47,
       67,    92,    92,    81,    59,    82,    77,    63,
       79,    53,    67,    89,    86,    62,    56,    90,
       63,    79,    93,    74,    78,    88,    67,    91,
       65,    86,    79,    91,
};
#define N_SAMP     (sizeof(samps)/sizeof(int))

// -----------------------------------------------------------------------------
void
filter (void)
{
#define L   10
    printf ("%s: samps");
    for (unsigned m = 0; N_SAMP > m; m++)  {
        if (! (m % L))
            printf ("\n   ");
        printf (" %5d,", samps [m]);
    }
    printf ("\n");

    printf ("%s: median", __func__);
    for (unsigned m = 0; N_SAMP > m; m++)  {
        if (! (m % L))
            printf ("\n   ");
        printf (" %5d,", median (samps [m]));
    }
    printf ("\n");

    printf ("%s: avg", __func__);
    for (unsigned m = 0; N_SAMP > m; m++)  {
        if (! (m % L))
            printf ("\n   ");
        printf (" %5d,", avg (samps [m]));
    }
    printf ("\n");

    printf ("%s: avg (median)", __func__);
    for (unsigned m = 0; N_SAMP > m; m++)  {
        if (! (m % L))
            printf ("\n   ");
        printf (" %5d,", avg (median (samps [m])));
    }
    printf ("\n");
}

// -----------------------------------------------------------------------------
void
xgraph (void)
{
#define L   10
    printf ("# %s: samps");
    printf ("\ncolor=%s\nnext", "dark-gray");
    for (unsigned m = 0; N_SAMP > m; m++)  {
        printf (" %4d %5d\n", m, samps [m]);
    }
    printf ("\n");

    printf ("thickness = 1.5\n");

    printf ("# %s: median", __func__);
    printf ("\ncolor=%s\nnext", "red");
    for (unsigned m = 0; N_SAMP > m; m++)  {
        printf (" %4d %5d\n", m, median (samps [m]));
    }
    printf ("\n");

    printf ("# %s: avg", __func__);
    printf ("\ncolor=%s\nnext", "yellow");
    for (unsigned m = 0; N_SAMP > m; m++)  {
        printf (" %4d %5d\n", m, avg (samps [m]));
    }
    printf ("\n");

    printf ("# %s: avg (median)", __func__);
    printf ("\ncolor=%s\nnext", "green");
    for (unsigned m = 0; N_SAMP > m; m++)  {
        printf (" %d %5d\n", m, avg (median (samps [m])));
    }
    printf ("\n");
}

// -----------------------------------------------------------------------------
int
main ()
{
#if 1
    filter ();
#elif 1
    xgraph ();
#else
    gen ();
#endif

    return 0;
}

Thanks heaps im trying my best to understand the code but its a little above my paygrade,

here is a portion of the running code

#include <Wire.h>
#include "MAX30105.h"

#include "heartRate.h"

MAX30105 particleSensor;

const byte RATE_SIZE = 4; //Increase this for more averaging. 4 is good.
byte rates[RATE_SIZE]; //Array of heart rates
byte rateSpot = 0;
long lastBeat = 0; //Time at which the last beat occurred
long publishInterval = 0;

float beatsPerMinute;
int beatAvg;

void setup()
{
  Serial.begin(115200);
  Serial.println("Initializing...");

  // Initialize sensor
  if (!particleSensor.begin(Wire, I2C_SPEED_FAST)) //Use default I2C port, 400kHz speed
  {
    Serial.println("MAX30105 was not found.");
    while (1);
  }

  particleSensor.setup(); //Configure sensor with default settings
  particleSensor.setPulseAmplitudeRed(0x0A); //Turn Red LED to low to indicate sensor is running
  particleSensor.setPulseAmplitudeGreen(0); //Turn off Green LED
}

void loop()
{
  long irValue = particleSensor.getIR();

  if (checkForBeat(irValue) == true)
  {
    //We sensed a beat!
    long delta = millis() - lastBeat;
    lastBeat = millis();

    beatsPerMinute = 60 / (delta / 1000.0);

    if (beatsPerMinute < 255 && beatsPerMinute > 20)
    {
      rates[rateSpot++] = (byte)beatsPerMinute; //Store this reading in the array
      rateSpot %= RATE_SIZE; //Wrap variable

      //Take average of readings
      beatAvg = 0;
      for (byte x = 0 ; x < RATE_SIZE ; x++)
        beatAvg += rates[x];
      beatAvg /= RATE_SIZE;
    }
  }
 if (millis() > (publishInterval + 2000) && (irValue > 2000)) {


PUBLISH READING / APPLY FILTER

  publishInterval = millis();
 }

where would i add your filtering recomendation?

beatAvg is the filtered sensor reading

@jcphillips Why do users choose to ignore the advice on posting code ?

The easier you make it to read and copy your code the more likely it is that you will get help

Please follow the advice given in the link below when posting code , use code tags and post the code here

If you get errors when compiling please copy them from the IDE using the "Copy error messages" button and paste the clipboard here in code tags

this line illustrates how a sample is passed to the filters and the resulting filtered value.

        printf (" %5d,", avg (median (samps [m])));

This topic was automatically closed 120 days after the last reply. New replies are no longer allowed.