Dynamically calculate background noise level (to identify louder noises)...

Experts,
I’ve created a basically stock sketch with arduinoFFT, and life is good. But in my effort to recognize any sounds louder than the background, I ran into an interesting problem: When is loud loud enough?

I go through the vReal loop, adding values and calculating the average, even taking out the 2 loudest bins' values, but this in no way compensates for additional or fewer noise sources. To illustrate, this is where I am now:

double AlertThreshMultiplier = 2.0;
[...]
  for (uint16_t i = 4; i < (samples>>1); i++)
  {
    // i = bin; abscissa = freq; vReal[i] = strength
    abscissa = ((i * 1.0 * samplingFrequency) / samples);
    if (vReal[i] > FirstHighestValue) {
      // the old maximum becomes the runner-up, so carry its frequency along too
      SecondHighestValue = FirstHighestValue;
      SecondHighestFreq = FirstHighestFreq;
      FirstHighestValue = vReal[i];
      FirstHighestFreq = abscissa;
    }
    else if (vReal[i] > SecondHighestValue) {
      SecondHighestValue = vReal[i];
      SecondHighestFreq = abscissa;
    }
    AllValuesTotal += vReal[i]; // for computing background
    AllValuesCount++;
  }
  [...]
  // subtract both outliers (the two loudest bins) from the running total
  AllValuesTotal = AllValuesTotal - (FirstHighestValue + SecondHighestValue);
  AllValuesCount = AllValuesCount - 2; // take out the 2 outliers above
  AllValuesAverage = AllValuesTotal / AllValuesCount;
  Thresh = AlertThreshMultiplier * AllValuesAverage;
  [...]

So I’m trying to compare the FirstHighestValue to the Thresh variable, with this comparison:

if (FirstHighestValue >= (AlertThreshMultiplier * AllValuesAverage) + 10) {
[...]

-So as you can see, I take the count and values of all elements of the vReal array, and do a simple mean average.
-Then I use an arbitrary multiplier (AlertThreshMultiplier) against that average (in this case 2.0).
-Finally, I add 10 to whatever that value is (as an offset), as the low level signals can be VERY low (.01, .001, etc.) and so multiplication products are too small to be valuable.

You see, the environment could change (get louder or quieter), and I don't want to upload new code every time it does, or add some kind of manual calibration/pot to the thing. I am hoping I can just use a better algorithm that adjusts to the background noise.

A threshold of 2x the background level is fine until you have very high or very low background levels. A low background means darn near anything will trigger it, and a high background means a sound would need to be unrealistically loud to breach the threshold.

Am I better off just making the threshold a fixed X above whatever the background level is (the mean average minus the 2 highest signals)?

Thoughts?

TIA!

Assume you have an array of sound levels measured over time.
Let's assume these are all integers 0..255 (to keep the array compact).

algorithm 1: take the average sound level as background noise
As the peaks will be much higher, the average will be far above the background noise

simplified example
samples - 10 10 11 9 10 30 10 10 200
sum = 300, average = 33, which is way above the background level,
so small peaks (like the 30) are missed.

algorithm 2: take the median sound level as background noise

samples - 10 10 11 9 10 30 10 10 200
sum = 300, median = 10, which sits in the middle of the background level,
so even the 11 (which is just noise) will be seen as a signal.
Furthermore, calculating the median needs a sorted array (= relatively expensive).

algorithm 3: take the midpoint between average and median (averian?)

samples - 10 10 11 9 10 30 10 10 200
sum = 300, "averian" = (33 + 10) / 2 = 21, which is above the background level
seems to work, but it has the same drawbacks as the median
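
To make the three numbers above easy to check, here is a minimal C++ sketch of algorithms 1 to 3. The function names, MAX_SAMPLES, and the choice of an insertion sort on a copy are my own assumptions for illustration, not something from a library. For an even number of samples it simply takes the upper of the two middle values.

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical upper bound on the window size, so the sorted copy fits on the stack.
const size_t MAX_SAMPLES = 32;

// Algorithm 1: plain average (integer division, so 300 / 9 gives 33).
uint16_t meanLevel(const uint8_t *samples, size_t n) {
  if (n == 0) return 0;
  uint32_t sum = 0;
  for (size_t i = 0; i < n; i++) sum += samples[i];
  return (uint16_t)(sum / n);
}

// Algorithm 2: median. Sorts a copy with insertion sort, which is the
// "relatively expensive" part: O(n^2) compares plus n bytes of extra RAM.
uint8_t medianLevel(const uint8_t *samples, size_t n) {
  if (n == 0 || n > MAX_SAMPLES) return 0;
  uint8_t sorted[MAX_SAMPLES];
  for (size_t i = 0; i < n; i++) {
    uint8_t v = samples[i];
    size_t j = i;
    // shift larger values up to make room for v
    while (j > 0 && sorted[j - 1] > v) {
      sorted[j] = sorted[j - 1];
      j--;
    }
    sorted[j] = v;
  }
  return sorted[n / 2];  // middle element (upper middle when n is even)
}

// Algorithm 3: the "averian", halfway between mean and median.
uint16_t averianLevel(const uint8_t *samples, size_t n) {
  return (uint16_t)((meanLevel(samples, n) + medianLevel(samples, n)) / 2);
}
```

With the samples 10 10 11 9 10 30 10 10 200 this gives 33, 10 and 21, matching the hand calculation.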

algorithm 4: running average (in dummy code)

for every measurement:
m = measurement();
BGlevel = BGlevel * 80% + m * 20% // %% weights to be tuned
if (m > BGlevel * 1.1 ) we have a signal // 1.1 factor to be tuned.
else background noise

I expect this to work well (some tuning needed); it has a low memory profile and is relatively fast.
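
The dummy code above could look roughly like this in C++. It is only a sketch: the struct and function names are made up for illustration, and the 0.2 weight and 1.1 factor are just the untuned starting values from the pseudocode. It keeps the same order as the pseudocode (update the background first, then compare).

```cpp
// Running-average background tracker, following the pseudocode above.
// All names here are hypothetical, chosen for this example only.
struct BackgroundTracker {
  float bgLevel;       // current background estimate
  float newWeight;     // weight of the newest measurement (0.2 = the 20% above)
  float signalFactor;  // how far above background counts as a signal (1.1 above)
};

// Folds measurement m into the background estimate, then reports whether
// m stands out from it. Returns true for "signal", false for "background".
bool updateAndDetect(BackgroundTracker &t, float m) {
  t.bgLevel = t.bgLevel * (1.0f - t.newWeight) + m * t.newWeight;
  return m > t.bgLevel * t.signalFactor;
}
```

Starting from a background of 10, a reading of 11 stays below the threshold while a reading of 200 trips it. Note that the loud reading also drags the background estimate up (to about 48 here), so it takes a few quiet readings to settle again. One option, if that matters, is to skip the update while a signal is being detected.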

My sound activated lighting effects use something like that... But, I'm not using FFT, just "loudness".

I save a "reading" once per second in a 20 element (i.e. 20-second) moving average array. And, every time I save a new reading I calculate the new average of the array and I find the peak.*

Then depending on the effect, I use the average or peak as a reference/threshold.

It works like an automatic-sensitivity control and it automatically adjusts for quiet & loud songs, or when I change the volume control.

In your application, you might want to use the lowest or highest bin, or 20% of the average, or whatever else works for you.

* This is not necessarily the peak, since it's only saved once per second. It's just the peak value in the array.
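
Roughly, the 20-element moving average array described above can be kept as a small ring buffer. This is a simplified sketch rather than my actual sketch code: the names, the uint16_t reading type, and the way a not-yet-full buffer is handled are assumptions for the example.

```cpp
#include <stdint.h>

const uint8_t HISTORY_SIZE = 20;  // 20 readings, one per second

// Hypothetical container for the last HISTORY_SIZE loudness readings.
struct LevelHistory {
  uint16_t readings[HISTORY_SIZE];
  uint8_t next;   // slot the next reading will overwrite
  uint8_t count;  // number of valid readings (stops growing at HISTORY_SIZE)
};

// Store a new reading, overwriting the oldest one once the buffer is full.
void addReading(LevelHistory &h, uint16_t reading) {
  h.readings[h.next] = reading;
  h.next = (h.next + 1) % HISTORY_SIZE;
  if (h.count < HISTORY_SIZE) h.count++;
}

// Average of the stored readings (0 if nothing has been stored yet).
uint16_t historyAverage(const LevelHistory &h) {
  if (h.count == 0) return 0;
  uint32_t sum = 0;
  for (uint8_t i = 0; i < h.count; i++) sum += h.readings[i];
  return (uint16_t)(sum / h.count);
}

// Largest stored reading, i.e. the peak value in the array.
uint16_t historyPeak(const LevelHistory &h) {
  uint16_t peak = 0;
  for (uint8_t i = 0; i < h.count; i++) {
    if (h.readings[i] > peak) peak = h.readings[i];
  }
  return peak;
}
```

Call addReading() once per second, then use historyAverage() or historyPeak() as the reference/threshold. Because old readings fall out after 20 seconds, the reference follows the environment up or down on its own.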