Go Down

Topic: Reliable Frequency Measurment Using Autocorrelation (Read 7 times) previous topic - next topic

akellyirl

Accurate Frequency Detection is important for many projects such as Guitar/Piano Tuners, Vibration Analyzers, Heartrate Monitors, MEMs Sensor Analysis and Laboratory Instruments.

There have been many fine examples of projects that try to solve this problem, for example: Arduino Frequency Detection by amandaghassaei on Instructables and Arduino Frequency Counter Library, but they all use Time Domain techniques; analyzing the signal for features such as : Zero-Crossings, Peak Detection, Slope Detection etc..

Waveforms with very strong harmonic content makes the fundamental frequency undiscernable and signals like that will not be identifiable using Time Domain techniques.

I've written an instructable on how to use Autocorrelation to reliably Identify the frequency of signals in the presence of noise and strong harmonic content.

Actually is surprisingly easy.

Here it is.
http://www.instructables.com/id/Reliable-Frequency-Detection-Using-DSP-Techniques/

robtillaart

Hi Akellyirl,

Nice piece of work, I took the freedom to refactor and optimize your code.

+ created function:  float autoCorrelateFrequency(char * sample, int len, float sampleFreq)
+ made datatype signed char, to remove the -128 from inner loop formula
   (preprocess the data array once)
+ removed the division by 256 from the inner loop
+ added endstate!=3 condition in the outer loop to stop processing asap
+ refactor

The above changes improved the performance from 8613 -> 139 millis to determine the 259.41 Hz in the sample signal.
Given that the UNO can make 8000 10bit  samples / second it should be possible to determine frequencies up to 2Khz
(4000 samples) in less than a second. 

(sample.h file is attached)
Code: [Select]
//    FILE: frequencyDSP.ino
//  AUTHOR: rob tillaart (org version akellyirl)
// VERSION: 0.1.03
// PURPOSE: frequency analysis using DSP
//    DATE: 2013-10-27
//     URL: http://www.instructables.com/id/Reliable-Frequency-Detection-Using-DSP-Techniques

#include "sample.h"

void setup()
{
  Serial.begin(115200);
  Serial.println("FrequencyDSP");
  Serial.println("------------");

  uint32_t start = millis();
  float fr = autoCorrelateFrequency(rawData, sizeof(rawData), 22050);
  uint32_t stop = millis();

  Serial.print("freq:\t");
  Serial.println(fr);
  Serial.print("time:\t");
  Serial.println(stop - start);
}

void loop()
{
}

float autoCorrelateFrequency(char * sample, int len, float sampleFreq)
{
  long sum = 0;
  long sum_old = 0;
  int thresh = 0;
  byte pd_state = 0;
  int period = 0;  // 0 results in inf

  // Autocorrelation
  for (int i=0; i < len && (pd_state != 3); i++)
  {
    sum_old = sum;
    sum = 0;

    for (int k=0; k <len-i; k++)
    {
      sum += (rawData[k]) * (rawData[k+i]);
    }
    sum /= 256;

    // Peak Detect State Machine
    // 0: initial, set threshold
    // 1: detect positive slope
    // 2: detect negative slope
    // 3: peak found
    if (pd_state == 0)
    {
      thresh = sum / 2;
      pd_state = 1;
    }
    else if (pd_state == 1 && (sum > thresh) && (sum - sum_old) > 0)
    {
      pd_state = 2;
    }
    else if (pd_state == 2 && (sum - sum_old) <= 0)
    {
      period = i;
      pd_state = 3;
    }
  }

  return sampleFreq/period;
}


Please give it a try
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

curro92

hi,
in the example as input file is used sample.h raw. How would you read samples from an analog input of Arduino?

Thanks

MarkT

Read some samples on a regular basis, shift them right two bits and store
in the char * that autoCorrelateFrequency () expects?
[ I won't respond to messages, use the forum please ]

jremington

#4
Sep 23, 2014, 01:30 am Last Edit: Sep 23, 2014, 03:05 am by jremington Reason: 1
A problem with the code as posted is that the autocorrelation sum depends on the number of samples correlated, which become smaller as the index (i) increases. That is, the autocorrelation sum is not properly scaled.

I fixed that, and also eliminated the condition that the program stop as soon as it found a large peak.  A plot of the results shows a number of large peaks in the autocorrelation function, so the result is less clean than the OP claims. (The plot becomes noisier toward the right end, because the number of samples included in the sum is small and the error increases).

The inner loop now looks like this (should probably divide by (k-1), but that is a minor detail)
Code: [Select]
   for (k=0; k <len-i; k++)
   {
     sum += (rawData[k]) * (rawData[k+i]);
   }
   sum /= k ;
   fprintf(fp,"%ld\n",sum);


Note: normally the average value of the data points is subtracted from each term, but here, the average is not large (-4).
"It seems to run on some form of electricity"

robtillaart

@jremington
can you post your complete version?
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

jremington

See below. This simplified version is for Code::Blocks on a PC, because I wanted to write out a file of the results for plotting and inspection.

Note: another problem with the original code is that the peak detect algorithm detects only the first peak (not necessarily the largest). I did not bother to fix that.

I know this post is old, and several thousand people have looked at it without commenting, so there probably isn't much interest in the technique.

Code: [Select]
#include <stdio.h>
#include <stdlib.h>
#include "sample.h"

float autoCorrelateFrequency(char * sample, int len, float sampleFreq)
{
  long sum = 0;
  long sum_old = 0;
  int thresh = 0,i,k;
  char pd_state = 0;
  int period = 0;  // 0 results in inf
    FILE *fp;
    fp=fopen("test.txt", "w");
  // Autocorrelation
  for (i=0; i < len; i++)
  {
    fprintf(fp,"%d,%d,",i,rawData[i]);
    sum_old = sum;
    sum = 0;

    for (k=0; k <len-i; k++)
    {
      sum += (rawData[k]) * (rawData[k+i]);
    }
    sum /= k ;  //this should be k-1, but if so the last point will result in divide by zero
    fprintf(fp,"%ld\n",sum);
    // Peak Detect State Machine
    // 0: initial, set threshold
    // 1: detect positive slope
    // 2: detect negative slope
    // 3: peak found
    if (pd_state == 0)
    {
      thresh = sum / 2;
      pd_state = 1;
    }
    else if (pd_state == 1 && (sum > thresh) && (sum - sum_old) > 0)
    {
      pd_state = 2;
    }
    else if (pd_state == 2 && (sum - sum_old) <= 0)
    {
      period = i;
      pd_state = 3;
    }
  }
    fclose(fp);
  return sampleFreq/period;
}

int main()
{
  float fr = autoCorrelateFrequency(rawData, sizeof(rawData), 22050);

  printf("freq: %f\n",fr);

    return 0;
}
"It seems to run on some form of electricity"

curro92

#7
Sep 29, 2014, 08:30 am Last Edit: Sep 29, 2014, 08:32 am by curro92 Reason: 1
hi,
something is wrong:
I have compiled the code of robtillaart, I used sample.h of instructables where I added a global variable
const float sample_freq = 22050.0;
(so every sample.h can have different sample rate)

I get the freq = 5512.5Hz result, but the frequency of instructables sample.h  is 260Hz  (with their code I get 259.4Hz)

With the source code of jremington I get the same result.

What am I doing wrong?

robtillaart

please post the output of the sketch you attached.

which board are you using?
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

curro92

The output is:
Quote
samples: 970, sample_freq: 22050.000000
freq: 5512.500000


I've compiled on XP as
gcc  frequencyDSP_robtillaart.c  -o frequencyDSP_robtillaart.exe

And I have compiled in Linux Mint 17 as

gcc  frequencyDSP_robtillaart.c   -o frequencyDSP_robtillaart

The output is the same in both cases.


curro92


With this other instructables code the output is ok:

len: 970 sample_rate: 22050.000000
freq: 259.411774

robtillaart

could be related to size of datatypes?  32 bit - 16 bit
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

KenF

Guys.  This is great work.
I would suggest that the next step would be..
Having found the dominant frequency, subtract it from the raw data.  Then go through the whole process again to find the next most dominant frequency. 
This can be repeated until the rawdata minus all of the frequencies found = 0. That would be a really interesting toy.  Possibly even able to identify someone by their voice.

jremington

Just for completeness, I calculated the Fourier transform (as an FFT of length 1024, zero padded) on the original data, and the output is shown on the attached plot.

As you can see, the autocorrelation function does quite a bit better than the FFT to distinguish the frequency of interest, but there are other peaks in the autocorrelation function nearly as large, so it would be difficult to have much confidence in the result.
"It seems to run on some form of electricity"

charliesixpack

#14
Oct 08, 2014, 02:12 am Last Edit: Oct 08, 2014, 02:24 pm by charliesixpack Reason: 1
I think you may have a problem with your code.  You have defined rawData to be an array of type char.  When you perform the multiplication in the step which multiplies the two rawData values you are overflowing the type char.  The multiplication is done on type char before it is assigned to variable sum.
Code: [Select]
   for (k=0; k <len-i; k++)
   {
     sum += (rawData[k]) * (rawData[k+i]);
   }

I have been bitten by this before.   Try casting rawData to long before doing the multiplication and see if you get the same answer.

I analyzed your data in Mathematica and got a much different result for the autocorrelation.  Also, you imply that autocorrelation is a frequency domain technique which it is not.  It is done strictly in the time domain.  Taking the FFT of the autocorrelation gives the power spectrum of the signal putting it in the frequency domain.

The plot below is the periodogram of the data.  Mathematica has a function to find the peaks which are found at the following indices of the data (shown as red dots on the plot)...
{13, 24, 35, 47, 59, 70, 82, 93, 105, 116, 128, 139, 151, 162, 174, \
185, 197, 208, 220, 231, 243, 254, 266, 277, 289, 300, 312, 323, 335, \
346, 358, 369, 381, 392, 404, 415, 427, 438, 450, 461, 472, 485, 487, \
500, 511, 522, 534, 545, 557, 568, 580, 591, 603, 614, 626, 637, 649, \
660, 672, 683, 695, 706, 718, 729, 741, 752, 764, 775, 787, 798, 810, \
821, 833, 844, 856, 867, 879, 890, 902, 913, 925, 937, 948, 959}

The fundamental frequency (259.09 Hz)and overtones really stand out when displayed this way.

Go Up