Go Down

### Topic: Reliable Frequency Measurment Using Autocorrelation (Read 32929 times)previous topic - next topic

#### akellyirl ##### Oct 23, 2013, 10:20 pm
Accurate Frequency Detection is important for many projects such as Guitar/Piano Tuners, Vibration Analyzers, Heartrate Monitors, MEMs Sensor Analysis and Laboratory Instruments.

There have been many fine examples of projects that try to solve this problem, for example: Arduino Frequency Detection by amandaghassaei on Instructables and Arduino Frequency Counter Library, but they all use Time Domain techniques; analyzing the signal for features such as : Zero-Crossings, Peak Detection, Slope Detection etc..

Waveforms with very strong harmonic content makes the fundamental frequency undiscernable and signals like that will not be identifiable using Time Domain techniques.

I've written an instructable on how to use Autocorrelation to reliably Identify the frequency of signals in the presence of noise and strong harmonic content.

Actually is surprisingly easy.

Here it is.
http://www.instructables.com/id/Reliable-Frequency-Detection-Using-DSP-Techniques/

#### robtillaart #1
##### Oct 28, 2013, 09:10 am
Hi Akellyirl,

Nice piece of work, I took the freedom to refactor and optimize your code.

+ created function:  float autoCorrelateFrequency(char * sample, int len, float sampleFreq)
+ made datatype signed char, to remove the -128 from inner loop formula
(preprocess the data array once)
+ removed the division by 256 from the inner loop
+ added endstate!=3 condition in the outer loop to stop processing asap
+ refactor

The above changes improved the performance from 8613 -> 139 millis to determine the 259.41 Hz in the sample signal.
Given that the UNO can make 8000 10bit  samples / second it should be possible to determine frequencies up to 2Khz
(4000 samples) in less than a second.

(sample.h file is attached)
Code: [Select]
`//    FILE: frequencyDSP.ino//  AUTHOR: rob tillaart (org version akellyirl)// VERSION: 0.1.03// PURPOSE: frequency analysis using DSP//    DATE: 2013-10-27//     URL: http://www.instructables.com/id/Reliable-Frequency-Detection-Using-DSP-Techniques#include "sample.h"void setup() {  Serial.begin(115200);  Serial.println("FrequencyDSP");  Serial.println("------------");  uint32_t start = millis();  float fr = autoCorrelateFrequency(rawData, sizeof(rawData), 22050);  uint32_t stop = millis();  Serial.print("freq:\t");  Serial.println(fr);  Serial.print("time:\t");  Serial.println(stop - start);}void loop() {}float autoCorrelateFrequency(char * sample, int len, float sampleFreq){  long sum = 0;  long sum_old = 0;  int thresh = 0;  byte pd_state = 0;  int period = 0;  // 0 results in inf  // Autocorrelation  for (int i=0; i < len && (pd_state != 3); i++)  {    sum_old = sum;    sum = 0;    for (int k=0; k <len-i; k++)    {      sum += (rawData[k]) * (rawData[k+i]);    }    sum /= 256;    // Peak Detect State Machine    // 0: initial, set threshold    // 1: detect positive slope    // 2: detect negative slope    // 3: peak found    if (pd_state == 0)    {      thresh = sum / 2;      pd_state = 1;    }     else if (pd_state == 1 && (sum > thresh) && (sum - sum_old) > 0)     {      pd_state = 2;    }    else if (pd_state == 2 && (sum - sum_old) <= 0)     {      period = i;      pd_state = 3;    }   }  return sampleFreq/period;}`

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

#### curro92 #2
##### Sep 22, 2014, 06:59 am
hi,
in the example as input file is used sample.h raw. How would you read samples from an analog input of Arduino?

Thanks

#### MarkT #3
##### Sep 22, 2014, 04:36 pm
Read some samples on a regular basis, shift them right two bits and store
in the char * that autoCorrelateFrequency () expects?
[ I will NOT respond to personal messages, I WILL delete them, use the forum please ]

#### jremington #4
##### Sep 23, 2014, 01:30 amLast Edit: Sep 23, 2014, 03:05 am by jremington Reason: 1
A problem with the code as posted is that the autocorrelation sum depends on the number of samples correlated, which become smaller as the index (i) increases. That is, the autocorrelation sum is not properly scaled.

I fixed that, and also eliminated the condition that the program stop as soon as it found a large peak.  A plot of the results shows a number of large peaks in the autocorrelation function, so the result is less clean than the OP claims. (The plot becomes noisier toward the right end, because the number of samples included in the sum is small and the error increases).

The inner loop now looks like this (should probably divide by (k-1), but that is a minor detail)
Code: [Select]
`    for (k=0; k <len-i; k++)    {      sum += (rawData[k]) * (rawData[k+i]);    }    sum /= k ;    fprintf(fp,"%ld\n",sum);`

Note: normally the average value of the data points is subtracted from each term, but here, the average is not large (-4).

#### robtillaart #5
##### Sep 23, 2014, 09:12 pm
@jremington
can you post your complete version?
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

#### jremington #6
##### Sep 24, 2014, 05:50 am
See below. This simplified version is for Code::Blocks on a PC, because I wanted to write out a file of the results for plotting and inspection.

Note: another problem with the original code is that the peak detect algorithm detects only the first peak (not necessarily the largest). I did not bother to fix that.

I know this post is old, and several thousand people have looked at it without commenting, so there probably isn't much interest in the technique.

Code: [Select]
`#include <stdio.h>#include <stdlib.h>#include "sample.h"float autoCorrelateFrequency(char * sample, int len, float sampleFreq){  long sum = 0;  long sum_old = 0;  int thresh = 0,i,k;  char pd_state = 0;  int period = 0;  // 0 results in inf    FILE *fp;    fp=fopen("test.txt", "w");  // Autocorrelation  for (i=0; i < len; i++)  {    fprintf(fp,"%d,%d,",i,rawData[i]);    sum_old = sum;    sum = 0;    for (k=0; k <len-i; k++)    {      sum += (rawData[k]) * (rawData[k+i]);    }    sum /= k ;  //this should be k-1, but if so the last point will result in divide by zero    fprintf(fp,"%ld\n",sum);    // Peak Detect State Machine    // 0: initial, set threshold    // 1: detect positive slope    // 2: detect negative slope    // 3: peak found    if (pd_state == 0)    {      thresh = sum / 2;      pd_state = 1;    }    else if (pd_state == 1 && (sum > thresh) && (sum - sum_old) > 0)    {      pd_state = 2;    }    else if (pd_state == 2 && (sum - sum_old) <= 0)    {      period = i;      pd_state = 3;    }  }    fclose(fp);  return sampleFreq/period;}int main(){  float fr = autoCorrelateFrequency(rawData, sizeof(rawData), 22050);  printf("freq: %f\n",fr);    return 0;}`

#### curro92 #7
##### Sep 29, 2014, 08:30 amLast Edit: Sep 29, 2014, 08:32 am by curro92 Reason: 1
hi,
something is wrong:
I have compiled the code of robtillaart, I used sample.h of instructables where I added a global variable
const float sample_freq = 22050.0;
(so every sample.h can have different sample rate)

I get the freq = 5512.5Hz result, but the frequency of instructables sample.h  is 260Hz  (with their code I get 259.4Hz)

With the source code of jremington I get the same result.

What am I doing wrong?

#### robtillaart #8
##### Sep 29, 2014, 10:43 am
please post the output of the sketch you attached.

which board are you using?
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

#### curro92 #9
##### Sep 29, 2014, 11:16 am
The output is:
Quote
samples: 970, sample_freq: 22050.000000
freq: 5512.500000

I've compiled on XP as
gcc  frequencyDSP_robtillaart.c  -o frequencyDSP_robtillaart.exe

And I have compiled in Linux Mint 17 as

gcc  frequencyDSP_robtillaart.c   -o frequencyDSP_robtillaart

The output is the same in both cases.

#### curro92 #10
##### Sep 29, 2014, 11:29 am

With this other instructables code the output is ok:

len: 970 sample_rate: 22050.000000
freq: 259.411774

#### robtillaart #11
##### Sep 29, 2014, 12:05 pm
could be related to size of datatypes?  32 bit - 16 bit
Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

#### KenF #12
##### Sep 29, 2014, 12:19 pm
Guys.  This is great work.
I would suggest that the next step would be..
Having found the dominant frequency, subtract it from the raw data.  Then go through the whole process again to find the next most dominant frequency.
This can be repeated until the rawdata minus all of the frequencies found = 0. That would be a really interesting toy.  Possibly even able to identify someone by their voice.

#### jremington #13
##### Sep 29, 2014, 07:51 pm
Just for completeness, I calculated the Fourier transform (as an FFT of length 1024, zero padded) on the original data, and the output is shown on the attached plot.

As you can see, the autocorrelation function does quite a bit better than the FFT to distinguish the frequency of interest, but there are other peaks in the autocorrelation function nearly as large, so it would be difficult to have much confidence in the result.

#### charliesixpack #14
##### Oct 08, 2014, 02:12 amLast Edit: Oct 08, 2014, 02:24 pm by charliesixpack Reason: 1
I think you may have a problem with your code.  You have defined rawData to be an array of type char.  When you perform the multiplication in the step which multiplies the two rawData values you are overflowing the type char.  The multiplication is done on type char before it is assigned to variable sum.
Code: [Select]
`    for (k=0; k <len-i; k++)    {      sum += (rawData[k]) * (rawData[k+i]);    }`
I have been bitten by this before.   Try casting rawData to long before doing the multiplication and see if you get the same answer.

I analyzed your data in Mathematica and got a much different result for the autocorrelation.  Also, you imply that autocorrelation is a frequency domain technique which it is not.  It is done strictly in the time domain.  Taking the FFT of the autocorrelation gives the power spectrum of the signal putting it in the frequency domain.

The plot below is the periodogram of the data.  Mathematica has a function to find the peaks which are found at the following indices of the data (shown as red dots on the plot)...
{13, 24, 35, 47, 59, 70, 82, 93, 105, 116, 128, 139, 151, 162, 174, \
185, 197, 208, 220, 231, 243, 254, 266, 277, 289, 300, 312, 323, 335, \
346, 358, 369, 381, 392, 404, 415, 427, 438, 450, 461, 472, 485, 487, \
500, 511, 522, 534, 545, 557, 568, 580, 591, 603, 614, 626, 637, 649, \
660, 672, 683, 695, 706, 718, 729, 741, 752, 764, 775, 787, 798, 810, \
821, 833, 844, 856, 867, 879, 890, 902, 913, 925, 937, 948, 959}

The fundamental frequency (259.09 Hz)and overtones really stand out when displayed this way.

Go Up