Arduino pitch detection in realtime (using autocorrelation and peak detection)

Hello,

I'm trying to edit the code below to make it work for an analog signal from a microphone. The goal is to detect the fundamental frequency in realtime using autocorrelation and peak detection. I'm expecting multiple harmonics, that is why i'm using this algorithm and not a frequencycounter for example. I'm encountering a number of problems:

Due to the fact that this code is using the datatype unsigned char (the rawData array), i'm not sure how to implement readings from the analog port. How do i adjust the code to make it work for analog values?
Next to that, i've noticed that the array uses a lot of memory. I'd like to know if there is a way to handle individual values, instead of saving them all to an array first. I've tried to fill the array with analog values devided by 4, and also added some code to start recording when a certain level is reached. These changes were not succesfull.

The code was written by akellyirl and revised by robtillaart. I'm an engineering student and have basic knowledge about programming Arduino and maths concerning pitch detection. Please have a look at the code below and thank you in advance. I've also attached the link to the original topic:
https://forum.arduino.cc/index.php?topic=195085.0

//    FILE: frequencyDSP.ino
//  AUTHOR: rob tillaart (org version akellyirl)
// VERSION: 0.1.03
// PURPOSE: frequency analysis using DSP
//    DATE: 2013-10-27
//     URL: http://www.instructables.com/id/Reliable-Frequency-Detection-Using-DSP-Techniques

#include "sample.h"

void setup()
{
  Serial.begin(115200);
  Serial.println("FrequencyDSP");
  Serial.println("------------");

  uint32_t start = millis();
  float fr = autoCorrelateFrequency(rawData, sizeof(rawData), 22050);
  uint32_t stop = millis();

  Serial.print("freq:\t");
  Serial.println(fr);
  Serial.print("time:\t");
  Serial.println(stop - start);
}

void loop()
{
}

float autoCorrelateFrequency(char * sample, int len, float sampleFreq)
{
  long sum = 0;
  long sum_old = 0;
  int thresh = 0;
  byte pd_state = 0;
  int period = 0;  // 0 results in inf

  // Autocorrelation
  for (int i=0; i < len && (pd_state != 3); i++)
  {
    sum_old = sum;
    sum = 0;

    for (int k=0; k <len-i; k++)
    {
      sum += (rawData[k]) * (rawData[k+i]);
    }
    sum /= 256;

    // Peak Detect State Machine
    // 0: initial, set threshold
    // 1: detect positive slope
    // 2: detect negative slope
    // 3: peak found
    if (pd_state == 0)
    {
      thresh = sum / 2;
      pd_state = 1;
    }
    else if (pd_state == 1 && (sum > thresh) && (sum - sum_old) > 0)
    {
      pd_state = 2;
    }
    else if (pd_state == 2 && (sum - sum_old) <= 0)
    {
      period = i;
      pd_state = 3;
    }
  }

  return sampleFreq/period;
}

Hello again,

Some extra information that could be usefull. I'm using the Arduino Uno and the microphone is a simple MAX9812 mic with a fixed gain of 20 dB.

Greetings,
Bart

The analog examples in the IDE show you exactly how to implement an analog input.

Paul

I'd like to know if there is a way to handle individual values, instead of saving them all to an array first.

If you don’t store all the values in an array how are you going to search for a peak or do an auto correction? You need to know the history of the waveform to make it work.
You could implement a circular buffer to store the last N samples that might help but I don’t hold out much hope for your compleat project.

Paul_KD7HB:
The analog examples in the IDE show you exactly how to implement an analog input.

Paul

Thank you Paul, i figured that I can use mapping to fix the unsigned char problem. Am I right?

Grumpy_Mike:
If you don’t store all the values in an array how are you going to search for a peak or do an auto correction? You need to know the history of the waveform to make it work.
You could implement a circular buffer to store the last N samples that might help but I don’t hold out much hope for your compleat project.

Thanks for your quick reply. I don't expect that storing the entire array will cause any problems. Just to be safe, and not run out of memory, I wondered if there was an alternative. What are your reasons for having little hope at succes for this project, and do they change if I use the full array?

Bart

A couple points . . .

  1. rawData is an array of one byte signed integer, not unsigned.

  2. analogRead returns an int (16 bit signed), with 10 bits unsigned significant. The options are to recast this to char and use autoCorrelateFrequency as is or to modify that function to use ints.

  3. To capture an AC audio input, the AC signal needs to be DC biased to the midpoint of the A/D converter and then the offset needs to be subtracted from the digitized value to retain only the AC component. Errors in doing this will significantly degrade the correlator performance.

Thus the data capture looks something like this:

   for (int k=0; k<analogSamples; k++)
  {
    // Read analog value, subtract offset, shift to discard 2 lsbs, recast to char
    rawData[k] = char((analogRead(analogPin)-512) >> 2) ;
  }

I played around a bit with this starting with robtillaart's code incorporating some of the comments from this thread and analog input as per my post above. I had it working reasonably well, though with the odd one-measurement-per-reset construct before discovering that the original poster had a version with analog input on github.

I've adapted that code by adding a dynamic DC offset correction and some other cosmetic changes to get to the following which works reasonably well. I doubt it has sufficient accuracy to be all that useful as a guitar tuning aid. As I noted in the old thread some sort of interpolation between correlation points would be a nice addition.

// https://github.com/akellyirl/Arduino-Guitar-Tuner/blob/master/Tuner_aRead.ino

#define LENGTH 512

byte rawData[LENGTH];
int count;

// Sample Frequency in kHz
const float sample_freq = 8919;

int len = sizeof(rawData);
int i, k;
long sum, sum_old;
int thresh = 0;
float freq_per = 0;
byte pd_state = 0;

void setup() {
  // analogReference(EXTERNAL);   // Connect to 3.3V
  analogRead(A0);
  Serial.begin(115200);
  count = 0;
}

void loop() {
  // Collect analog signal
  for (count = 0; count < LENGTH; count++) {
    rawData[count] = analogRead(A0) >> 2;
  }

  // Calculate mean to remove DC offset
  long meanSum = 0 ;
  for (count = 0; count < LENGTH; count++) {
    meanSum += rawData[count] ;
  }
  char mean = meanSum / LENGTH ;

  // Autocorrelation stuff
  sum = 0;
  pd_state = 0;
  int period = 0;
  for (i = 0; i < len; i++)
  {
    // Autocorrelation
    sum_old = sum;
    sum = 0;
    for (k = 0; k < len - i; k++) sum += (rawData[k] - mean) * (rawData[k + i] - mean) ;
    // Serial.println(sum);

    // Peak Detect State Machine
    if (pd_state == 2 && (sum - sum_old) <= 0)
    {
      period = i;
      pd_state = 3;
    }
    if (pd_state == 1 && (sum > thresh) && (sum - sum_old) > 0) pd_state = 2;
    if (!i) {
      thresh = sum * 0.5;
      pd_state = 1;
    }
  }
  // for(i=0; i < len; i++) Serial.println(rawData[i]);
  // Frequency identified in Hz
  if (thresh > 100) {
    freq_per = sample_freq / period;
    Serial.println(freq_per);
  }
}

Dear MrMark,

Thank you a lot for your reply. As for the accuracy, I'm wondering if applying some digital filters would improve the results. Do you have any experience with them? I've also read about the YIN algorithm. Would that be a better choice for this application?

Bart

Another question, how did you determine the sample frequency?

Thanks in advance,
Bart

barthulsen:
Dear MrMark,

Thank you a lot for your reply. As for the accuracy, I'm wondering if applying some digital filters would improve the results. Do you have any experience with them?

Without having done any analysis, it's not obvious to me how filtering helps here excepting removing the DC component as noted above. That said, filtering out frequencies out of the range of interest couldn't hurt. This technique will degrade in the presence of aliased frequencies, that is frequencies above half the sampling rate, but those have to be filtered prior to digitization.

I've also read about the YIN algorithm. Would that be a better choice for this application?

I haven't had a chance to study the paper, but they claim it is an improvement on straight autocorrelation.

Just looking at the present algorithm, if a peak detection algorithm that can recognize correlation peaks beyond the first could be implemented, it would mitigate the frequency resolution degradation with increasing input frequency issue. As noted above interpolating between sample points should also help resolution.

barthulsen:
Another question, how did you determine the sample frequency?

In the unposted version of the code, I captured the Arduino "millis()" before and after the "Collect analog sample" loop and divided the number of samples by the difference to get the sample rate (about 8200). I had an additional math operation in that version so the 8919 in the github code posted above looked plausible enough that I left it as is for this exercise.

Hello,

After some further testing it appears that your code, MrMark, is making constant errors. With some statistics i might be able to eleminate them and get good results. As for the filtering i will implement a piece of code that only uses analog readings in the range that i need.

I'm new to interpolating. Therefore i don't understand your comment on making it more accurate. I do appreciate the help ofcourse!

Bart

How can i convert the peak detection to work for multiple harmonics? In this tutorial it is done, but i don't know how to adapt the code above.

barthulsen:
I'm new to interpolating. Therefore i don't understand your comment on making it more accurate. I do appreciate the help of course!

Autocorrelation produces outputs at discrete time offsets, corresponding to discrete frequencies. In the diagram these are represented by the red dots. By applying a spline fit to the discrete data one can see that a better estimate of the peak (black arrow) might be somewhere in between the highest point (blue arrow) and the highest point adjacent to it.

corrPeak.PNG

That's a very clear explanation, thanks. Could this also be the cause for the error that gets larger when the frequency gets higher? The first thing I'm going to try is to use a frequency sweep to determine a correction formula.

Sometimes the output is a harmonic and not the fundamental frequency. Therefore i'll also try to check all the peeks with the peak detection state machine. I'd greatly appreciate if someone would help me out with that.

Sometimes the output is a harmonic and not the fundamental frequency.

Yes that is the main reason why projects of this type do not work correctly.

I'd greatly appreciate if someone would help me out with that.

That is the big problem, crack that and you are doing well. Many times the strongest harmonic is not the fundamental.

Hi,

I think the solution is to check multiple peaks in the autocorrelation function, and pick out the one with the lowest frequency as the fundamental. Am I right? Therefore I need to convert the peak detection state machine to check multiple peaks. That is the problem for me at the moment, I'm not sure how to do that.

Thanks,
Bart

The main problem is that I do not understand this piece of code:

if (!i) {
      thresh = sum * 0.5;
      pd_state = 1;
    }

Why the !i? And what does the variable thresh do?

Bart

Why the !i

It means do it on only the first iteration of the loop. Some one clever writing the code would not try and be so clever and just write

if(i != 0)

And what does the variable thresh do

It calculate the threshold, to be used later, as half the total sum.

It would be better as

thresh = sum / 2;

As then it would work for both integer or float numbers.

Thank you for the explenation! What do you think about my idea to use a frequency sweep to determine a correction formula? I'm noticing that the code is printing very constant outputs, but they're off with an error that increases with the frequency. Next to that i'll convert the peak detection to work for multiple peaks and choose the one with the lowest frequency as the fundamental.

If you are printing as you are sampleing that will screw up your sample time.

What do you think about my idea to use a frequency sweep to determine a correction formula?

Given the fact that I don’t think you have much chance of getting your project to work then not much.