Go Down

Topic: Arduino pitch detection in realtime (using autocorrelation and peak detection) (Read 6799 times) previous topic - next topic

Grumpy_Mike

Basically an array, with an input pointer and an output pointer given by variables. Two ways you can arrange this but nothing much to choose between them. Have the input pointer pointing to the next free buffer space and the output pointer to the next output value. When ever you put something in or take something out you increment the pointers and make them wrap round, that is start again at zero when it reaches the size of the array.

If on incrementing the input pointer it exceeds the output pointer then your buffer has overflowed.

barthulsen

Thanks! I was thinking about the following. The sample rate determines the resolution, and it is limited by the amount of samples that are taken due to the fact that I do not want to use a lot of memory and I need at least two times the largest period for the autocorrelation to work. What if I use interrupts to get the maximum sample rate for the highest frequency, and bring it down for the other strings / frequencies by ignoring some of the samples. If I'm not mistaken, by doing this I can get a fairly good and above all constant resolution. Combined with interpolation it might be accurate enough for the project to work. What do you think about this idea?

Grumpy_Mike

Well that is known as sub sampling, you are best doing this by altering the timing on the interrupt. However the problem is that you must not fall below the Nyquist rate otherwise the aliasing will give you false results. You get this problem if you sample slower or simply ignore samples.

barthulsen

Hi,

I've been using interrupts to use the maximum sample frequency for every string. I've combined it with continuous sampling on A0. The OCR value is calculated with the following formula:

OCR1A = (time between interrupts / 0.0000000625) - 1

The time between the interrupts is calculated with the maximum period, which I expect to be at a frequency that is about 10 Hz (this is the correction) lower than the fundamental. This period is multiplied with 2, to get the desired window for the autocorrelation, and then divided by the number of samples. See below:

OCR value = 2 * (1 / (string - correction)) / arraySize / 0.0000000625 - 1

The sample frequency is calculated with this OCR value, to eliminate rounding errors.
While testing I noticed that the resolution is better when I change the 2 to 3 or 4. So instead of using 2 periods, I'm using 3 or 4 and therefore lowering the sample frequency. Actually, it's also quite unstable when using the formula above. The results with the increased number of periods are reasonable. However, I'd like to know what is wrong with my calculations and how that I can improve them. I've attached the code.

Thanks in advance,
Bart

MrMark

If I understand correctly, you're getting better results with a sample rate such that you get 3 or 4 full cycles of the fundamental frequency in your 500 sample buffer rather than just 2 full cycles.

One theory for why this might happen is that your audio source contains some non-harmonic content which is better rejected by the longer correlation interval, that is, the more cycles used, the more non-harmonic signal content decorrelates at the lag where harmonic content has peak correlation, thus distorts the correlation peak less.  What are you using for audio in your testing?

Looking at your attached code, it looks like the sample rate chosen for B2 is around 60 kHz which seems unnecessarily high in any case.

barthulsen

Hi, you're right about the increased amount of cycles and their consequence. I'm using a MyDAQ for the audio during tests. As for the high sample rate, the B2 frequency is 246.94 Hz. The formula below yields a resolution of 1 Hz (when using 60 kHz as the sample frequency).

Resolution = f^2 / fs

Why is the sample frequency of 60 kHz unnecessary high? With my knowledge I figured that this is the way to get results that are accurate enough for tuning. The weird thing is, however, that I found two projects that use this technique but do not seem to improve the resolution (which would be terrible for the high strings). I've attached the links below. As for my code, do you have any tips for me on the overall coding style and structure?
Thanks

http://www.ericthuang.com/actuating-guitar-tuner.html
https://www.google.nl/amp/www.instructables.com/id/Guitar-Tuner-With-an-Arduino/%3famp_page=true


MrMark

f^2 / fs describes the error range due to the discrete lags in correlation.  It isn't the complete error model as there will be contributions from signal to noise (and interference such as non-harmonic content theorized as the issue above).  I think a better approach to mitigating the discrete lag issue is interpolation as discussed earlier in the threat.

Aside from the discrete lag issue, the sample frequency needs only to be twice the highest frequency in the signal being digitized which can be controlled by a low pass filter prior to the ADC.  Even high fidelity audio is typically only about 40 kHz sample rate.  Autocorrelation works essentially by measuring the parts of the signal that match up cycle to cycle while rejecting parts of the signal that don't (random noise and non-harmonic cyclic content).  The amount of rejection of the latter is proportional to the integration period so for this you want a longer sample.  Given the limited memory for signal capture, this means there are conflicting requirements in that discrete lag error wants high fs and noise rejection wants a longer signal capture time.  To optimize performance the best compromise between these needs to be determined.

With regard to your coding style, I like that you clearly broke the data capture and frequency determination into their own functions.  This makes "loop" pretty obvious.  In the readData function using "string" as a variable name is not descriptive of what it means and the very similar "String()" is a keyword.  Both of these are bad practice.  In a couple places you use "0.0000000625" which is the reciprocal of 16e6, the MCU master clock frequency.  Using the latter would be more clear.

barthulsen

MrMark, thank you for the very clear explanation and the feedback on my coding. My next step will be to implement interpolation and use lower sample frequencies. Is there a structural way to determine the best relation between the sample frequency and signal capture time, besides trial and error?

barthulsen

So I've implemented interpolation and it works well! For now, I'm keeping the OCR calculation. My main reason for this is that I can easily play around with the signal capture time and try to find the best relation with the sample frequency. The results are quite good, with resolutions that go below 1 Hz. However, there are some weird errors. When testing with the MyDAQ at a constant frequency, the Serial Monitor would plot, for example, 10 good results that are exactly the same and then plots 3 or 4 results (that are very different from each other) with a large error. This goes on at a constant rate. What could be the problem? To get around this issue I might run the calculation a few times and choose the frequeny that is calculated multiple times as the result, or use the average. See the interpolation code below:

Code: [Select]
    else if (pdState == 2 && (currentSum - previousSum) <= 0) {
      // quadratic interpolation
      for (int k = 0; k < arraySize - i + 1; k++) nextSum += (rawData[k] - mean) * (rawData[k + i + 1] - mean);
      float interpolationValue = 0.5 * (nextSum - previousSum) / (2 * currentSum - previousSum - nextSum);
      period = i + interpolationValue;
      pdState = 3;
    }


Bart

MrMark

#39
Apr 21, 2018, 03:06 am Last Edit: Apr 21, 2018, 03:07 am by MrMark Reason: Insert photo link
I think your code now looks something like the following.  I made the buffer 1000 samples which seems to help reduce the noise (as it should), but slows down the autocorrelation rate (also expected).   Using a function generator on my cell phone and going between 320 Hz and 350 Hz sine wave, I got the plot below.  The microphone had to be very close to the phone's speaker.  There's some noise in the measurement and a few outliers, but generally it seems to do pretty well.
Code: [Select]
// Arduino Pitch Detection on A0 with autocorrelation and peak detection
// Original author(s): akellyirl, revised by robtillaart and MrMark

#define E1 329.63
#define B2 246.94
#define G3 196.00
#define D4 146.83
#define A5 110.00
#define E6 82.41

float sampleFrequency;
float frequencyRange = 10;
volatile int interruptCount = 0;
const int arraySize = 1000;
volatile byte rawData[arraySize];
long currentSum, previousSum, nextSum ;
int threshold = 0;
float frequency = 0;
byte pdState = 0;

void setup() {
  noInterrupts();
  TCCR1A = 0;
  TCCR1B = 0;
  TIMSK1 = 0;
  TCCR1B |= (1 << WGM12);
  TCCR1B |= (1 << CS10);
  TIMSK1 |= (1 << OCIE1A);

  ADCSRA = 0;
  ADCSRB = 0;
  ADMUX |= (1 << REFS0);
  ADMUX |= (1 << ADLAR);
  ADCSRA |= (1 << ADPS1);
  ADCSRA |= (1 << ADATE);
  ADCSRA |= (1 << ADEN);
  ADCSRA |= (1 << ADSC);

  Serial.begin(115200);
  pinMode(LED_BUILTIN, OUTPUT);
}

ISR(TIMER1_COMPA_vect) {
  // Collect analog signal
  rawData[interruptCount] = ADCH;
  interruptCount ++;
}

void readData(float baseFreq) {
  int OCRvalue = 2 * (16000000 / (baseFreq - frequencyRange)) / arraySize - 1;
  OCR1A = OCRvalue;
  sampleFrequency = 16000000 / (OCRvalue + 1) ;
  interrupts();
  while (interruptCount != arraySize); // wait for data
  noInterrupts();
  interruptCount = 0;
}

void findFrequency() {
  // Calculate mean to remove DC offset
  long meanSum = 0;
  for (int j = 0; j < arraySize; j++) {
    meanSum += rawData[j];
  }
  char mean = meanSum / arraySize;
  // Autocorrelation
  currentSum = 0;
  pdState = 0;
  int period = 0;
  for (int i = 0; i < arraySize && (pdState != 3); i++) {
    // Autocorrelation
    previousSum = currentSum;
    currentSum = 0;
    for (int k = 0; k < arraySize - i; k++) currentSum += (rawData[k] - mean) * (rawData[k + i] - mean);
    // Peak detection
    if (pdState == 0) {
      threshold = currentSum / 2;
      pdState = 1;
    }
    else if (pdState == 1 && (currentSum > threshold) && (currentSum - previousSum) > 0) {
      pdState = 2;
    }
    else if (pdState == 2 && (currentSum - previousSum) <= 0) {
      // quadratic interpolation
      nextSum = 0 ;
      for (int k = 0; k < arraySize - i + 1; k++) nextSum += (rawData[k] - mean) * (rawData[k + i + 1] - mean);
      float interpolationValue = 0.5 * (nextSum - previousSum) / (2 * currentSum - previousSum - nextSum);
      period = i + interpolationValue;
      pdState = 3;
    }
  }

  // Frequency identified in Hz
  if (threshold > 100) {
    frequency = sampleFrequency / period;
    if (frequency < 400) {
      Serial.println(frequency);
    }
  }
}

void loop() {
  digitalWrite(LED_BUILTIN, HIGH);
  readData(D4);
  digitalWrite(LED_BUILTIN, LOW);
  findFrequency();
}



barthulsen

Hi MrMark,

You're right, the code is almost exactly the same as the current one on my laptop. I'll try to change the buffer to 1000 samples. I agree with you that the results are quite good, but I noticed that you've set the readData function to D4. Wouldn't it be optimized for frequencies from 320 to 350 Hz if you'd choose E1? Asside from that, I believe that with all the help I got from you guys I have a fair change of succeeding in the project :)

As for the weird error that I've written about in my previous post, do you have any idea what could be the cause?

Thanks,
Bart

barthulsen

So, I've been thinking about the results and their stability. It's not really good. I can optimize it for certain frequencies by playing around with the signal capture time, but then the results for other frequencies are heavily influenced by errors. Could the combination of continuous sampling and the use of interrupts be the cause of instability? As i mentioned before, the resolution is quite good, but there are errors that show up at a constant rate and mess things up. I figured that a solution (or a way to isolate the cause of error) could be to use continuous sampling with interrupts that are generated on the reading of a new value, instead of using interrupts to control the sample frequency. The question that I want to ask about this is the following: Can I alter the samplefrequency on an analog port as I did with the OCR timer value (during a program)? Next to that I'm not sure how to implement a change of samplefrequency on a port that is continuously sampling. I'm looking forward to your comments.

MrMark

I've had a bit more time to play with this.  There are two significant changes in this version.  First the data capture uses the ADC in the continuous mode with an interrupt service routine to move the data into the buffer.  Sample rate is fixed at 9615.4 samples per second.  The second change is a correction to the interpolation routine which previously had the peak correlation value as one of the endpoints rather than the center of the three points.  

I've changed the peak detection algorithm implementation to use "case" statements rather than a cascaded "if" statements.  It should be functionally the same, but I think peak detection could be made more robust and this was a first step toward a revision.  Eventually I'd like peak detection to evaluate multiple correlation peaks rather than stopping at the first it finds over the threshold.

Finally I opened up the frequency window test to 100-2000 Hz, largely because the phone app doesn't put out much audio power below about 200 Hz, so it gives me more range to test over. In limited testing this implementation seems to mostly put out accurate values if it puts out anything at all.

Code: [Select]
// Arduino Pitch Detection on A0 with autocorrelation and peak detection
// Original author(s): akellyirl, revised by robtillaart, MrMark, barthulsen
// http://forum.arduino.cc/index.php?topic=540969.15
// Continuous ADC ala http://www.instructables.com/id/Arduino-Audio-Input/ "Step 7"

#define sampleFrequency 9615.4
#define bufferSize 1024

volatile byte  rawData[bufferSize] ;  // Buffer for ADC capture
volatile int sampleCnt ;                    // Pointer to ADC capture buffer
long currentSum, previousSum, twoPreviousSum ;
int threshold = 0;
float frequency = 0;
byte pdState = 0;

void setup() {

  Serial.begin(115200) ;
  pinMode(LED_BUILTIN, OUTPUT);
  cli();//disable interrupts

  //set up continuous sampling of analog pin 0
  //clear ADCSRA and ADCSRB registers
  ADCSRA = 0 ;
  ADCSRB = 0 ;

  ADMUX |= (1 << REFS0) ; //set reference voltage
  ADMUX |= (1 << ADLAR) ; //left align the ADC value- so we can read highest 8 bits from ADCH register only

  ADCSRA |= (1 << ADPS2) | (1 << ADPS1) | (1 << ADPS0); // ADC clock 128 prescaler- 16mHz/128=125kHz->9615 sps
  ADCSRA |= (1 << ADATE); //enable auto trigger
  ADCSRA |= (1 << ADIE) ; //enable interrupts when measurement complete
  ADCSRA |= (1 << ADEN) ; //enable ADC
  ADCSRA |= (1 << ADSC) ; //start ADC measurements
}

ISR(ADC_vect) {     // When ADC sample ready, put in buffer if not full
  if (sampleCnt < bufferSize)
  {
    rawData[sampleCnt] = ADCH ;
    sampleCnt++ ;
  }
}

void readData() {
  sampleCnt = 0 ;
  sei() ;         // Enable interrupts, samples placed in buffer by ISR
  while (sampleCnt < bufferSize) ;  // Spin until buffer is full
  cli() ;         // Disable interrupts
}

void findFrequency() {
  // Calculate mean to remove DC offset
  long meanSum = 0 ;
  for (int k = 0; k < bufferSize; k++) {
    meanSum += rawData[k] ;
  }
  char mean = meanSum / bufferSize ;
  // Remove mean
  for (int k = 0; k < bufferSize; k++) {
    rawData[k] -= mean ;
  }

  // Autocorrelation
  currentSum = 0 ;
  pdState = 0 ;
  for (int i = 0; i < bufferSize && (pdState != 3); i++) {
    // Autocorrelation
    float period = 0 ;
    twoPreviousSum = previousSum ;
    previousSum = currentSum ;
    currentSum = 0 ;
    for (int k = 0; k < bufferSize - i; k++) {
      currentSum += char(rawData[k]) * char(rawData[k + i]) ;
    }
    // Peak detection
    switch (pdState) {
      case 0:   // Set threshold based on zero lag autocorrelation
        threshold = currentSum / 2 ;
        pdState = 1 ;
        break ;
      case 1:   // Look for over threshold and increasing
        if ((currentSum > threshold) && (currentSum - previousSum) > 0) pdState = 2 ;
        break ;
      case 2:   // Look for decreasing (past peak over threshold)
        if ((currentSum - previousSum) <= 0) {
          // quadratic interpolation
          float interpolationValue = 0.5 * (currentSum - twoPreviousSum) / (2 * previousSum - twoPreviousSum - currentSum) ;
          period = i - 1 + interpolationValue ;
          pdState = 3 ;
        }
        break ;
      default:
        pdState = 3 ;
        break ;
    }

    // Frequency identified in Hz
    if (threshold > 100) {
      frequency = sampleFrequency / period;
      if (frequency < 2000) {
        Serial.println(frequency);
      }
    }
  }
}

void loop() {
  digitalWrite(LED_BUILTIN, HIGH);
  readData();
  digitalWrite(LED_BUILTIN, LOW);
  findFrequency();
}

barthulsen

Thanks MrMark for your help. I'll test the code today. I've got only one question. Is it possible and usefull to change to constant samplefrequency that you've implemented to a higher one for the high frequencies / strings?

Edit:
So I've tested the code. The problem with my code was that I had implemented interpolation wrong, and the problem with yours was that the sample buffer was too large. I don't know why, but when I changed it to 512 samples (instead of 1024) it worked unbelievably well! With the larger amount of samples it puts out almost nothing, like you said.

MrMark

The sample frequency can be adjusted in the code of post #42 by setting the values of the A/D pre-scale bits (ADPS0,1,2) in the A/D control/status register A (ADCSRA) register.  The A/D clock is FCPU (16 MHz) divided by the pre-scale (128 in code above) divided by the clocks per A/D output (13).  The pre-scale is a power of two so control of the A/D sample rate frequency is pretty coarse (9615.4 Hz, 19231 Hz, 38462 Hz, ...) .

Code: [Select]
ADCSRA |= (1 << ADPS2) | (1 << ADPS1) | (1 << ADPS0); // ADC clock 128 prescaler- 16mHz/128=125kHz->9615 sps

Atmel-42735-8-bit-AVR-Microcontroller-ATmega328-328P_Datasheet.pdf, see "28.9.2. ADC Control and Status Register A"

Interesting that you got better results with a shorter sample length, I hadn't explored that yet, but was wondering if 1k samples was excessive.  In particular, when this is extended to do multiple peak detection, it will have to calculate the correlation at more lags, not just until it finds the first peak, and that's going to be slow with an excessively long sample buffer.

In the limited testing that I did, I was seeing frequency reports that were within 1% of what I expected.  Most of the error looked like bias rather than random error, which would be consistent with the Arduino Uno clone's ceramic resonator being the dominant error source.

Go Up