how to interpret the magnitude of FFT

Hi,

In one of my projects, I record audio using a mic connected to a PC, and calculate the FFT using Python. I used PyAudio for the recording. Upon calculating the magnitude, I noticed that its range can vary depending on the format (16 bit vs 32 bit) of the recording. I don't know if I did something wrong or if there is an explanation for this. So how do you interpret a magnitude of, say, 150 at 2000Hz or a magnitude of 1200 at 4000Hz? Are there any physical meanings to the numbers or are they meaningful only in a relative sense?

Furthermore, I want to take the audio data and convert it to an A-weighted decibel reading much like those given in handheld decibel meters. Is this something I can do from the FFT? A simple example would be nice.

Thanks

Are there any physical meanings to the numbers or are they meaningful only in a relative sense?

Of course there is a physical meaning. The result of the Fourier transform is just another way of representing your data.

In your case, you start with some amplitude data as a function of time, and after transforming, you get amplitude and phase data as a function of frequency. In theory you can reverse Fourier transform the result and get back the exact, original data.
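The round trip is easy to check numerically. Here is a minimal sketch in plain Python, using a naive DFT instead of a real FFT library (the results are the same, just slower); the sample values are made up for illustration.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse transform; divides by N so that idft(dft(x)) returns x."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

samples = [0.0, 3.0, -1.5, 2.0, 4.0, -2.5, 1.0, 0.5]  # arbitrary made-up data
spectrum = dft(samples)                                # amplitude and phase per bin
recovered = [value.real for value in idft(spectrum)]   # back to the time domain

# Largest difference between the original and the recovered samples
max_error = max(abs(a - b) for a, b in zip(samples, recovered))
print(max_error)  # tiny, limited only by floating-point rounding
```

Note that the 1/N factor is placed in the inverse transform here. Libraries differ on where they put it, which is one reason raw magnitudes from different FFT programs do not match.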

However, the Fast Fourier Transform (FFT) algorithm makes some assumptions about your data (e.g. periodicity) that may introduce additional errors, and the mere acts of sampling and digitization have their own, extremely important consequences. As long as you obey the sampling rules, though, the FFT result amplitudes are meaningful in the absolute sense.

There are many good tutorials and analyses on the web if you want to learn more.

Nothing is calibrated unless/until you calibrate it. :wink: Different microphones have different sensitivity, microphone preamps have different gains, your recording software probably has adjustable gain, and ADCs have different sensitivities. And... Who knows what your FFT software is doing.

Do you have an SPL meter?

Decibels are a relative logarithmic measurement. The formula* is 20 x log10(A/Aref).

So as an example, let's say 80dB SPL reads 150. Now you have a reference. Then if we get a reading of 300 we can calculate the dB difference. 20 x log(300/150) = +6dB. So your SPL level is 86dB.
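That arithmetic can be sketched in a few lines of Python. The function name and the calibration numbers (a reading of 150 at 80dB SPL) are just the hypothetical values from the example above, not real calibration data.

```python
import math

def spl_from_reading(reading, ref_reading, ref_spl_db):
    """Convert a linear amplitude reading to dB SPL using one calibration point.

    ref_reading is the reading observed while an SPL meter showed ref_spl_db.
    """
    return ref_spl_db + 20.0 * math.log10(reading / ref_reading)

# Calibration point from the example: a reading of 150 corresponds to 80 dB SPL
level = spl_from_reading(300.0, ref_reading=150.0, ref_spl_db=80.0)
print(round(level, 2))  # 86.02
```

This only holds if nothing in the chain (mic, gain settings, sample format, FFT scaling) changes between the calibration and the measurement.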

Furthermore, I want to take the audio data and convert it to an A-weighted decibel reading

For a single frequency you should be able to apply an adjustment factor based on the A-weighting curve. For example, from the low-resolution curve I just looked up, it looks like you subtract roughly 19dB at 100Hz.

But, off the top of my head I'm not sure how you combine all of the frequency bands at once to get the A-weighted SPL level of real-world sounds. Maybe you just sum-up the A-weighted amplitudes before making the dB calculation, but it might not be that simple. And, there is usually "leakage"... A pure 1kHz tone may "leak" into the other FFT bins.
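For what it's worth, one common way to combine bands is to add them up as power (amplitude squared) after applying the A-weighting gain to each bin, and only then take the logarithm. Below is a rough sketch under several assumptions: the bin amplitudes are already calibrated in the same units as `ref_amplitude`, leakage is ignored, and the weighting uses the usual closed-form A-weighting expression normalised to 0dB at 1kHz. The function names are made up for this example.

```python
import math

def a_weighting_db(f):
    """A-weighting gain in dB at frequency f (Hz), closed-form curve."""
    f2 = f * f
    ra = (12194.0**2 * f2 * f2) / (
        (f2 + 20.6**2)
        * math.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2)
    )
    return 20.0 * math.log10(ra) + 2.00  # +2.00 dB puts 1 kHz at 0 dB

def a_weighted_level_db(freqs_hz, amplitudes, ref_amplitude=1.0):
    """Sum A-weighted power over all bins, then convert the total to dB."""
    total_power = 0.0
    for f, a in zip(freqs_hz, amplitudes):
        if f <= 0.0 or a <= 0.0:
            continue  # skip the DC bin and empty bins
        gain = 10.0 ** (a_weighting_db(f) / 20.0)  # dB -> linear amplitude gain
        total_power += (a * gain) ** 2
    return 10.0 * math.log10(total_power / ref_amplitude**2)

# Three made-up bins with equal amplitude: the 100 Hz bin contributes far less
freqs = [100.0, 1000.0, 4000.0]
amps = [1.0, 1.0, 1.0]
print(a_weighting_db(100.0), a_weighting_db(1000.0))
print(a_weighted_level_db(freqs, amps))
```

To turn the result into dB(A) SPL you would still need a calibration point, i.e. a `ref_amplitude` that corresponds to a known sound pressure level.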

*That's the amplitude formula. The power formula is 10 x log10(P/Pref). That is, when you double the amplitude (such as doubling the voltage), the power goes up by a factor of 4, and that's a 6dB change in either case.

DVDdoug:
Nothing is calibrated unless/until you calibrate it. :wink: Different microphones have different sensitivity, microphone preamps have different gains, your recording software probably has adjustable gain, and ADCs have different sensitivities. And… Who knows what your FFT software is doing.

Thanks.

So if I’m only interested in the relative difference between two sounds, and I use the same setup for both, I can just compare the FFT outputs, right?

Define “the relative difference”.

The FFT output reflects the microphone/amplifier sensitivity and frequency response, plus any artifacts that derive from sampling and aliasing, in addition to the frequency and amplitude contents of the sounds themselves.

If you look at the Arduino FFT examples, you will see it can handle a MAXIMUM of 64 discrete samples. Will this work for your project?

Paul

the Arduino FFT examples, you will see it can handle a MAXIMUM of 64 discrete samples

Neither of the "Arduino FFT" libraries I have used suffers from that limitation.

http://wiki.openmusiclabs.com/wiki/ArduinoFFT (256 samples)
or
GitHub - kosme/arduinoFFT: Fast Fourier Transform for Arduino (128 samples on an Uno)

jremington:
Define "the relative difference".

The FFT output reflects the microphone/amplifier sensitivity and frequency response, plus any artifacts that derive from sampling and aliasing, in addition to the frequency and amplitude contents of the sounds themselves.

By "relative difference", I mean this: if I use the same setup to record 2 different sounds (let's say a constant 1000Hz tone, but one has twice the amplitude of the other) for 1 sec each, and calculate the FFT using the same program, I should expect "roughly" double the FFT magnitude at around the 1000Hz mark, right?

Thanks

If all the noted restrictions are obeyed, the FFT result will be 2.00x, not “roughly double”. This is a quantitative measurement.

You can go about this intelligently! Test the FFT with known data, where you know what to expect, as in the following example.

/*
 fft_test_sine
 example sketch for testing the OpenMusicLabs fft library.
 This generates a simple sine wave data set consisting
 of two frequencies f1 and f2, transforms it, calculates
 and prints the amplitude of the transform.
 */

// do #defines BEFORE #includes
#define LIN_OUT 1 // use the lin output function
#define FFT_N 64 // set to 64 point fft

#include <FFT.h> // include the library

void setup() {
  Serial.begin(9600); // output on the serial port
  int i,k;
  float f1=2.0,f2=5.0;  //the two input frequencies (bin values)

  for (i = 0 ; i < FFT_N ; i++) { // create samples
    // amplitudes are 1000 for f1 and 500 for f2
    k=1000*sin(2*PI*f1*i/FFT_N)+500.*sin(2*PI*f2*i/FFT_N);
    fft_input[2*i] = k; // put real data into even bins
    fft_input[2*i+1] = 0; // set odd bins to 0
  }
  
//  fft_window();  //Try with and without this line, it smears

  fft_reorder(); // reorder the data before doing the fft
  fft_run(); // process the data using the fft
  fft_mag_lin(); // calculate the magnitude of the output

  // print the frequency index and amplitudes

  Serial.println("bin  amplitude");
  for (i=0; i<FFT_N/2; i++) {
    Serial.print(i);
    Serial.print("       ");
    Serial.println(2*fft_lin_out[i]); //*2 for "negative frequency" amplitude
  }
  Serial.println("Done");
}

void loop() {}
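Since the original question was about Python, here is a rough equivalent of the same kind of test there, assuming nothing beyond the standard library. It uses a naive single-bin DFT rather than a real FFT routine, and the amplitudes (500 and 1000) and bin number are arbitrary test values.

```python
import cmath
import math

def bin_magnitude(samples, k):
    """Magnitude of DFT bin k, scaled so a bin-centered sine returns its amplitude."""
    n = len(samples)
    acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
    return 2.0 * abs(acc) / n  # *2 for the "negative frequency" half, /n for scaling

def tone(amplitude, cycles, n):
    """n samples of a sine that completes a whole number of cycles in the record."""
    return [amplitude * math.sin(2 * math.pi * cycles * t / n) for t in range(n)]

N = 64
quiet = bin_magnitude(tone(500.0, 5, N), 5)   # amplitude 500 in bin 5
loud = bin_magnitude(tone(1000.0, 5, N), 5)   # amplitude 1000 in bin 5

# The ratio of the two magnitudes follows the ratio of the input amplitudes
print(quiet, loud, loud / quiet)
```

With floating-point data and a tone that sits exactly on a bin, the ratio comes out as 2 to within rounding error.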

paulwece:
By "relative difference", I mean this: if I use the same setup to record 2 different sounds (let's say a constant 1000Hz tone, but one has twice the amplitude of the other) for 1 sec each, and calculate the FFT using the same program, I should expect "roughly" double the FFT magnitude at around the 1000Hz mark, right?

I second your doubt about exact signal capture. The resulting ratio should be quite close to 2, but not a log(2) or something else.

Do you know how to enforce a frequency mark at 1kHz in the FFT result? Remember that the amplitude will be split into two if it does not match a discrete step on the frequency scale.

The resulting ratio should be quite close to 2

To better than 1 part in 100 using the integer data OpenMusicLabs library, and to better than 1 part in about 10^4 using the floating point data "Arduino FFT" library.

Depending on the number of samples, accuracy and granularity of the digitized signal amplitudes, and how equidistant they are in time. Noise and aliasing also can disturb the result.

Very helpful, thanks.

DrDiettrich:
Do you know how to enforce a frequency mark at 1kHz in the FFT result? Remember that the amplitude will be split into two if it does not match a discrete step on the frequency scale.

Set the frequency resolution (which depends on sample rate, number of samples…) so that, counting up from 0, 1kHz lands on a discrete step? So a frequency resolution of 5Hz would work?
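The check itself is simple: the resolution is sample rate divided by number of samples, and a tone lands on a bin when frequency divided by resolution is a whole number (1000 / 5 = 200, so 5Hz would work). A small sketch of what happens on and off a bin, with an assumed sample rate of 3200Hz and 64 samples chosen only to keep the numbers small:

```python
import cmath
import math

def bin_magnitude(samples, k):
    """Magnitude of DFT bin k, scaled so a bin-centered sine returns its amplitude."""
    n = len(samples)
    acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
    return 2.0 * abs(acc) / n

def sine(freq_hz, sample_rate, n, amplitude=1.0):
    """n samples of a sine at freq_hz, sampled at sample_rate."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n)]

SAMPLE_RATE = 3200.0            # Hz, assumed for this example
N = 64                          # number of samples
resolution = SAMPLE_RATE / N    # 50 Hz per bin

results = {}
for freq in (1000.0, 1025.0):   # on a bin (20.0) and halfway between bins (20.5)
    data = sine(freq, SAMPLE_RATE, N)
    results[freq] = (bin_magnitude(data, 20), bin_magnitude(data, 21))
    print(freq, freq / resolution, results[freq])
```

The 1000Hz tone shows its full amplitude in bin 20 and essentially nothing in bin 21, while the 1025Hz tone is shared between the two bins, each reading well below the true amplitude.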

jremington:
If all the noted restrictions are obeyed, the FFT result will be 2.00x, not "roughly double". This is a quantitative measurement.

I simulated the FFT on generated sine curves with known frequency and amplitude and it works fine. It's just that when I record a sound from the real world, I'm not sure how to interpret it. For example, does the magnitude double if I encode using 16 bit rather than 8 bit, if everything else is the same?

if I encode using 16 bit rather than 8 bit, if everything else is the same?

The decimal value 100 has a very well defined and universally understood meaning, regardless whether it is stored in a 16 bit integer variable or an 8 bit integer variable.

But, as mentioned in reply #2, the amplitudes don't mean anything until you have calibrated your entire setup.

jremington:
The decimal value 100 has a very well defined and universally understood meaning, regardless whether it is stored in a 16 bit integer variable or an 8 bit integer variable.

Based on some reading, here's how I understand how it roughly works:

Microphone outputs a voltage based on sound pressure. The microphone voltage is amplified and the amplified voltage goes into an ADC. The amplifier gain is such that the maximum microphone output is matched to the maximum input voltage of the ADC. The ADC converts the input voltage into a number, and this number's magnitude can depend on the number of bits. So let's say the max input voltage is 5V, this would be 255 for 8 bit, or 65535 for 16 bit.

When I record an audio stream using something like Python, the audio is composed of the numbers that the ADC outputs, and those can vary depending on whether I used 8 bit or 16 bit encoding. Therefore, the FFT magnitude will vary too.

Please correct me if I'm wrong.

If you want to make quantitative rather than relative measurements, your theoretical procedure will not work, because it omits the required step of converting ADC values to voltage values.

The voltage output by the microphone does not depend on the number of bits used in the ADC encoding.
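As a sketch of that conversion step, assuming an unsigned ADC whose full-scale count corresponds to a 5V input as in the description above (the function name and the sample counts are made up for illustration):

```python
def counts_to_volts(count, bits, v_ref=5.0):
    """Map an unsigned ADC count to volts, assuming a 0..v_ref input range."""
    full_scale = (1 << bits) - 1   # 255 for 8 bit, 65535 for 16 bit
    return count * v_ref / full_scale

# The same input voltage seen by an 8-bit and a 16-bit converter
volts_8 = counts_to_volts(128, bits=8)
volts_16 = counts_to_volts(32896, bits=16)
print(volts_8, volts_16)
```

The raw counts differ by a factor of 257, but the voltages agree. Because the FFT is linear, converting the samples to volts (or at least to a fraction of full scale) before transforming removes the dependence on bit depth from the magnitudes. Audio libraries often deliver signed samples instead, in which case the scaling would need adjusting accordingly.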