
Topic: FFT of .wav File

M6Gpower

Hi, I want to convert a 10-second audio.wav file stored on an SD card into an FFT.
Is it also possible to store the FFT data on the SD card afterwards?

MarkT

Take the data, apply the FFT, store the data.

But you have provided no details for anything concrete...

Convert with what?  Store in what format?  What sample depth and rate?
[ I will NOT respond to personal messages, I WILL delete them, use the forum please ]

M6Gpower

#2
Nov 06, 2018, 03:11 pm Last Edit: Nov 06, 2018, 03:13 pm by M6Gpower
Quote
Take the data, apply the FFT, store the data.

But you have provided no details for anything concrete...

Convert with what?  Store in what format?  What sample depth and rate?
Some questions:
If I speak through a microphone, can I store the data as an FFT?
What kind of format will this be? A .txt file holding the binary data, or some kind of .fft file?
What are the sample depth and rate for human speech? Wikipedia says 12-80Hz, is that right? Multiplied by 2 that is 160Hz, right? And would the depth be 8-bit or 16-bit?

MarkT

#3
Nov 06, 2018, 03:27 pm Last Edit: Nov 06, 2018, 03:29 pm by MarkT
You'll have to learn about sample rates and depths if you want to run FFTs over sampled data...

Typical audio sample rates vary from 8kHz to 384kHz, sample depths 8, 12, 16, 24 bits.
The sample rate has to be more than twice the highest signal frequency of interest (which
is 3kHz for speech, 15kHz or more for music, hence 8kHz and 44.1 and 48kHz sampling rates being
very common).
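
A minimal sketch of the sample-rate rule above, written in Python for a PC rather than for the Arduino. The list of rates and the function names are illustrative assumptions, not part of any library.

```python
# A short, non-exhaustive list of common audio sample rates in Hz (assumed).
COMMON_RATES_HZ = [8000, 16000, 22050, 44100, 48000, 96000]


def min_sample_rate(highest_freq_hz):
    """Nyquist limit: the sample rate must exceed twice the highest frequency."""
    return 2 * highest_freq_hz


def pick_common_rate(highest_freq_hz):
    """Return the lowest listed rate that is strictly above the Nyquist limit."""
    limit = min_sample_rate(highest_freq_hz)
    for rate in COMMON_RATES_HZ:
        if rate > limit:
            return rate
    raise ValueError("no listed rate is high enough")


speech_rate = pick_common_rate(3000)    # speech band up to about 3 kHz
music_rate = pick_common_rate(15000)    # music band up to about 15 kHz
print(speech_rate, music_rate)
```

With these inputs the sketch lands on the 8kHz and 44.1kHz rates mentioned above.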

If you speak into a microphone a very small voltage comes out the microphone cable - typically to a
microphone amplifier, then you'd have to sample it, then do any further processing.

The format for your data is whatever you want it to be. FFTs output an array of complex numbers,
which will therefore probably mean arrays of floating point (although fixed point is also possible).
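
As a rough illustration of "an array of complex numbers stored as floating point", here is a Python sketch for a PC. The byte layout (little-endian float32 pairs, real part then imaginary part) is just one possible choice and the helper names are made up for this example. A naive DFT stands in for the FFT because it produces the same numbers.

```python
import cmath
import struct


def dft(samples):
    """Naive DFT, O(N^2); it yields the same values an FFT would."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def pack_bins(bins):
    """Pack complex bins as little-endian float32 pairs: re0, im0, re1, im1, ..."""
    flat = []
    for b in bins:
        flat.extend((b.real, b.imag))
    return struct.pack("<%df" % len(flat), *flat)


def unpack_bins(raw):
    """Inverse of pack_bins: rebuild the list of complex bins."""
    flat = struct.unpack("<%df" % (len(raw) // 4), raw)
    return [complex(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


samples = [0, 1, 0, -1, 0, 1, 0, -1]   # a sine with 2 cycles in 8 samples
bins = dft(samples)
raw = pack_bins(bins)                  # these bytes could be written to a file
restored = unpack_bins(raw)
print(len(raw), abs(bins[2]))
```

Each complex bin costs 8 bytes in this layout; a fixed-point layout would be smaller at the price of range and precision.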

Start looking at existing examples of FFT processing on Arduinos, there are libraries out there.

DVDdoug

I've never used FFT, but I've studied it a bit...

Before we get into that, the Audacity website has an Introduction To Digital Audio.

Quote
If I speak through a microphone, can I store the data as an FFT?
Recording on the Arduino is tricky. In theory digital recording is easy, but as far as I know only a few people have been successful with the Arduino. I think the problem is writing the data to the SD card fast enough that you're ready to read the next sample. Of course the timing is critical. (And the RAM is too small to store a useful amount of audio, so you need external memory.)

FFT on the Arduino is too slow to work continuously in real time. Typically, it samples the incoming audio for some number of samples. Then it stops sampling and runs the FFT before grabbing the next batch of samples. It's fast enough for a spectrum analyzer effect, but in reality you're not capturing/analyzing all of the audio.

Of course, if you are running FFT on a WAV file in memory you can "take all day" and you don't have to skip any data.

Quote
What kind of format will this be? A .txt file holding the binary data, or some kind of .fft file?
As far as I know there is no standard FFT file format. Typically, the FFT data file will be larger than the WAV file, but that depends on the resolution you choose.
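
To put rough numbers on the file-size remark, here is a small Python calculation. It assumes 10 seconds of 16-bit mono audio at 8kHz, non-overlapping 256-sample frames, and float32 output; all of those figures are assumptions for illustration.

```python
def wav_bytes(num_samples, bits_per_sample=16):
    """Size of the raw PCM payload for mono audio (file header not counted)."""
    return num_samples * bits_per_sample // 8


def fft_bytes(num_samples, frame_size=256, bytes_per_value=4, keep_complex=True):
    """Size of FFT output stored frame by frame, using non-overlapping frames."""
    frames = num_samples // frame_size
    bins = frame_size // 2 + 1           # real input: only N/2 + 1 bins are unique
    values_per_bin = 2 if keep_complex else 1
    return frames * bins * values_per_bin * bytes_per_value


n = 10 * 8000                            # 10 s at 8 kHz (assumed)
pcm_size = wav_bytes(n)
complex_size = fft_bytes(n)
magnitude_size = fft_bytes(n, keep_complex=False)
print(pcm_size, complex_size, magnitude_size)
```

Under these assumptions the complex float32 output is about twice the size of the audio, while magnitude-only output is about the same size.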

Grumpy_Mike

FFTs are done on a data set whose length is a power of two. So something like chunks of 128, 256, or 512 samples at a time.

So to take the FFT of 10 seconds of audio, you have to record onto an SD card, then take the samples out say 256 at a time, do the FFT and then save it back into another file. The FFT needs space to work things out and you can't make the frames too big on an Arduino.

What do you want to do with this FFT data?
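
The frame-by-frame procedure described above can be sketched in Python on a PC. This is only an illustration under assumptions: 16-bit mono WAV input, 256-sample non-overlapping frames, and an output file of float32 magnitudes with a made-up .fft extension. The test tone and helper names are hypothetical.

```python
import cmath
import math
import os
import struct
import tempfile
import wave

FRAME = 256  # samples per FFT frame (a power of two)


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def fft_wav_to_file(wav_path, out_path):
    """FFT a 16-bit mono WAV frame by frame; write float32 magnitudes per frame.

    Returns the number of frames written. An incomplete last chunk is dropped.
    """
    frames_written = 0
    with wave.open(wav_path, "rb") as wav, open(out_path, "wb") as out:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono WAV")
        while True:
            raw = wav.readframes(FRAME)
            if len(raw) < FRAME * 2:
                break
            samples = struct.unpack("<%dh" % FRAME, raw)
            mags = [abs(c) for c in fft(samples)[:FRAME // 2 + 1]]
            out.write(struct.pack("<%df" % len(mags), *mags))
            frames_written += 1
    return frames_written


def write_test_tone(path, freq_hz=1000, rate=8000, seconds=1):
    """Write a synthetic sine tone so the sketch runs without a real recording."""
    data = [int(10000 * math.sin(2 * math.pi * freq_hz * t / rate))
            for t in range(rate * seconds)]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(data), *data))


tmp_dir = tempfile.mkdtemp()
wav_path = os.path.join(tmp_dir, "audio.wav")
out_path = os.path.join(tmp_dir, "audio.fft")
write_test_tone(wav_path)
n_frames = fft_wav_to_file(wav_path, out_path)
print(n_frames, os.path.getsize(out_path))
```

Only one frame is held in memory at a time, which is the property that matters on a small microcontroller.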

MarkT

Strictly speaking you can compute a fast DFT on any sample count (even primes), but powers of two are much
the simplest and most efficient, and all the example code you'll find will assume a power of two.
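
A small Python sketch of that distinction, with function names chosen for this example. The direct DFT below works for any length, including 7, but it is O(N^2) and therefore not "fast"; the radix-2 FFT is fast but rejects anything that is not a power of two. Fast algorithms for other lengths (Bluestein's or Rader's, for instance) are not shown.

```python
import cmath


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def dft(x):
    """Direct DFT, O(N^2); valid for any length."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fft_radix2(x):
    """Radix-2 FFT, O(N log N); power-of-two lengths only."""
    n = len(x)
    if not is_power_of_two(n):
        raise ValueError("length %d is not a power of two" % n)
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


seven = dft([1, 2, 3, 4, 5, 6, 7])       # a prime length is fine for the DFT
eight = [1, 2, 3, 4, 5, 6, 7, 8]
max_diff = max(abs(a - b) for a, b in zip(dft(eight), fft_radix2(eight)))
print(abs(seven[0]), max_diff)
```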

M6Gpower

Quote
FFTs are done on a data set whose length is a power of two. So something like chunks of 128, 256, or 512 samples at a time.

So to take the FFT of 10 seconds of audio, you have to record onto an SD card, then take the samples out say 256 at a time, do the FFT and then save it back into another file. The FFT needs space to work things out and you can't make the frames too big on an Arduino.

What do you want to do with this FFT data?
Actually, I want the device to recognise the waveform of human speech and then search for the matching one on the SD card.

 

Grumpy_Mike

Quote
I want the device to recognise the waveform of human speech and then search for the matching one on the SD card.
Ah, speech recognition. I think you should be prepared for disappointment here.

No two speech samples will produce identical transforms, so the searching you do will have to be fuzzy. That will at best give you the probability of a match. And this is only if you can synchronise the start of your recording with the start of the recording of the template you are matching against.

If speech recognition were that easy they would have cracked it by now.
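
To show what a "fuzzy" comparison could look like, here is a Python sketch that scores two magnitude spectra with cosine similarity instead of testing them for equality. The spectra and the threshold are invented for illustration, and a score like this is far short of real speech recognition.

```python
import math


def cosine_similarity(a, b):
    """Score the shape of two magnitude spectra: 1.0 means identical shape.

    For non-negative magnitudes the result lies between 0.0 and 1.0.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up magnitude spectra, one value per frequency bin.
template = [0.1, 0.9, 4.0, 1.2, 0.2, 0.1]
louder = [0.3, 1.7, 8.3, 2.2, 0.5, 0.1]    # similar shape, different level
other = [3.9, 0.2, 0.1, 0.3, 2.5, 1.0]     # energy in different bins

THRESHOLD = 0.9                             # hypothetical cut-off
score_same = cosine_similarity(template, louder)
score_diff = cosine_similarity(template, other)
print(score_same > THRESHOLD, score_diff > THRESHOLD)
```

The similar pair scores close to 1.0 without ever reaching it, which is the point: the result is a degree of similarity, not a yes/no identity.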

M6Gpower

#9
Nov 14, 2018, 01:22 pm Last Edit: Nov 14, 2018, 01:52 pm by M6Gpower
Quote
Ah, speech recognition. I think you should be prepared for disappointment here.

No two speech samples will produce identical transforms, so the searching you do will have to be fuzzy. That will at best give you the probability of a match. And this is only if you can synchronise the start of your recording with the start of the recording of the template you are matching against.

If speech recognition were that easy they would have cracked it by now.
So let's say I start both recordings with a "peep" sound and after that record for 5 sec. Is it then possible to find a match?

I would have to convert both recordings to FFTs and compare them. Would it work?

MarkT

Voice recognition needs a spectrogram, i.e. a plot of frequencies against time. A single FFT doesn't give you
the variation with time, just the total energy at each frequency.
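
A minimal Python sketch of the spectrogram idea: take one transform per short frame and keep the results in time order. The signal is synthetic (500Hz followed by 1000Hz at an assumed 8kHz sample rate), the frames do not overlap, and a naive DFT is used for brevity, so this is an illustration rather than a practical implementation.

```python
import cmath
import math

RATE = 8000   # sample rate in Hz (assumed)
FRAME = 64    # samples per frame


def magnitudes(x):
    """Magnitude of DFT bins 0..N/2 for one frame (naive O(N^2) DFT)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def spectrogram(signal, frame=FRAME):
    """One row of magnitudes per non-overlapping frame, in time order."""
    return [magnitudes(signal[i:i + frame])
            for i in range(0, len(signal) - frame + 1, frame)]


def peak_hz(mags, frame=FRAME):
    """Frequency of the strongest bin in one spectrogram row."""
    k = max(range(len(mags)), key=lambda i: mags[i])
    return k * RATE / frame


def tone(freq_hz, count):
    """A sine tone of `count` samples."""
    return [math.sin(2 * math.pi * freq_hz * t / RATE) for t in range(count)]


signal = tone(500, FRAME) + tone(1000, FRAME)   # the pitch changes halfway
spec = spectrogram(signal)
peaks = [peak_hz(row) for row in spec]
print(peaks)
```

The per-frame peaks show when each tone occurred, which a single transform over the whole signal cannot tell you.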

Grumpy_Mike

#11
Nov 14, 2018, 08:06 pm Last Edit: Nov 14, 2018, 08:12 pm by Grumpy_Mike
Quote
would it work?
No.

The amount of sound you can take an FFT of is on the order of milliseconds (256 samples at 8kHz is 32ms). The exact number depends on the sample rate and the frame size. Taking a sequence of samples, then taking the FFT, and then repeating will leave holes in the sound recorded.
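
How much sound one FFT frame covers follows directly from the frame size and the sample rate. A short Python calculation, with the frame sizes and rates picked only as examples:

```python
def frame_duration_ms(frame_size, sample_rate_hz):
    """Length of audio covered by one FFT frame, in milliseconds."""
    return 1000.0 * frame_size / sample_rate_hz


def bin_width_hz(frame_size, sample_rate_hz):
    """Frequency spacing between adjacent FFT bins."""
    return sample_rate_hz / frame_size


for size, rate in [(128, 8000), (256, 8000), (256, 44100), (512, 44100)]:
    print(size, rate,
          round(frame_duration_ms(size, rate), 2),
          round(bin_width_hz(size, rate), 2))
```

Longer frames give finer frequency resolution but cover more time per transform and need more memory.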

If you want to look at an FFT then get the free app called Audacity, that will run on a laptop and will give you a feel of the repeatability of an FFT and sound recording.

M6Gpower

Quote
No.

The amount of sound you can take an FFT of is on the order of milliseconds (256 samples at 8kHz is 32ms). The exact number depends on the sample rate and the frame size. Taking a sequence of samples, then taking the FFT, and then repeating will leave holes in the sound recorded.

If you want to look at an FFT then get the free app called Audacity, that will run on a laptop and will give you a feel of the repeatability of an FFT and sound recording.
Is there any other solution to compare two sound files, or to compare speech from a microphone with a sound file?

Grumpy_Mike

Speech recognition is not practical on a small processor like an Arduino Uno, if that is what your question means.

Sure you can compare them but you will never get the result that the two are the same.
