Voice recorder: can't create a clear wave file

Hi,

I am trying to make a voice recorder with my Arduino Uno.

Here is the hardware I use:

  • Adafruit SD card reader

  • SD card in FAT32

  • Adafruit Max4466 mic with adjustable gain

  • UNO board

I power the circuit from the 5V pin of the Uno board.

I ran the Recording example from the TMRpcm library, and a wave file was created on my SD card, but the sound is really awful: I can barely hear myself whistle, and there is a really loud noise that is impossible to remove with Audacity.

I don't understand why, because my wiring is correct and all my components are new except my SD card, which is a couple of years old.

I know this is surely a dumb sampling rate problem or something, but I can't figure out which one...

Could someone help me?
Thanks!

I've never used TMRpcm and I didn't realize it has a recording function…

I know this is surely a dumb sampling rate problem or something, but I can't figure out which one...

Is it 8-bit audio? That's the most foolproof format.

The bytes might be out of order if it's 16-bit audio, or if the bit depth is getting "miscommunicated" the bytes could be scrambled. 8-bit WAV files use unsigned integers, whereas everything else uses signed integers, so that's another way things can get fouled up. Getting the byte order wrong (mixing up the high and low bytes), or reading 16-bit data as 8-bit or vice versa, can totally foul things up.

(If the sample rate is off, it will simply play at the wrong speed.)

This is a WAV file, right? Not raw audio data?

You can check the file with MediaInfo to see if the WAV header shows the format that you expect. And/or you can "look at" the WAV header with a hex editor.

The WAV file header might give you a clue, but if the header is valid, there's no easy way to know whether the header information matches the actual audio data... And there's no easy way to check the audio data (other than listening), because those are just bytes and any byte value could be valid audio data.
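As a rough illustration of the header check described above, here is a minimal Python sketch that unpacks a 44-byte PCM WAV header. It assumes the simplest layout (RIFF, "fmt " and "data" chunks back to back, no extra chunks); the function name parse_wav_header and the synthetic example header are made up for illustration and do not come from any library.

```python
import struct

# Assumed layout: canonical 44-byte PCM header, all fields little-endian.
HEADER_FORMAT = "<4sI4s4sIHHIIHH4sI"

def parse_wav_header(header: bytes) -> dict:
    """Decode the first 44 bytes of a simple PCM WAV file."""
    if len(header) < 44:
        raise ValueError("need at least 44 bytes")
    (riff, riff_size, wave_id, fmt_id, fmt_size, audio_format, channels,
     sample_rate, byte_rate, block_align, bits_per_sample,
     data_id, data_size) = struct.unpack(HEADER_FORMAT, header[:44])
    if (riff, wave_id, fmt_id, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not a canonical 44-byte WAV header")
    return {
        "audio_format": audio_format,      # 1 means uncompressed PCM
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "bits_per_sample": bits_per_sample,
        "data_size": data_size,
    }

# Synthetic header for one second of 16 kHz, 8-bit mono audio (example values).
example = struct.pack(HEADER_FORMAT, b"RIFF", 36 + 16000, b"WAVE", b"fmt ",
                      16, 1, 1, 16000, 16000, 1, 8, b"data", 16000)
info = parse_wav_header(example)
print(info)
```

For a real file, the first 44 bytes read in binary mode could be passed in instead of the synthetic header. As noted above, a valid header only tells you what the file claims to contain, not whether the data actually matches.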

DVDdoug:
I've never used TMRpcm and I didn't realize it has a recording function…

It does, but since it's supposedly an "experimental" feature, it is disabled by default.

DVDdoug:
Is it 8-bit audio?

Yes. I don't think an AVR microcontroller is capable of stereo or 16-bit recordings with its built-in ADC; even the processing power is not enough for a decent sampling rate at this format. But unsigned 8-bit mono, even at 22050 Hz is possible.

DVDdoug:
This is a WAV file, right? Not raw audio data?

It should create the standard 44-byte RIFF/WAVE header (no ID3 tags and other metadata); in fact, the library keeps track of how many samples were recorded, in order to update the "RIFF" and "data" values accordingly during finalization.

DVDdoug:
And there's no easy way to check the audio data (other than listening)

So we would like to hear what's going on; please share the resulting audio file as it is.
If you are going to attach it to your next post, keep in mind that the file size limit is 1 MB, so don't try to upload a long recording. Also, WAV files aren't directly allowed here, but you can get around this by compressing the file into a zip archive.

DVDdoug, thanks for your response.

Here is what MediaInfo tells me about my file: 16 kHz, 128 kbps, 8 bits, unsigned PCM.

On paper, a really normal file, right?

I also checked with a hex editor, and I can confirm it has a WAV header.

Is there any reason to think it's the SD card that screwed the whole thing up?

Lucario448:
So we would like to hear what's going on; please share the resulting audio file as it is.
If you are going to attach it to your next post, keep in mind that the file size limit is 1 MB, so don't try to upload a long recording. Also, WAV files aren't directly allowed here, but you can get around this by compressing the file into a zip archive.

My last piece of experimental music:

TEST1.WAV.zip (172 KB)

Here is what MediaInfo tells me about my file: 16 kHz, 128 kbps, 8 bits, unsigned PCM.

On paper, a really normal file, right?

Yes, that's good if you're making an 8-bit file. So it's the data that's messed-up.

How about the file size and the playing time? Since it's 8-bit audio at 16kHz there should be 16k bytes per second. And, does the playing-time match the recording time?
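To make that size-versus-time check concrete, here is a small Python sketch of the arithmetic. The function names are hypothetical, and the 160000-byte data size is just an example value, not a measurement from the attached file.

```python
def pcm_byte_rate(sample_rate_hz: int, channels: int, bits_per_sample: int) -> int:
    """Bytes of audio data produced per second of uncompressed PCM."""
    return sample_rate_hz * channels * bits_per_sample // 8

def expected_duration_s(data_bytes: int, sample_rate_hz: int,
                        channels: int = 1, bits_per_sample: int = 8) -> float:
    """Playing time implied by the size of the data chunk."""
    return data_bytes / pcm_byte_rate(sample_rate_hz, channels, bits_per_sample)

rate = pcm_byte_rate(16000, 1, 8)              # 8-bit mono at 16 kHz
kbps = rate * 8 / 1000                         # bit rate as MediaInfo would show it
duration = expected_duration_s(160000, 16000)  # hypothetical 160000-byte recording
print(rate, kbps, duration)
```

With 8-bit mono at 16 kHz this gives 16000 bytes per second, which is the same thing as the 128 kbps that MediaInfo reported. If the real recording time is very different from the computed duration, samples are being lost or the actual sample rate is not what the header says.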

Is there any reason to think it's the SD card that screwed the whole thing up?

No, I don't think so. You can read/write the header so you should be able to read/write beyond the header.

Somewhere, I read about write-speed limits with SD cards but I don't think that's true. I've got a video recorder that records to an SD card.

Yes. I don't think an AVR microcontroller is capable of stereo or 16-bit recordings with its built-in ADC

The ADC is 10-bits, so you could write it into a 16-bit word, and then bit-shift to get full-volume.

Stereo is possible, but since there is one shared ADC you can't sample left and right at exactly the same time. And the ADC speed limitation is also "shared", so you'd have to cut the sample rate in half.

even the processing power is not enough for a decent sampling rate at this format.

I don't think "processing" is the limitation/bottleneck.

P.S.
You know... In theory recording is "easy", especially at 8-bits... You just read the data at a known sample rate and write it to a file. (With a 10-bit ADC you have to right-shift two bits before throwing-away the high-byte of a 16-bit integer.)
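The bit-shifting mentioned above can be sketched in a few lines of Python (on the Arduino itself this would be C++, but the arithmetic is the same). The function names are made up, and the signed 16-bit variant with its mid-scale offset of 512 is my assumption about how one would map the ADC range onto a signed WAV sample.

```python
def adc10_to_u8(adc_value: int) -> int:
    """Keep the top 8 bits of a 10-bit reading (unsigned 8-bit WAV sample)."""
    if not 0 <= adc_value <= 1023:
        raise ValueError("expected a 10-bit ADC value")
    return adc_value >> 2

def adc10_to_s16(adc_value: int) -> int:
    """Centre a 10-bit reading on zero and scale it to the signed 16-bit range.

    Assumption: mid-scale (512) is treated as silence.
    """
    if not 0 <= adc_value <= 1023:
        raise ValueError("expected a 10-bit ADC value")
    return (adc_value - 512) << 6

for reading in (0, 512, 1023):
    print(reading, adc10_to_u8(reading), adc10_to_s16(reading))
```

The 16-bit version only stretches 10 bits of information over a 16-bit range; it gets full volume but does not add resolution.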

The only tricky thing is writing the header, but that's not too bad if you're always writing the same format. And, you do have to go-back and write the data-chunk size when you stop recording.

...Playback is not so easy when you don't have a DAC. :wink:

DVDdoug:
Stereo is possible, but since there is one shared ADC you can't sample left and right at exactly the same time.

At typical hi-fi sample rates this is unimportant, as the time discrepancy corresponds to a path difference of a few mm;
most I2S DACs and ADCs sample L and R interleaved, I think.

At typical hi-fi sample rates this is unimportant, as the time discrepancy corresponds to a path difference of a few mm;
most I2S DACs and ADCs sample L and R interleaved, I think.

True, it's not a BIG deal, but normally the 2 (or more) channels are sampled on the same clock edge. And it's more like a couple of cm at "Arduino sample clock rates".

The data in a WAV file is interleaved, and it may be sent serially over I2S (or USB, S/PDIF, HDMI, or whatever), but the channels are re-synchronized and clocked out together by the DAC.
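For anyone wondering where the "few mm" and "couple of cm" figures come from, here is a back-of-the-envelope Python sketch. It assumes a speed of sound of 343 m/s and that the two channels are sampled half a per-channel sample period apart; both are my assumptions, and the 8 kHz figure is just the 16 kHz rate from this thread split between two channels.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature

def skew_distance_mm(skew_s: float) -> float:
    """Distance sound travels during the timing offset between channels."""
    return SPEED_OF_SOUND_M_S * skew_s * 1000.0

hifi_mm = skew_distance_mm(0.5 / 44100)    # half a sample period at 44.1 kHz
arduino_mm = skew_distance_mm(0.5 / 8000)  # half a sample period at 8 kHz per channel
print(round(hifi_mm, 1), round(arduino_mm, 1))
```

Under those assumptions the offset is equivalent to moving one microphone by about 4 mm at 44.1 kHz, and by about 2 cm at 8 kHz per channel.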

Hillcres-Hellio:
My last piece of experimental music:

Nice whistlings ha ha :smiley:

About the recording: the audio actually sounds like it is supposed to (as far as the header and encoding are concerned); the annoyance comes from a loud pulsing noise that mixes with the microphone's signal.
That pulsing noise comes from the SCK line of the SPI port, mostly when it has to update data blocks (otherwise a 4 or 8 MHz clock frequency should not be audible to any non-ultrasonic equipment or human being).

The solution sounds bizarre: you have to kind of isolate the ground line for the microphone's output (or the SD card). I say "kind of" because it does not mean literally cutting the ground connections.
I'll let someone else give you more details about this solution, since it is hard for me to explain (to be honest, I've never applied it to a similar problem I'm having, because I don't actually understand how to do it either).

DVDdoug:
Somewhere, I read about write-speed limits with SD cards but I don't think that's true. I've got a video recorder that records to an SD card.

The limit comes mostly from the microcontroller's available RAM and the SPI clock speed. When using the native SDIO protocol rather than ("legacy") SPI, it is less of an issue.

In a test I ran long ago, I noticed that an ATmega328P at 16 MHz can write at approx. 53 KB/s, with a for loop being the only known significant overhead. Maybe I could squeeze out 1 or 2 KB/s more by disabling the timer0 overflow interrupt (the one that makes delay(), millis() and micros() possible).
You may judge whether a bit less than 53 KB/s is enough for the application.

DVDdoug:
The ADC is 10-bits, so you could write it into a 16-bit word, and then bit-shift to get full-volume.

I know it's the only way to record the whole resolution, but it is not true 16-bit audio nonetheless.

DVDdoug:
Stereo is possible, but since there is one shared ADC you can't sample left and right at exactly the same time. And the ADC speed limitation is also "shared", so you'd have to cut the sample rate in half.

Aside from the sampling rate, the timing offset between the two channels is, for me, the deal-breaker for stereo.

DVDdoug:
I don't think "processing" is the limitation/bottleneck.

However:

DVDdoug:
8-bit WAV files use unsigned integers, whereas everything else uses signed integers

Remember that the ADC also spits out unsigned values, and converting to two's-complement (signed integer) encoding (by subtraction) takes up several CPU clock cycles; even more when it's 8-bits "wide".
Endianness is not a problem; the RIFF/WAVE standard always expects little-endian (LSB first), and the Arduino also stores multi-byte variables (int and larger) this way.

DVDdoug:
P.S.
You know... In theory recording is "easy", especially at 8-bits... You just read the data at a known sample rate and write it to a file. (With a 10-bit ADC you have to right-shift two bits before throwing-away the high-byte of a 16-bit integer.)

The only tricky thing is writing the header, but that's not too bad if you're always writing the same format. And, you do have to go-back and write the data-chunk size when you stop recording.

This is what the library does. The only aspect you missed is the double buffering necessary to keep sampling even while writing to the card.

TMRpcm does even more complicated stuff to keep the process out of the main code: it uses the input capture register (ICR) to define the sampling rate (this triggers the sampler ISR), and one of the output compare registers (OCR) to trigger the ISR that writes to the file when one of the buffers is full.
To avoid delays in the sampling, this ISR re-enables only the overflow interrupt (triggered by the ICR), hoping that the full buffer gets completely written before the other one fills up as well. This is why it is so important not to choose a very high sample rate.
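To show the double-buffering idea in isolation, here is a generic ping-pong buffer sketched in Python. This is not TMRpcm's actual code (which runs in C++ inside timer ISRs); the class and method names are invented, and the overrun counter is only there to show what happens when the writer falls behind.

```python
class DoubleBuffer:
    """Two fixed-size buffers: one is filled while the other is written out."""

    def __init__(self, size: int):
        self.size = size
        self.buffers = [bytearray(size), bytearray(size)]
        self.active = 0      # index of the buffer the sampler is filling
        self.fill = 0        # samples stored in the active buffer so far
        self.ready = None    # index of a full buffer waiting for the writer
        self.overruns = 0    # times a buffer filled before the other was written

    def push(self, sample: int) -> None:
        """Store one sample (the sampling ISR's job in a real sketch)."""
        self.buffers[self.active][self.fill] = sample
        self.fill += 1
        if self.fill == self.size:
            if self.ready is not None:
                self.overruns += 1   # the previous block was never written out
            self.ready = self.active
            self.active ^= 1         # swap buffers
            self.fill = 0

    def take_ready(self):
        """Return a copy of the full buffer, or None if nothing is pending."""
        if self.ready is None:
            return None
        block = bytes(self.buffers[self.ready])
        self.ready = None
        return block

db = DoubleBuffer(4)
written = bytearray()
for sample in range(10):
    db.push(sample)
    block = db.take_ready()   # stands in for the slow SD card write
    if block is not None:
        written += block
print(list(written), db.overruns)
```

In this toy run the writer always keeps up, so no overrun is counted. If take_ready() were called less often than once per full buffer, the sampler would start overwriting data that had not been saved yet, which is the situation a lower sample rate or larger buffers are meant to avoid.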

Something similar happens during playback, although buffer overruns are less likely since reading is faster than writing. 44100 Hz playback is still not possible, but this time you can blame the external oscillator: with a 16 MHz clock and 8-bit PWM resolution, the maximum frequency it can achieve is 62.5 kHz (fast PWM). For 44 kHz playback you need close to 100 kHz of PWM at least, and so a faster microcontroller or a real DAC (a resistor ladder on pins 0 to 7, aka port D, should work too).

Yeah, not an intuitive explanation for beginners; but... this is the "magic" behind the library nonetheless.
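The 62.5 kHz figure quoted above follows from the usual fast-PWM relation (timer clock divided by the number of counter steps). A short Python sketch of that arithmetic, with a made-up function name and assuming a prescaler of 1:

```python
def fast_pwm_frequency_hz(cpu_hz: int, resolution_bits: int, prescaler: int = 1) -> float:
    """PWM carrier frequency when the counter runs through 2**resolution_bits steps."""
    return cpu_hz / (prescaler * (1 << resolution_bits))

carrier_hz = fast_pwm_frequency_hz(16_000_000, 8)   # 16 MHz clock, 8-bit resolution
cycles_per_sample = carrier_hz / 44100               # PWM periods per audio sample
print(carrier_hz, round(cycles_per_sample, 2))
```

At 44100 Hz there would be fewer than one and a half PWM periods per sample, which is why a higher carrier frequency or a real DAC is suggested for that rate.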

DVDdoug:
...Playback is not so easy when you don't have a DAC. :wink:

Above a certain sampling rate that's true; otherwise PWM is a feasible alternative.

DVDdoug:
And, it's more like a couple cm's at "Arduino sample clock rates".

Maybe the "drag" becomes larger if you consider how the ADC in an AVR works. You know, successive approximation takes a while.

Hey Lucario, that's not very intuitive for me, but thanks anyway, haha. What I've been asking myself is: why does it happen with my specific circuit and components, and not with others'?

What do you mean by isolating the ground pins?

Hillcres-Hellio:
My last piece of experimental music:

Sounds like buffer overruns to me, so regular discontinuities - otherwise the audio is clear (for 8 bit that is).

Anyway, I popped the file into matplotlib in Python to see what the values looked like, and it's interesting:

The samples are regularly abruptly forced to 00, then spend a while wandering in the range FE-FF-00-01
or so, before smoothly slewing back to the actual audio data.

The wandering about 00-FF suggests a small signed value, yet the main signal is clearly an unsigned
sample waveform, yet the abrupt changes suggest simple buffering issues...

Perhaps there's some interaction with the analog input pin from something?

Or the adjustable gain is doing something odd (and has a DC offset)?

MarkT:
The wandering about 00-FF suggests a small signed value, yet the main signal is clearly an unsigned
sample waveform, yet the abrupt changes suggest simple buffering issues...

Are you talking about the buffer of the Atmega 328p?

MarkT:
Perhaps there's some interaction with the analog input pin from something?

I'm using only one analog pin. I tried changing it; all 5 analog pins give me this crappy music.

MarkT:
Or the adjustable gain is doing something odd (and has a DC offset)?

I don't think so, because with another mic (KY-something) the result is the same...

Hillcres-Hellio:
Are you talking about the buffer of the Atmega 328p?

The ATmega328P doesn't have a buffer. There's a software buffer in the SD library for writing 512-byte
blocks to the filesystem, and if you look at that graph it's obvious the periodicity is 512 samples.
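If you want to check that periodicity without eyeballing a plot, something like the following Python sketch would do it. The function names and thresholds are arbitrary choices of mine, and the data here is synthetic (a flat signal at 128 with a dropout to 0 every 512 samples), not the attached recording.

```python
def dropout_starts(samples, floor=1, min_jump=32, min_gap=64):
    """Indices where the signal falls abruptly to (near) zero.

    min_gap skips further hits shortly after a detected dropout, e.g. when
    the value wanders between 0xFF and 0x00 before recovering.
    """
    starts = []
    for i in range(1, len(samples)):
        abrupt = samples[i] <= floor and samples[i - 1] - samples[i] >= min_jump
        if abrupt and (not starts or i - starts[-1] >= min_gap):
            starts.append(i)
    return starts

def spacings(indices):
    """Distances between consecutive detected dropouts."""
    return [b - a for a, b in zip(indices, indices[1:])]

# Synthetic unsigned 8-bit signal with a 20-sample dropout every 512 samples.
samples = bytearray([128] * 2048)
for start in (512, 1024, 1536):
    samples[start:start + 20] = bytes(20)

starts = dropout_starts(samples)
print(starts, spacings(starts))
```

For a real recording, the sample bytes could be loaded with Python's standard wave module and passed to the same functions; a spacing that keeps coming out as 512 would point at the 512-byte block writes.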

I'm using only one analog pin. I tried changing it; all 5 analog pins give me this crappy music.

Doesn't matter which pin, something may be interacting with the signal that goes to the pin...
[ actually I've figured it out with a little more thinking ]

I don't think so, because with another mic (KY-something) the result is the same...

My theory is that when the SD library you are using writes to the SDcard the power supplies are
drooping, causing the signal to go haywire for a bit.

My suspicion is that you are using the 3.3V pin on the Uno to power the SDcard - you may need a
separate 3.3V regulator to supply it, rated at 0.2A or more, to ensure the SDcard gets the power
it needs for writes and erases.

Even if that's not the problem, you may have a ground-loop in your microphone signal handling
(before amplification) which causes the droop on the power supply to couple to the small signal
from the microphone and saturate the preamp.

You need to post your circuit, preferably as a schematic and as a photo, so it's possible to figure
out if this sort of thing is happening.

You could try adding some electrolytic decoupling to the 3.3V supply to the SDcard reader.

Hillcres-Hellio:
why does it happen with my specific circuit and components, and not with others'?

For example?

Even during playback that noise leaks into the output, though not as loudly for some reason.

Hillcres-Hellio:
What do you mean by isolating the ground pins?

As I said, it involves some filtering that, frankly, I don't understand. But such a technique is applied in high-quality PC sound cards, where no switching noise "leaks" out at all.

I consider the ground line more or less the culprit, since the outputs are usually isolated by the DAC chip, but the ground connects everything in the circuit directly.

MarkT:
The wandering about 00-FF suggests a small signed value, yet the main signal is clearly an unsigned
sample waveform, yet the abrupt changes suggest simple buffering issues...

By importing the audio file as "raw data" in Audacity (and so ignoring any headers), I've confirmed that it is indeed encoded as unsigned 8-bit mono at 16 kHz.

To my knowledge of this library, if an overrun ever occurred, the recorded waveform would be chopped up rather than swinging between the extreme values. The ISRs are so time-critical that they have no room for validation; everything has to be assumed, because any processing eats up precious CPU time.

If it is a buffer overrun, try lowering the sampling rate (to 8000 Hz) and/or increasing the buffer sizes (up to 254). Keep in mind that increasing the buffer sizes to the maximum will make the sketch take up almost 3/4 of the available RAM; that is fine only if the Arduino is 100% devoted to audio recording/playback (with no extra libraries or modules to interact with, e.g. displays).

If the problem still persists, then it is definitely signal noise (or what MarkT said).
It's somehow mixing with the microphone's output and being picked up by the amplifier, thus making it clip in the recording.

Hi Mark, thanks a lot for your response.

Here is a schematic of my circuit:

By electrolytic decoupling, do you mean a capacitor? If so, how many µF? (My circuit actually runs on 5V.)

Hmm, interesting, that layout seems pretty good, no ground loop. I guess the 5V rail may be dipping?
I presume the SDcard breakout you have has its own 3.3V regulator?

Yes, try 100uF or more across the supply on the SDcard module itself (decoupling is capacitance;
electrolytic provides bulk values of capacitance rather than dealing with very fast transients, and
is usually what makes a difference to audio).

Also you might want to try adding about 10 to 20 ohms in line with the 5V supply to the microphone
module and add electrolytic decoupling there too.

The SD card breakout uses 5V; it doesn't work on 3.3V.

I will have to go get the capacitor and resistor and try what you suggest here.

When I'm done I will let you know the result.

Hi MarkT,

I'm not sure I understand the terms: when you say "across" the supply of the SD reader, does it mean between 3.3V and ground?

And "in line" means on the 3.3V line which goes into the mic, right?

Otherwise, great news: I made it work once with my Arduino Uno. I changed some wires and restarted everything from the beginning, on a breadboard (before, everything was soldered).

But now I'm trying to make it work on a Feather from Adafruit, and the same old problem comes back again. I don't know why...

Azeez, if you're reading this, try doing what I did, and also use two different breadboards for the mic and the SD card reader. It might be a dumb problem like this.

Hillcres-Hellio:
Hi MarkT,

I'm not sure I understand the terms: when you say "across" the supply of the SD reader, does it mean between 3.3V and ground?

yes

And "in line" means on the 3.3V line which goes into the mic, right?

yes

Otherwise, great news: I made it work once with my Arduino Uno. I changed some wires and restarted everything from the beginning, on a breadboard (before, everything was soldered).

But now I'm trying to make it work on a Feather from Adafruit, and the same old problem comes back again. I don't know why...

Azeez, if you're reading this, try doing what I did, and also use two different breadboards for the mic and the SD card reader. It might be a dumb problem like this.

Do you have a ground loop? That's not going to work with microphone-level signals; you must have star grounds.

MarkT:
Do you have a ground loop? That's not going to work with microphone-level signals; you must have star grounds.

As on the Uno, there are several ground pins on the Feather 32u4, and the overall architecture is the same. Do you mean there is a star ground on Uno boards and not on Feathers?