Record sound in .wav format with Arduino MKR zero

Are you playing the WAV back on your computer, so you know it's a recording problem, and not a playback problem?

Noise is an analog problem. What's the nature of the noise? If it's not acoustic room noise (which any mic will pick up), then it's probably getting in from the power supply. You can try powering the microphone module from a battery; if the noise goes away, you know it's coming from the power supply.

Or, if you're getting power-line hum/buzz, it could be electromagnetic interference from the power lines all around you. If moving the mic module around or putting your hand near it changes the hum, you may need to mount the mic in a grounded metal case.

Noise should be independent of the signal, so it's more noticeable when there is no signal or only a very quiet one, and it usually gets masked (drowned out) by a strong signal.

Distortion is damage or corruption of the signal and that can be an analog or digital problem. (There is no distortion when there is no signal.)

The low level is probably normal. I don't see a gain/sensitivity control on the board. And, looking at the datasheet for the mic, it has an upper limit of 120dB SPL, so you won't get "full volume" unless you're in the front row of a rock concert. You could easily be 40dB or more below that, so you'll probably need to amplify (digitally*) during recording or after recording.

If you add gain during recording, you'll generally need some kind of "VU" meter, and if you want the best quality you'll have to leave some headroom to prevent clipping. So, depending on what you're doing, you may need to amplify after recording instead.
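As a rough sketch of the "VU" meter idea, here is one way to measure the peak level of a block of samples in dB relative to full scale (dBFS). It assumes 24-bit signed samples stored in int32_t; the function name peakDbfs is made up for illustration and isn't part of any Arduino library.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: peak level of a block in dBFS.
// Assumes 24-bit signed samples held in int32_t (full scale = 2^23).
// Returns -infinity for a block of pure silence.
double peakDbfs(const int32_t *samples, size_t count) {
    const double fullScale = 8388608.0;  // 2^23
    int64_t peak = 0;
    for (size_t i = 0; i < count; ++i) {
        int64_t magnitude = samples[i];
        if (magnitude < 0) magnitude = -magnitude;  // absolute value
        if (magnitude > peak) peak = magnitude;
    }
    if (peak == 0) return -INFINITY;
    return 20.0 * std::log10(static_cast<double>(peak) / fullScale);
}
```

A reading close to 0 dBFS means you're about to clip. How much headroom to leave below that is a judgment call that depends on how unpredictable the sound source is.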

Of course when you amplify, you'll also amplify the noise...

*Amplification is multiplication... 20dB is a factor of 10, so if you want to increase by +20dB you simply multiply all of the samples by 10. 40dB is a factor of 100. +6dB is (very nearly) a factor of two, so if you left-shift all of the bits in the sample one place, that's +6dB. (Just remember you're bit-shifting words (24 bits, I believe), not bytes.)
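The dB arithmetic can be sketched like this. The factor is 10^(dB/20) for amplitude; the function names are just illustrative, and samples are assumed to be held in int32_t.

```cpp
#include <cmath>
#include <cstdint>

// Convert a gain in dB to a linear amplitude factor: factor = 10^(dB / 20).
double dbToFactor(double gainDb) {
    return std::pow(10.0, gainDb / 20.0);
}

// Roughly +6 dB per step: multiply the sample by 2^steps.
// Multiplication is used instead of << because left-shifting a negative
// signed value is not well defined in older C++ standards.
// The caller must make sure the result still fits the sample's bit depth.
int32_t doubleGain(int32_t sample, unsigned steps) {
    return sample * (int32_t(1) << steps);
}
```

For example, dbToFactor(20.0) gives 10 and dbToFactor(40.0) gives 100, while dbToFactor(6.0) comes out just under 2, which is why one bit-shift is "about" 6dB rather than exactly 6dB.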

There's one important thing to watch out for when amplifying... If the result exceeds your bit depth (if you have 24-bit samples and the result needs 25 bits) you'll overflow and lose the most-significant bit(s). That creates nasty-nasty distortion (much worse than clipping). The proper way to handle that is to clip the data (write the maximum value) and then notify the user that it's clipping so they can turn it down or adjust the gain down, etc.
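Here is a minimal sketch of clipping instead of overflowing, again assuming 24-bit signed samples stored in int32_t. The names (applyGainClipped, kMax24, kMin24) are made up for this example. The multiply is done in 64 bits so the intermediate result can't wrap around before it's checked.

```cpp
#include <cstdint>

constexpr int32_t kMax24 = 8388607;   //  2^23 - 1, largest 24-bit sample
constexpr int32_t kMin24 = -8388608;  // -2^23, smallest 24-bit sample

// Multiply a sample by an integer gain and clip to the 24-bit range.
// Sets 'clipped' to true when the result had to be limited, so the
// caller can warn the user; it is never reset to false here.
int32_t applyGainClipped(int32_t sample, int32_t gain, bool &clipped) {
    int64_t wide = static_cast<int64_t>(sample) * gain;
    if (wide > kMax24) {
        clipped = true;
        return kMax24;
    }
    if (wide < kMin24) {
        clipped = true;
        return kMin24;
    }
    return static_cast<int32_t>(wide);
}
```

The flag is "sticky" on purpose: you can run a whole buffer through the function and check once at the end whether anything clipped, then light an LED or print a warning.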

In the real world, most DSP is done in floating-point, which makes all of this a lot easier... Eventually, you need integers for the DAC or "regular" WAV files, but floating-point makes the math/processing easier. (But if you don't have a floating-point processor, you probably don't have the processing power to use floating-point in real time.)