I'm not an audio expert either. For a low-cost speaker with an enclosure, you might look at the X-mini II speaker system. It sells on eBay for about $10 USD delivered. I am using one on a project now.
My circuit already has an audio amplifier, so I removed the X-mini's internal amplifier and rechargeable battery and connected my circuit directly to its 4 ohm, 40mm speaker.
These speakers sound real nice http://www.mpja.com/4-Ohm-Mini-Speaker/productinfo/14618%20SP
I drive them with an N-channel MOSFET and a 12V source - high sensitivity, so it's loud too!
If you're doing something other than making Arduino tone() output louder, you will probably want to bias it differently than I did.
The easiest thing to use would be any computer speaker. Computer speakers are active (aka "powered") so you don't need a separate amplifier.
"...that will play back the human voice as realistically as possible."
That could mean different things to different people... Regular cheapo computer speakers may be good enough, or you might want some nice Hi-Fi speakers and an amplifier for very realistic, high-quality sound.
These speakers sound real nice http://www.mpja.com/4-Ohm-Mini-Speaker/productinfo/14618%20SP
I drive them with an N-channel MOSFET and a 12V source - high sensitivity, so it's loud too!
If you're doing something other than making Arduino tone() output louder, you will probably want to bias it differently than I did.
No! That won't give you good quality... A single MOSFET is not a linear amplifier. It will work fine with PWM but it won't work with true analog audio from a WAVE shield.
Having more than 5V and a 4 ohm speaker is usually necessary to achieve higher power.
P = I*V = V^2/R
5V * 5V / 4 ohm = 6.25W.
Having a low Rds MOSFET is the only way to actually get 5V across the speaker, so actual power out is usually less.
12V * 12V / 4 ohm = 36W. Big increase!
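
If you want to play with those numbers, here is a rough Python sketch of the same arithmetic. It treats the speaker as a plain 4 ohm resistor with the MOSFET held fully on, so the results are on-state figures, not average audio power. The function name `speaker_power` and the 0.5 ohm Rds(on) value are made up for illustration; check your MOSFET's datasheet for the real figure.

```python
def speaker_power(v_supply, r_speaker, r_ds_on=0.0):
    """Power dissipated in the speaker while the MOSFET is fully on.

    Simple DC model: the speaker resistance and the MOSFET on-resistance
    are in series across the supply.
    """
    current = v_supply / (r_speaker + r_ds_on)  # Ohm's law for the series loop
    return current ** 2 * r_speaker             # P = I^2 * R in the speaker only


for v in (5.0, 12.0):
    ideal = speaker_power(v, 4.0)               # perfect switch, V^2/R
    lossy = speaker_power(v, 4.0, r_ds_on=0.5)  # 0.5 ohm is an assumed Rds(on)
    print(f"{v:g} V supply: ideal {ideal:.2f} W, with Rds(on) {lossy:.2f} W")
```

With `r_ds_on=0.0` this reproduces the 6.25W and 36W figures; any nonzero Rds(on) takes some of the supply voltage away from the speaker, which is why the real output comes in lower.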