Telephony-grade audio has a bandwidth of about 3 kHz.
Assuming no coding, you're going to need at least 6000 samples per second (thanks to Mr Nyquist).
Even if they're only low-quality 8-bit samples, that's still nearly 6 kbytes per second.
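
To make the arithmetic concrete, here is a rough sketch of the calculation, assuming uncompressed mono PCM sampled at exactly twice the bandwidth. The function name `raw_audio_rate` is just an illustrative helper, not part of any library.

```python
def raw_audio_rate(bandwidth_hz, bits_per_sample):
    """Return (samples per second, bytes per second) for uncompressed mono PCM."""
    # Nyquist: sample at (at least) twice the highest frequency of interest.
    sample_rate = 2 * bandwidth_hz
    bytes_per_second = sample_rate * bits_per_sample / 8
    return sample_rate, bytes_per_second


sample_rate, bytes_per_second = raw_audio_rate(3000, 8)
print(sample_rate)                    # 6000
print(bytes_per_second)               # 6000.0
print(bytes_per_second * 60 / 1024)   # 351.5625 (KiB per minute of audio)
```

So a minute of raw audio at this quality is already around 350 KiB, which is worth comparing against whatever storage the device actually has.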
You may have some spare processing capacity for compression/decompression, though generally, the higher the compression, the more processing you need to do.
Do you need to record and play back on the same device?