A kind of intercom between several positions. Each position is equipped with a microphone and a simple keyboard that lets the speaker choose who to talk to with a single keystroke.
You will get lousy quality, only slightly better than two tin cans and a string.
ADC:
The Arduino has a 10-bit ADC covering 0-5 V, so you need to amplify the mike signal to make use of that range. The maximum sample rate is about 10 kHz; in practice you will get maybe 3000-5000 samples per second, each stored in 2 bytes.
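To put rough numbers on why the mike needs amplifying, here is a back-of-envelope sketch in desktop Python (not Arduino code). The 10 bits and the 0-5 V range come from the post; the 20 mV peak-to-peak mike level is only an assumed, illustrative figure, so plug in whatever your capsule actually delivers.

```python
ADC_BITS = 10            # from the post
V_REF = 5.0              # volts, full-scale range from the post
ASSUMED_MIC_VPP = 0.020  # volts peak-to-peak; ASSUMPTION, not from the post


def adc_step_volts(bits=ADC_BITS, v_ref=V_REF):
    """Voltage covered by one ADC code (one least-significant bit)."""
    return v_ref / (2 ** bits)


def codes_spanned(signal_vpp, bits=ADC_BITS, v_ref=V_REF):
    """How many ADC codes a signal of the given peak-to-peak size covers."""
    return signal_vpp / adc_step_volts(bits, v_ref)


def gain_to_fill_range(signal_vpp, v_ref=V_REF):
    """Voltage gain that stretches the signal over the whole ADC range."""
    return v_ref / signal_vpp


if __name__ == "__main__":
    print("ADC step (mV):", adc_step_volts() * 1000)
    print("codes used without gain:", codes_spanned(ASSUMED_MIC_VPP))
    print("gain to fill 0-5 V:", gain_to_fill_range(ASSUMED_MIC_VPP))
```

With those assumed numbers the raw mike signal would cover only about four ADC codes, which is why some gain in front of the ADC is needed.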
"NETWORK":
Not measured, but as an estimate: 5 Kbyte per second = 2500 samples of 2 bytes, or 5000 samples of one byte (8 bit), per second.
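The estimate above can be turned into audio bandwidth with the Nyquist rule (usable bandwidth is at most half the sample rate). A small sketch, taking the 5 Kbyte/s figure from the post as exactly 5000 bytes per second; the function names are just for illustration.

```python
LINK_BYTES_PER_SECOND = 5000  # the post's unmeasured estimate


def samples_per_second(link_bytes_per_second, bytes_per_sample):
    """Whole samples that fit through the link each second."""
    return link_bytes_per_second // bytes_per_sample


def nyquist_bandwidth_hz(sample_rate_hz):
    """Highest audio frequency representable at this sample rate."""
    return sample_rate_hz / 2


if __name__ == "__main__":
    for width in (2, 1):
        rate = samples_per_second(LINK_BYTES_PER_SECOND, width)
        print(width, "byte(s) per sample:", rate, "samples/s,",
              nyquist_bandwidth_hz(rate), "Hz audio bandwidth")
```

For comparison, classic telephone audio is sampled at 8000 samples per second, so both options fall short of that.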
DAC:
The Arduino only does PWM, so it is not even a real DAC, though I suppose it could do.
So the quality ends up worse than an old analog phone line.
I think you need to consider another architecture: use the Arduino to control who communicates with whom, based on the keypresses, but carry the sound over analog lines.
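The control side of that architecture is just bookkeeping: which station is currently connected to which. Here is a minimal model of it in Python, assuming (my assumption, not from the post) that stations are numbered from 0, that the key pressed is the number of the station to call, and that pressing your own number hangs up. `IntercomSwitch` is a made-up name; on real hardware each link would drive whatever analog switch or relay you choose.

```python
class IntercomSwitch:
    """Tracks which caller is routed to which station (hypothetical model)."""

    def __init__(self, station_count):
        self.station_count = station_count
        self.target = {}  # caller station -> station it is talking to

    def key_pressed(self, caller, key):
        """Handle one keystroke and return the links that should be closed."""
        valid = range(self.station_count)
        if caller not in valid or key not in valid:
            raise ValueError("unknown station")
        if key == caller:
            self.target.pop(caller, None)  # own number: hang up
        else:
            self.target[caller] = key      # route caller's audio to `key`
        return self.closed_links()

    def closed_links(self):
        """Sorted (caller, callee) pairs for the analog switches to close."""
        return sorted(self.target.items())


if __name__ == "__main__":
    switch = IntercomSwitch(4)
    print(switch.key_pressed(0, 2))
    print(switch.key_pressed(0, 0))
```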
Since you want to implement an intercom (implying voice-grade audio), why not use a telephony PCM codec chip? These devices have analog in/out and send/receive an 8 Kbyte/s data stream (8000 samples per second, one byte each). They compand 13- or 14-bit linear audio (G.711 A-law or µ-law) down to 8 bits, giving decent voice quality.
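To show what companding buys you, here is a sketch of the textbook continuous µ-law curve with µ = 255. Note that this is the smooth formula, not the bit-exact segment tables a real G.711 codec chip uses, and the function names are only illustrative. Samples are floats in the range -1.0 to 1.0.

```python
import math

MU = 255  # mu-law parameter used for 8-bit telephony


def mulaw_encode(x):
    """Compress a sample in [-1.0, 1.0] to an 8-bit code (0..255)."""
    x = max(-1.0, min(1.0, x))
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int(round((y + 1.0) / 2.0 * 255))


def mulaw_decode(code):
    """Expand an 8-bit code back to a sample in [-1.0, 1.0]."""
    y = code / 255 * 2.0 - 1.0
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def linear8_roundtrip(x):
    """Plain 8-bit linear quantisation, for comparison."""
    code = int(round((x + 1.0) / 2.0 * 255))
    return code / 255 * 2.0 - 1.0


if __name__ == "__main__":
    quiet = 0.01  # a quiet sample, 1% of full scale
    print("mu-law error:", abs(mulaw_decode(mulaw_encode(quiet)) - quiet))
    print("linear error:", abs(linear8_roundtrip(quiet) - quiet))
```

The point of the curve is that quiet samples get much finer steps than loud ones, so speech still sounds reasonable at 8 bits per sample.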
There are hundreds of chips that can do this and they're not expensive. Go to DigiKey and search for PCM Codec.
You could use the Arduino to manage the data streams. You might want to send them using a standard protocol such as RTP. It's easy to implement.
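To give an idea of how little there is to basic RTP: the fixed header is 12 bytes (version 2, payload type, sequence number, timestamp, SSRC), followed by the audio bytes. The sketch below packs and unpacks such a packet in Python. Payload type 0 is the static assignment for µ-law PCM at 8 kHz; the 160-byte (20 ms) payload size and the SSRC value in the usage lines are just assumed example values, and padding, header extensions and CSRC lists are left out.

```python
import struct

RTP_VERSION = 2
PT_PCMU = 0  # static RTP payload type for G.711 mu-law at 8 kHz


def build_rtp_packet(seq, timestamp, ssrc, payload,
                     payload_type=PT_PCMU, marker=False):
    """Prepend a minimal 12-byte RTP header to an audio payload."""
    byte0 = RTP_VERSION << 6  # no padding, no extension, zero CSRCs
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF,
                         ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp_header(packet):
    """Decode the fixed 12-byte header; the rest is treated as payload."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[12:],
    }


if __name__ == "__main__":
    audio = bytes(160)  # 20 ms of 8 kHz, 1 byte/sample (assumed frame size)
    packet = build_rtp_packet(seq=1, timestamp=160, ssrc=0x12345678,
                              payload=audio)
    print(len(packet), parse_rtp_header(packet)["seq"])
```

For each packet sent, the sequence number goes up by one and the timestamp goes up by the number of samples in the packet.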