DSP is literally "digital signal processing" and is what you do with sounds when you process them in the Arduino (or any other digital chip).
Old-school voice scramblers work on a side-band principle. When you ring modulate a signal of frequency A with a signal of frequency B, you get two signals shifted in frequency, at A+B and A-B. If you make sure those two end up far enough apart, you can filter one of them out and send only the other through the wire. You then decode by doing the same thing again and filtering out the unwanted side-band, which leaves the original signal. You can do this in many separate bands, similar to a music vocoder effect unit, if you want to make un-scrambling somewhat harder.
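To make the A+B / A-B claim concrete, here is a minimal sketch in plain C++ (desktop, floating point, not tuned for an Arduino). The function names ringModulate, makeTone and amplitudeAt are made up for this illustration, and the 8 kHz sample rate is just an assumption. Ring modulation here is nothing more than multiplying each sample by a sine carrier; amplitudeAt is a crude single-frequency DFT used only to check where the energy ended up.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

const double kPi = 3.14159265358979323846;

// Generate `count` samples of a unit-amplitude sine tone.
std::vector<double> makeTone(double freqHz, double sampleRateHz, std::size_t count) {
    std::vector<double> out(count);
    for (std::size_t n = 0; n < count; ++n) {
        out[n] = std::sin(2.0 * kPi * freqHz * n / sampleRateHz);
    }
    return out;
}

// Ring modulation: multiply the input, sample by sample, with a sine carrier.
std::vector<double> ringModulate(const std::vector<double>& in,
                                 double carrierHz, double sampleRateHz) {
    std::vector<double> out(in.size());
    for (std::size_t n = 0; n < in.size(); ++n) {
        out[n] = in[n] * std::sin(2.0 * kPi * carrierHz * n / sampleRateHz);
    }
    return out;
}

// Amplitude of the component at freqHz, found by correlating the signal
// with a cosine and a sine at that frequency (a single-bin DFT).
double amplitudeAt(const std::vector<double>& x, double freqHz, double sampleRateHz) {
    double re = 0.0, im = 0.0;
    for (std::size_t n = 0; n < x.size(); ++n) {
        const double phase = 2.0 * kPi * freqHz * n / sampleRateHz;
        re += x[n] * std::cos(phase);
        im += x[n] * std::sin(phase);
    }
    return 2.0 * std::sqrt(re * re + im * im) / x.size();
}
```

With one second of a 1000 Hz tone at 8 kHz, ring modulated with a 2500 Hz carrier, amplitudeAt should report about half the original amplitude at 3500 Hz and at 1500 Hz, and essentially nothing at 1000 Hz. When B is larger than A, the difference shows up at B-A. A real scrambler would still need a filter after this step to keep only one of the two side-bands.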
I believe there's a separate related method where you can use aliasing to "mirror" the frequency spectrum of a signal, and transmit the mirrored/inverted signal, rather than the original signal. Low frequencies become high, and vice versa. However, I don't know how to implement this in practice, so I can't give more help there.
If you want to make a signal actually secure against eavesdropping, then the "scrambler" approaches don't work; they are trivial to attack. But I imagine you're not looking for cryptographic-strength security, but rather that old-school spy movie sound :-)
So, how do you get the signal into the Arduino so you can run the necessary math (DSP) to scramble it? You would probably buy an ADC/DAC chip that can talk SPI or I2C, and run it at some slow speed, such as 8-bit mono at an 8 kHz sampling frequency, which the Arduino can keep up with. Note that this will still generate 8 interrupts per millisecond (one every 125 microseconds), so you'll have to be pretty efficient at whatever processing you're doing.
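To put a number on "pretty efficient", here is the arithmetic as a small C++ sketch. The 16 MHz clock is an assumption (it matches the common AVR-based boards; other Arduinos differ), and the names are made up for the illustration.

```cpp
#include <cstdint>

// Assumed values: a 16 MHz CPU clock and the 8 kHz sample rate from above.
constexpr std::uint32_t kCpuHz = 16000000;
constexpr std::uint32_t kSampleRateHz = 8000;

// CPU cycles available between two consecutive sample interrupts.
constexpr std::uint32_t cyclesPerSample(std::uint32_t cpuHz, std::uint32_t sampleRateHz) {
    return cpuHz / sampleRateHz;
}

// Time between two consecutive samples, in microseconds.
constexpr std::uint32_t microsPerSample(std::uint32_t sampleRateHz) {
    return 1000000 / sampleRateHz;
}
```

Under those assumptions, cyclesPerSample(kCpuHz, kSampleRateHz) works out to 2000 cycles and microsPerSample(kSampleRateHz) to 125 microseconds. Everything has to fit in that budget: reading the ADC, the math, writing the DAC, and the interrupt overhead itself.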
This is the cheapest I could find with a quick search -- note that all these things are surface-mount these days, so an easy hook-up to a breadboard is not in the cards: