Real Time Audio Output, what am I doing wrong?

Thanks to everyone for all the information, I have been studying a lot of the DDS examples I could find and I am definitely playing around with that method of generation as well.

I think it all comes down to whether I want to work in integer math or floating point. I've spent a considerable amount of time looking at three solutions: (1) DUE, (2) BeagleBone, (3) STM32F4Discovery.
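For the integer-math route, a DDS oscillator can be sketched roughly like this. It is only a minimal illustration, not code from any of the examples mentioned: the function names, the 32-bit accumulator width, and the deliberately tiny 16-entry sine table are all assumptions chosen to keep it short (a real table would be much larger, or interpolated).

```c
#include <stdint.h>

#define SAMPLE_RATE_HZ 44100u            /* target output rate */
#define TABLE_BITS     4u                /* assumed: 16-entry table, for brevity only */
#define TABLE_SIZE     (1u << TABLE_BITS)

/* One sine cycle scaled to 16-bit signed, sampled every 22.5 degrees. */
static const int16_t sine_table[TABLE_SIZE] = {
         0,  12539,  23170,  30273,  32767,  30273,  23170,  12539,
         0, -12539, -23170, -30273, -32767, -30273, -23170, -12539
};

/* Phase increment for a 32-bit accumulator: freq * 2^32 / sample_rate.
 * Valid for freq_hz below SAMPLE_RATE_HZ. */
uint32_t dds_phase_increment(uint32_t freq_hz)
{
    return (uint32_t)(((uint64_t)freq_hz << 32) / SAMPLE_RATE_HZ);
}

/* Produce one sample: look up the table using the top bits of the phase,
 * then advance the phase. Unsigned overflow wraps, which is the phase wrap. */
int16_t dds_next_sample(uint32_t *phase, uint32_t increment)
{
    int16_t sample = sine_table[*phase >> (32u - TABLE_BITS)];
    *phase += increment;
    return sample;
}
```

In a timer interrupt running at the sample rate, the body would just be one call to `dds_next_sample()` followed by a write to the DAC. Everything per sample is a shift, a table read, and an add, so no floating point is needed.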

I also have a Mega2560 but that isn't fast enough for the audio rate stuff I want to do.

Given I already have the DUE, I am still having fun working with it. While my original plan seems to be difficult on it, I am finding I can still do a lot with it, and I will continue to use it for simpler, more specific ideas. Without a doubt it's the easiest to program, as the toolchain is just simple. While 44.1kHz output is possible with careful coding, I'm still concerned about the ADC conversion time in free-running mode (and how accurate it is for audio), and whether audio input is even possible while also trying to hit 44.1kHz output. If it isn't, then I will treat it as more of a sound generation project and just use low-frequency CV (LFO rate) inputs.

The STM32F4Discovery was recommended by a coworker. Cost-wise, it's about $15 and comes with hardware floating point. I've looked at the cycle times for floating point operations and they are pretty impressive. It runs at about 2x the clock frequency of the DUE (although I'm not sure of the actual processing difference), and it appears the built-in ADC is pretty powerful. The board also has an onboard Cirrus Logic audio DAC with a built-in amplifier that can theoretically do up to 96kHz. The Cortex chip itself apparently has 3 ADCs and 2 DACs as well, and they seem easily able to handle a 44.1kHz audio rate. In fact, it appears I can multiplex the ADCs and still get 44.1kHz out of them, not too bad. The toolchain is a little more work, but given I know someone who has already set this up on OSX, that's a huge advantage. Lastly, it has DSP-specific instructions like a single-cycle multiply, which is pretty awesome.
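To make the integer versus floating point trade-off a bit more concrete, here is a rough sketch of a fixed-point (Q15) multiply, the kind of operation a fast hardware multiplier helps with. The name `q15_mul` is made up for illustration and this is plain portable C, not a vendor DSP library call. With an FPU the same step would simply be `sample * gain` on floats.

```c
#include <stdint.h>

/* Multiply two Q15 values, where -32768..32767 represents -1.0..just under 1.0.
 * Assumes right shift of a negative int32_t is arithmetic, which holds for
 * the usual ARM and desktop compilers but is implementation-defined in C. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;  /* Q30 intermediate */
    product += 1 << 14;                         /* add half an LSB to round */
    product >>= 15;                             /* scale back to Q15 */
    if (product > INT16_MAX) {
        product = INT16_MAX;                    /* saturate: -1.0 * -1.0 does not fit */
    }
    return (int16_t)product;
}
```

For example, applying a gain of 0.5 (16384 in Q15) to a sample is `q15_mul(sample, 16384)`. The extra rounding and saturation steps are the bookkeeping that floating point lets you skip.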

The BeagleBone is definitely the most powerful of the 3 but the least developed on. I would not use Linux. In all the reading I've done so far, even with all the RT patches it's just not very fast or predictable. That said, the next step would be to use StarterWare from TI instead. Of course, this means I will need a Linux virtual machine on my Mac to run their toolchain (until I can find equivalent OSX tools), but it has the advantage of 700MHz and of course floating point as well. The cape I bought does 2x audio in and 2x audio out at 96kHz 24-bit. This is more than enough for my needs. In addition, there are 7 other analog inputs that can be used for things like the CV signals I want to read.
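As a rough way to compare the three boards, the raw CPU cycle budget per output sample can be worked out as below. The helper name is hypothetical. The 700MHz figure is the one quoted above, while 84MHz for the DUE and 168MHz for the STM32F4 are assumed nominal clock rates. This ignores interrupt overhead, wait states, and everything else that eats into the budget.

```c
#include <stdint.h>

/* Whole CPU cycles available between two consecutive output samples. */
uint32_t cycles_per_sample(uint32_t cpu_hz, uint32_t sample_rate_hz)
{
    return cpu_hz / sample_rate_hz;
}

/* At 44.1kHz:
 *   cycles_per_sample( 84000000u, 44100u) ->  1904  (DUE, assumed 84MHz)
 *   cycles_per_sample(168000000u, 44100u) ->  3809  (STM32F4, assumed 168MHz)
 *   cycles_per_sample(700000000u, 44100u) -> 15873  (BeagleBone, 700MHz)
 */
```

Roughly 1900 cycles per sample on the DUE is why careful coding matters there, especially once an ADC read has to fit into the same window.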

It's hard to say there is a winner in all this, because if you look at each of these setups there is a lot of give and take. The DUE has just been a blast to code with. Without any personal hands-on experience yet, it seems like the ST is really going to be the sweet spot of price/performance if I want to build a bunch of these, but of course having all that power on the BeagleBone is tempting.

Duane, your RCarduino stuff is awesome. Thanks for putting in the effort to share all that information. Without a doubt you helped me get up to speed MUCH faster, especially with the interrupt code you pulled together.