Hmm, thanks for your insight. It appears that the optimization for the Zero is not at the same level as for the AVR or Due. That's unfortunate, because the Zero seems to have a lot of potential for number-crunching (32-bit, 48 MHz, etc.).
After running some more tests, the Zero calculates the problem in 3631 microseconds (using sinf, etc.), and the Due comes in at 1088 microseconds. I suppose that for the project I'm thinking about, an extra millisecond or two for these calculations isn't going to matter in the end.
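For reference, here is roughly how a timing like this can be taken. It is only a sketch, not the exact test behind the numbers above: crunch() and its sinf loop are a made-up stand-in workload, and micros_now() is a desktop stand-in for Arduino's micros() so that the snippet also compiles off the board.

```cpp
#include <chrono>
#include <cstdint>
#include <math.h>

// Stand-in for Arduino's micros(); on a board, call micros() directly instead.
static std::uint32_t micros_now() {
    using namespace std::chrono;
    return static_cast<std::uint32_t>(
        duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count());
}

// Hypothetical workload: sum sinf() over n samples using only float math.
// The 'f' suffixes keep the constants single precision, so nothing is
// promoted to double along the way.
float crunch(int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; ++i) {
        acc += sinf(0.01f * static_cast<float>(i));
    }
    return acc;
}

// Returns the elapsed time in microseconds. The result is stored through a
// volatile pointer so the compiler cannot drop the computation.
std::uint32_t time_crunch(int n, volatile float *out) {
    std::uint32_t start = micros_now();
    *out = crunch(n);
    // Unsigned subtraction still gives the right answer if the counter wraps.
    return micros_now() - start;
}
```

On the board itself, this would be called from setup() or loop() and the returned value printed with Serial.println().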
One more question: how do I add the "-fsingle-precision-constant" option to the Arduino compiler flags? What exactly does that option do?