The usual answer is that the Arduino is too slow and has too little memory to do this kind of processing in real time.
I suspect the ARM-based microcontroller boards (Teensy 3.0, Due, mbed, etc.) are optimized more for low-power use than for raw throughput. They are presumably faster than the Arduino (particularly at processing 16/32-bit data), but they may still not be fast enough, or have enough RAM, to hold a complete image in memory. Something like a BeagleBone (500-700 MHz depending on the power supply) or Raspberry Pi (700 MHz) would probably give you the additional speed and memory to do what you want.
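
As a rough sanity check on the memory side, here is a minimal Python sketch that compares the size of one uncompressed frame against on-chip SRAM. The frame sizes (320x240, 8-bit grayscale or 16-bit RGB565) are assumptions for illustration, not something taken from the question, and the SRAM figures are nominal datasheet values that you should verify for your exact board. The function and table names are made up for this example.

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Bytes needed to hold one uncompressed frame in memory."""
    return width * height * bytes_per_pixel


# Nominal on-chip SRAM in bytes (assumed datasheet figures; verify for your board).
SRAM_BYTES = {
    "Arduino Uno (ATmega328P)": 2 * 1024,
    "Teensy 3.0": 16 * 1024,
    "Arduino Due": 96 * 1024,
}


def boards_that_fit(width, height, bytes_per_pixel, sram=SRAM_BYTES):
    """Names of boards whose total SRAM is at least one frame.

    This ignores the stack, globals, and any working buffers, so a board
    that barely fits here may still be unusable in practice.
    """
    need = frame_bytes(width, height, bytes_per_pixel)
    return [name for name, size in sram.items() if size >= need]


if __name__ == "__main__":
    # Assumed example frame: 320x240, 1 byte (grayscale) or 2 bytes (RGB565) per pixel.
    for bpp in (1, 2):
        need = frame_bytes(320, 240, bpp)
        print(bpp, "byte(s)/pixel:", need, "bytes ->", boards_that_fit(320, 240, bpp))
```

With these assumed numbers, a 320x240 8-bit grayscale frame needs 76,800 bytes, which only the Due could hold, with little left over for the stack and working buffers. The 16-bit version (153,600 bytes) fits on none of the three. The BeagleBone and Raspberry Pi have RAM measured in hundreds of megabytes, so frame storage stops being the limit there.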