Anchun:
Maybe I need to clarify a little more here.
The theme of the project is "electric umpire": the user throws a ball at a surface,
and the system tells whether the throw is "safe" or "ball".
To do this, we are placing a sensor near the surface, so we know when the ball hits the surface, which is the moment the camera should
detect the position of the ball.
So as long as the camera captures the moment when the ball hits the surface, the image processing and all the other processes
don't really need to run in "real time", because we are not actually tracking the moving ball from beginning to end.
So a much simplified description would be: ball thrown -> hits the surface -> camera activated to see where the ball hit -> sends the
x,y position of the ball so that the computer can say whether this is ball or safe.
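The last step of that pipeline (x,y position in, decision out) might be sketched roughly as below. This is only an illustration under assumptions: the `Impact` and `Zone` types, the `judge` function, and the idea that the target area is an axis-aligned rectangle in pixel coordinates are all made up for the example, and how "in zone" maps onto "safe" or "ball" is left to the project.

```cpp
// Hypothetical pixel coordinates of the impact point reported by the camera step.
struct Impact {
    int x;
    int y;
};

// Hypothetical target rectangle on the surface, in the same pixel coordinates.
// The edges are treated as part of the zone.
struct Zone {
    int left;
    int top;
    int right;
    int bottom;
};

enum class Call { InZone, OutOfZone };

// Decide whether the impact point lies inside the target rectangle.
Call judge(const Impact& hit, const Zone& zone) {
    const bool inside = hit.x >= zone.left && hit.x <= zone.right &&
                        hit.y >= zone.top && hit.y <= zone.bottom;
    return inside ? Call::InZone : Call::OutOfZone;
}
```

Because this runs once per throw, after the sensor has triggered the capture, it has no real-time constraint, which matches the point made above.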
This is the processor that I used for a similar project: Intel PXA270
http://www.phytec.com/pdf/datasheets/PXA270_DS.pdf
—32 KB instruction cache
—32 KB data cache
—2 KB “mini” data cache
What we've got here is failure to communicate.
Let's see: at its slowest speed, the processor you used has a clock speed 6 times that of the Arduino, and at its fastest speed 38 times. Because the Arduino is an 8-bit processor, it often needs several instructions for work that the 32-bit xscale can do in one. So figure the xscale is maybe 12 to 60 times faster than the Arduino.
The xscale's memory is 256 kilobytes; the Uno has 2 kilobytes, the Leonardo 2.5 kilobytes, and the Mega 8 kilobytes. That means the xscale has between 32 and 128 times the memory of the Arduino.
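For anyone who wants to check the memory comparison, here is a small sketch of the arithmetic. The constant and function names are just for illustration, and the sizes are the ones quoted above.

```cpp
// SRAM sizes in bytes, taken from the figures quoted above.
constexpr long kXscaleSram   = 256L * 1024;
constexpr long kUnoSram      = 2L * 1024;
constexpr long kLeonardoSram = 2560;        // 2.5 KB
constexpr long kMegaSram     = 8L * 1024;

// How many times larger the first memory is than the second.
constexpr double sramRatio(long larger, long smaller) {
    return static_cast<double>(larger) / static_cast<double>(smaller);
}

// The two ends of the range: the Uno (smallest) and the Mega (largest).
static_assert(sramRatio(kXscaleSram, kUnoSram) == 128.0, "Uno ratio");
static_assert(sramRatio(kXscaleSram, kMegaSram) == 32.0, "Mega ratio");
```

The Leonardo falls in between, at a little over 102 times.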
In addition, the xscale has a camera interface and multimedia instructions that are useful for dealing with the image. The Arduino does not have these instructions. It also looks like the xscale has an image sensor built in.
If you can deal with an image area of 128x96 that is pure black and white (1 bit per pixel), the Uno would just barely have enough memory to represent one frame: 128 x 96 bits is 1,536 bytes out of the Uno's 2 kilobytes. The Video Experimenter Shield (Video Experimenter Projects - nootropic design) would allow you to do some processing. Be sure to watch the Computer Vision video, where it has trouble finding the brightest spot in front of a blank screen.
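To make the memory point concrete, here is a rough sketch of what a 1-bit 128x96 frame and a simple "where is the ball" pass could look like. Everything in it is an assumption for illustration: the packed layout (leftmost pixel in the top bit of each byte), the helper names, and the use of a centroid of the set pixels as the ball position. It is not how the Video Experimenter library stores or processes its frames.

```cpp
#include <cstddef>
#include <cstdint>

constexpr int kWidth = 128;
constexpr int kHeight = 96;
constexpr int kBytesPerRow = kWidth / 8;                      // 16 bytes per row
constexpr std::size_t kFrameBytes = kBytesPerRow * kHeight;   // 1536 bytes total

// One frame must fit in the Uno's 2 KB of SRAM, leaving 512 bytes for the rest.
static_assert(kFrameBytes == 1536, "unexpected frame size");
static_assert(kFrameBytes < 2048, "frame does not fit in 2 KB");

// Read one pixel. Assumed layout: bit 7 of each byte is the leftmost pixel.
inline bool getPixel(const std::uint8_t* frame, int x, int y) {
    return ((frame[y * kBytesPerRow + x / 8] >> (7 - (x % 8))) & 1) != 0;
}

// Set one pixel to white (1), using the same assumed layout.
inline void setPixel(std::uint8_t* frame, int x, int y) {
    frame[y * kBytesPerRow + x / 8] |= static_cast<std::uint8_t>(0x80 >> (x % 8));
}

// Average position of all set pixels, using integer division.
// Returns false, leaving cx and cy untouched, if the frame has no set pixels.
bool findCentroid(const std::uint8_t* frame, int& cx, int& cy) {
    long sumX = 0;
    long sumY = 0;
    long count = 0;
    for (int y = 0; y < kHeight; ++y) {
        for (int x = 0; x < kWidth; ++x) {
            if (getPixel(frame, x, y)) {
                sumX += x;
                sumY += y;
                ++count;
            }
        }
    }
    if (count == 0) {
        return false;
    }
    cx = static_cast<int>(sumX / count);
    cy = static_cast<int>(sumY / count);
    return true;
}
```

With the frame taking 1,536 bytes, only 512 bytes are left for the stack and every other variable in the sketch, which is why "just barely" is the right way to put it. A centroid also only gives a sensible answer if the ball is the only white blob in the frame.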