robtillaart:
OCR is not one algorithm; it is the collection of all image-to-text algorithms. But there are differences:
very high end OCR recognizes handwriting (banknotes, signature recognition, etc.)
high end OCR recognizes printed text in most fonts and sizes.
simple OCR recognizes one type of font in one size.
The latter can be done with a MEGA (I programmed a simple OCR long ago on a C64 in Basic, so a MEGA must work).
step one: read the image in memory
step two: convert the image to black and white
step three: apply morphological filters to skeletonize the text and remove noise, speckles, etc.
step four: isolate separate characters (and/or words)
iterate over the characters:
step five: draw bounding box around the char
step six: normalize the bounding box e.g. to a 8x12 bitmap
step six: compare the bitmap with a collection of bitmaps representing chars
step seven: output the most likely character
Another way is to implement a neural network and train it with characters. That could replace the bitmap comparison in step six.
Not trivial but definitely doable.
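A minimal sketch of the thresholding, bounding box, normalization, and comparison steps listed above, in plain C++ that should also compile in an Arduino sketch. It is not the original C64 program: every name here (binarize, boundingBox, normalize, classify, Box, Template) is made up for illustration, the image is assumed to hold a single dark character on a light background at one byte per pixel, and the filtering and character isolation steps are left out.

```cpp
#include <stdint.h>

// Size of the normalized bitmap; 8x12 is the example size from the post.
constexpr int NORM_W = 8;
constexpr int NORM_H = 12;

// Inclusive pixel bounds of the ink in an image (hypothetical helper type).
struct Box { int x0, y0, x1, y1; bool found; };

// A reference bitmap (NORM_W * NORM_H bytes, 1 = ink) and the char it represents.
struct Template { char label; const uint8_t* bitmap; };

// Convert to black and white in place: 1 = ink, 0 = background.
// Assumes dark text on a light background; flip the comparison otherwise.
void binarize(uint8_t* img, int w, int h, uint8_t threshold) {
    for (int i = 0; i < w * h; ++i) {
        img[i] = (img[i] < threshold) ? 1 : 0;
    }
}

// Find the smallest box that encloses every ink pixel.
Box boundingBox(const uint8_t* img, int w, int h) {
    Box b{w, h, -1, -1, false};
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            if (img[y * w + x]) {
                if (x < b.x0) b.x0 = x;
                if (y < b.y0) b.y0 = y;
                if (x > b.x1) b.x1 = x;
                if (y > b.y1) b.y1 = y;
                b.found = true;
            }
        }
    }
    return b;
}

// Resample the box to NORM_W x NORM_H using nearest-neighbour sampling.
// 'out' must hold NORM_W * NORM_H bytes; 'b.found' must be true.
void normalize(const uint8_t* img, int w, const Box& b, uint8_t* out) {
    const int bw = b.x1 - b.x0 + 1;
    const int bh = b.y1 - b.y0 + 1;
    for (int y = 0; y < NORM_H; ++y) {
        for (int x = 0; x < NORM_W; ++x) {
            const int sx = b.x0 + x * bw / NORM_W;
            const int sy = b.y0 + y * bh / NORM_H;
            out[y * NORM_W + x] = img[sy * w + sx];
        }
    }
}

// Compare against each template and return the label with the fewest
// differing pixels (Hamming distance). Returns '?' if there are no templates.
char classify(const uint8_t* glyph, const Template* templates, int count) {
    char best = '?';
    int bestDist = NORM_W * NORM_H + 1;
    for (int t = 0; t < count; ++t) {
        int dist = 0;
        for (int i = 0; i < NORM_W * NORM_H; ++i) {
            if (glyph[i] != templates[t].bitmap[i]) ++dist;
        }
        if (dist < bestDist) {
            bestDist = dist;
            best = templates[t].label;
        }
    }
    return best;
}
```

The intended call order is binarize, then boundingBox, then normalize into a 96-byte buffer, then classify against a table of templates. One byte per pixel keeps the sketch readable; on a board with little RAM the bitmaps could be packed eight pixels to a byte, and for a lit display (bright digits on a dark background) the threshold comparison would need to be inverted.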
Where can I get more information on applying filters to an image with an Arduino, isolating characters, creating a bitmap for each character, and so on? Steps 3 through 7, basically. I would like to read a lighted 7-segment display. What tools allow one to trace masks on a portion of the picture for step 6?