The only extra hardware you would need is a webcam. Finding and recognizing the page numbers is far from the hardest task you can do with tools like OpenCV.
The problem with that is that proper OCR needs a resolution of about 600 dpi, and you are unlikely to get that with a webcam.
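To put rough numbers on that, here is a quick back-of-the-envelope sketch. It assumes the webcam frames a whole A4 page with the long edge of the sensor along the long edge of the page; the sensor widths listed are just common examples, not figures from anyone's actual setup, and `effective_dpi` is a made-up helper name.

```python
# Rough estimate of the "scan resolution" a camera gives you.
# Assumption: the page fills the frame exactly along the measured edge.

A4_LONG_EDGE_IN = 11.69  # A4 is 297 mm long, roughly 11.69 inches


def effective_dpi(pixels_along_edge: int, inches_along_edge: float) -> float:
    """Pixels available per inch of paper along one edge."""
    if inches_along_edge <= 0:
        raise ValueError("edge length must be positive")
    return pixels_along_edge / inches_along_edge


if __name__ == "__main__":
    # Example sensor widths (assumed, not measured from a real camera)
    for name, long_px in [("720p", 1280), ("1080p", 1920), ("4K", 3840)]:
        dpi = effective_dpi(long_px, A4_LONG_EDGE_IN)
        print(f"{name}: about {dpi:.0f} dpi across a full A4 page")
```

Under those assumptions even a 4K sensor stays well short of 600 dpi over a full page. The figure depends entirely on how much paper is in the frame, so the number changes if the camera only looks at a small part of the page.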
I have a 300 dpi scan of some text I wrote a long time ago that was published in a magazine. I have some OCR software that Epson provided when I got a scanner/printer from them. No doubt it is not the best in the world, but it is a lot better than the stuff I have had in the past. It has a great deal of difficulty recognising the text in the 300 dpi scan, and the output is full of errors. If I scan at 600 dpi I get 100% error-free recognition. This is just my experience, but I am sure it is not atypical.