DIY large ebook reader

As I need to read a lot of academic papers for work, I thought about getting a large size ebook reader. Since prices are rather insane ($600 min) I am thinking about building my own ereader.

I can get 3mm thick 5000mAh LiPo batteries for cheap with integrated BMS, while for the processing I’d use a Raspberry Pi Zero W or an Arduino MKR.
Screen choice is much more difficult as pure e-ink 10-12” displays often have a poor resolution which is quite detrimental when dealing with displaying graphs, formulas and pictures. Waveshare is the most famous e-ink display producer but their largest model only features a 1304×984 resolution.

Has anyone tried to build their own? Is it possible? I’ve only seen a few projects involving Arduino Uno based “portable ereaders” with 3-inch screens, and nothing more.

I recently bought a 10" tablet computer for £120 in the UK, not sure how that translates to your currency but I don’t imagine it’s $600. Very easy to read books and magazines on it.

Can’t see how you’d successfully build a device comparable to a mass produced device for less money.
If it’s for work, get the company to pay for it or you won’t be able to read anything at all!

Commercial products are highly priced due to the touchscreen display that I don’t need.

SeaWalker:
Commercial products are highly priced due to the touchscreen display that I don’t need.

I'm waiting impatiently for an M5Paper (back-ordered through DigiKey with a four-week manufacturer's standard lead-time... grrr.. arg) to arrive, and staring at Waveshare displays and various ESP32 boards in the meantime.

I've thought frequently about building a reader for years - the commercially available options, as you say, are outrageously expensive, and they're also all compromises since you're stuck with the manufacturer's software.
The M5Paper is a neat little device, that will make for a great pocket-sized e-reader, but it's also tiny. Though terrifically high resolution (960x540), the screen measures only 4.7" diagonally. Great for packing full of fiction... less so for technical .PDF's and academic papers.

So as I wait for the pocket reader to arrive, I'm planning to build it a companion. Any of the Waveshare screens with IT8951 controllers that support partial refresh should work nicely -- and will be, of course, the most expensive part. The one I keep going back to is this one: https://www.waveshare.com/7.8inch-e-paper-hat.htm 7.8", 1872x1404 4-bits per pixel. For a good low-powered and sufficiently capable board, I want to stick with an ESP-32 with psram, the same as the M5Paper because of the Ultra-Low-Power deep-sleep capability - I'll probably go with a Lolin D32 Pro because it has a Micro-SD slot, psram, lots of available GPIOs, and no power wasting LED lit when running from battery.

Now an ESP-32 isn't going to be rendering .PDFs by itself, and even .EPUB and similar is probably a bit ambitious, but that's fine -- I'm planning to use .PDF as an intermediate format on my computer, and convert to 1 and 4 bit per pixel run-length encoded at the pixel and line level for the reader, nicely scaled and with margins trimmed. Of course a plain-text/markdown viewer should be included as well, but being able to view converted technical .PDFs is what would justify the cost of that screen.

Three buttons ought to be sufficient for an interface - up, down, and a menu/select button. Without a touchscreen to worry about, I'll probably go with a nice thick piece of plexiglass to protect the beautiful but expensive epaper display, and MIGHT go with a solar panel on the back (since the device would spend most of its time in deep sleep, waking only to refresh the screen with the next page before returning to its slumber, a solar panel roughly the same size as the screen should be more than sufficient to support it).

Finally, I'd like to include a PS/2 keyboard port in the build -- a cheap addition that means the device could double as a text-editor/word-processor or even as a serial terminal.

Interesting hobby project but why reinvent the wheel? Get a cheap s/h laptop running Windows, install a free ebook reader (I use FBReader), job done; or as Perry says, get a 10" tablet.

The only downside is you don't get an e-ink screen. However e-ink screens are expensive, small, slow to update, ...

OK the laptop format isn't great for ebooks, but you can display as 2 pages wide.

eInk screens are "slow to update" compared to a computer screen, but of course they're not normally used for purposes that require fast updates like playing videos. A second or two for a full screen refresh when turning the page in a book is fine - and since the image once set needs no power to maintain it, your controller can shut down or go into deep sleep while you read the page, only needing to boot or wake when you press a button to turn the page. And any screen and controller that supports partial updates can be quite fast and responsive - no need for long waits for menus and such to be drawn. The downside of this is that you'll get 'ghosting' in the images unless you do a full clear and refresh of the screen every now and then, which is fine for UI menus and such when fast is more important than pretty. The important part is that you can get absolutely astounding grey-scale image quality on a static display - even if you have to wait through a couple seconds of refreshing for it to get there, and there's no power consumed to maintain it - unplug the controller and throw it in a frame on the wall if you'd like - it'll still show the same image ten years from now as long as you protect it from too much UV exposure (so not in direct sunlight). And to take advantage of such a screen, you don't want a computer attached to it running the whole time waiting to respond to the user hitting the page-turn button - you want a microcontroller that's either powered on by the button or woken from a deep sleep, running just a bit of C code that refreshes the screen and then powers off or returns to sleep - which means talking months of battery life is quite realistic, and with an appropriately sized solar panel on the back, you could have a reader that would likely never need to be plugged in.
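To put rough numbers on the "months of battery life" idea, here's a back-of-envelope duty-cycle estimate in Python. Every figure passed in at the bottom (battery capacity, sleep current, awake current, seconds awake per page turn, page turns per day) is a made-up placeholder rather than a measurement, the function name is just for illustration, and the sum ignores battery self-discharge and regulator losses, so treat it as a sketch and plug in your own numbers.

```python
def battery_days(capacity_mah, sleep_ma, active_ma, wake_seconds, turns_per_day):
    """Estimate days per charge for a device that sleeps between page turns."""
    active_hours = wake_seconds * turns_per_day / 3600   # hours awake per day
    sleep_hours = 24 - active_hours                      # rest of the day asleep
    daily_mah = active_ma * active_hours + sleep_ma * sleep_hours
    return capacity_mah / daily_mah


# Placeholder values only (assumptions, not measurements):
# 1200 mAh cell, 0.05 mA asleep, 150 mA awake, 3 s awake per turn, 200 turns a day.
print(round(battery_days(1200, 0.05, 150, 3, 200), 1), "days")
```

With those placeholder values the awake time dominates the daily budget (about 25 mAh awake versus about 1 mAh asleep), so shortening each wake-up matters far more than shaving the sleep current.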

And yes... we all know that you can read .PDFs on a computer or tablet. This is not news. It's safe to say that the original poster HAS a computer and reads .PDFs on it, as do I... but the difference between epaper and an LCD display is basically the difference between printed text in a book and a picture of the same page on a TV screen, and I'm lucky to get four hours of life out of my battery if I unplug this thing -- even my phone, which of course has too small of a screen to read double-column .PDFs on, would only last perhaps six or so hours of use with the screen on the whole time being used as an ereader... that's just not good enough... oh, of course it's good enough if it's all you've got, but the point is that with some off-the shelf parts and rather straight-forward coding, something could be produced that's a whole lot better.

The competition is the printed page, not the laptop - yes it's faster to turn a physical page than to wait for the e-ink display to refresh, but given the weight difference between the number of texts that could fit on a micro-SD card and the number of printed volumes that will fit in a room (or more importantly, a backpack), I'd say e-ink wins - but not if the device needs to be plugged in constantly to recharge -- and that's where a DIY reader wins compared to the expensive e-ink tablets on the market because they are e-ink TABLETS - full computers running full OS's with fancy multi-touch touchscreens, gigs of ram, and all sorts of fancy features - that mean that while they'll still get far more runtime out of a battery charge compared to a traditional tablet with a backlit LCD screen, you'll still only get, maybe, days out of a charge -- not weeks or months.

So I now have an actual reader build underway, to be "finished" (except for building a case for it) in a month or so when UPS brings me my Waveshare display (I went with the 10.3" 1872x1404 'flexible' display with IT8951 HAT).

The rest of the hardware consists of an ESP-32 with 4MB flash and 4MB PSRAM, a "SHARP Memory Display" (very low power, monochrome, 400x240, 2.7" diagonal, reflective/daylight readable like e-ink), one 512KB SPI FRAM breakout, and two push-button knob rotary encoders. (plus, of course a battery and SD card).

The E-Ink display is to be used only as a page canvas, while status information, menus, and other UI elements will all be shown on the little "memory display" mounted beneath it. The ESP-32 will spend most of its time in deep sleep - or shut off entirely by a toggle-switch on the battery line. Rather than relying (exclusively) on RTC Memory, it will make use of the FRAM to store the current file and page number as well as last page viewed for other recent files, etc. Thus waking from a cold-boot and waking from deep-sleep will behave the same. The two knobs will be positioned one on each side of the little screen. The right-hand knob will usually act as a page turn/page select control (pressing the knob brings up the current page number on the mini-display and you can spin 'forward'/'backward' to the page number you want to jump to, pressing the knob again to select) - most of the time this will be the knob that wakes the controller with a page-up/page-down event for which it will update the display and page-number in FRAM before returning to sleep. The left-hand knob will usually be a file-selection knob: rotate to scan through the names of recently opened documents, press to select, or just press to browse directories and select documents within (information displayed will be from associated "card" files rather than path names in order to display nice title, author, date, etc. information rather than just filenames while browsing the library).

All of the hardware (but for the display on the way) is wired, tested, and working happily together, and I've now been focusing on the software side for the past few days. The native file format will be, essentially, .PDF --- converted by a python script that handles cropping and sizing, converts to 4-bit per pixel and then compresses it via a run-length encoding scheme.
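For anyone wondering what the "convert to 4-bit and run-length encode" step might look like, here's a minimal pure-Python sketch. It assumes the page has already been rendered, cropped and scaled to a flat list of 8-bit grey values; the function names and the simple (run length, value) byte-pair scheme are my own illustration, not necessarily what the real converter script does.

```python
def quantize_and_pack(gray, width):
    """Reduce 8-bit grey values to 4 bits and pack two pixels per byte."""
    if width % 2 or len(gray) % width:
        raise ValueError("expected whole rows of even width")
    packed = bytearray()
    for i in range(0, len(gray), 2):
        hi = gray[i] >> 4          # keep the top 4 bits: 256 shades -> 16
        lo = gray[i + 1] >> 4
        packed.append((hi << 4) | lo)
    return bytes(packed)


def rle_encode(packed):
    """Encode bytes as (run_length, value) pairs, run_length in 1..255."""
    out = bytearray()
    i = 0
    while i < len(packed):
        value = packed[i]
        run = 1
        while i + run < len(packed) and packed[i + run] == value and run < 255:
            run += 1
        out += bytes((run, value))
        i += run
    return bytes(out)


# Tiny 4x3 "page": two white rows followed by one black row.
page = [255] * 8 + [0] * 4
print(rle_encode(quantize_and_pack(page, 4)).hex())
```

Long runs of white paper collapse to a couple of bytes this way, which is why text pages shrink so much more than photos or covers.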

The file stored on the SD will consist of a header block consisting of a null-terminated list of byte sizes for pages followed by the non-terminated concatenation of compressed pages - on first loading, the reader scans the header, adding up the sizes until it has counted to the page number it wants, and then scans on to the end of the index to find the base. Adding the size total to the base gives the offset to seek to in order to begin reading the compressed page -- ALL pages have the same width and height (1404x1872), and so the run-length decoder just scans and fills in pixels in the display buffer until it reaches and fills buffer[1404*1872] - no end of page terminator and also, I suppose, no fault tolerance in case of a cosmic ray flipping a bit somewhere, but it will both offer good-enough compression (should be able to fit a few thousand books or a few tens or hundreds of thousands of academic papers on the 32GB SD card I'm using), and extremely simple decoding that is easy to implement for the little micro-controller and with minimal overhead.
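The lookup and decode described above can be sketched in a few lines of Python (the real thing would be C on the micro-controller). Two details are assumptions of mine because the description doesn't pin them down: each size in the header is taken to be a 4-byte little-endian unsigned integer with a zero entry as the terminator, and each compressed page is taken to be a sequence of (run length, value) byte pairs. The names locate_page and decode_page are purely illustrative.

```python
import struct

SIZE_FMT = "<I"                        # assumed: 4-byte little-endian sizes
SIZE_LEN = struct.calcsize(SIZE_FMT)


def locate_page(blob, page):
    """Return (offset, size) of a 0-based page inside the container bytes."""
    pos = 0
    skipped = 0                        # total size of the pages before ours
    wanted = None
    index = 0
    while True:
        (size,) = struct.unpack_from(SIZE_FMT, blob, pos)
        pos += SIZE_LEN
        if size == 0:                  # terminator: page data starts at pos
            break
        if index < page:
            skipped += size
        elif index == page:
            wanted = size
        index += 1
    if wanted is None:
        raise IndexError("page out of range")
    return pos + skipped, wanted       # base + running total = seek offset


def decode_page(blob, offset, n_bytes):
    """Expand (run, value) pairs until the page buffer is full."""
    buf = bytearray(n_bytes)
    filled = 0
    while filled < n_bytes:
        run, value = blob[offset], blob[offset + 1]
        offset += 2
        run = min(run, n_bytes - filled)   # never write past the buffer
        buf[filled:filled + run] = bytes([value]) * run
        filled += run
    return bytes(buf)


# Two tiny fake pages of 4 bytes each, just to exercise the index.
pages = [bytes([3, 0xAA, 1, 0x0F]), bytes([4, 0x00])]
blob = struct.pack("<III", len(pages[0]), len(pages[1]), 0) + b"".join(pages)
offset, size = locate_page(blob, 1)
print(offset, size, decode_page(blob, offset, 4).hex())
```

Since the decoder stops purely on "buffer full", a corrupted run length shifts everything after it on that page, but the damage can't spill into other pages because each page is found from the index rather than from the previous page's end.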

A serial shell will both allow for file transfer and management using Python scripts on my laptop, and will offer real-time display commands -- it will easily act as an attached USB e-paper display that scripts can use for real-time information displays, as a target for plots sent from a Jupyter notebook, to display a "NOTES" file that can be edited and updated as an extra memory besides paper and pencil while working on stuff... etc.. etc.

In my initial testing, I'm VERY pleased with the battery life with the esp32 running continuously, updating the display with encoder pin states, voltage, etc. on the little 1200mAh battery I got just to test/prototype with... I'd planned to order the largest battery that would fit in the case whenever I get to the case-building point, but my test code ran for well over six hours bringing the voltage from ~4.16 down to ~3.86 without ever going into sleep... it's a 3.7V battery... so with the device in deep sleep (or off) most of the time between briefly waking to turn the page, the battery I've got should actually provide days if not weeks of charge before it reaches a conservative low cut-off like 3V... and... shoot... it weighs nothing... and since I was chicken and paid more for the flexible screen because I had images of the glass e-ink panels shattering in eventual drops, the end build should be remarkably light which is a nice surprise since that wasn't a feature I was trying to optimize for.

Just 'finished' a tool that others mixing micro-controllers and ePaper might find useful:
GitHub - stevenaleach/PDFto4BC: A small command line tool and Python library for scaling, cropping, and converting .PDF documents to a 16-shade (e-ink targeted) pixmap container format at target screen resolution along with metadata and plain-text that is designed to be easily handled by microcontrollers. See: https://github.com/stevenaleach/PDFto4BC/blob/main/4BC.ipynb
It's a Python tool to convert .PDF files into a form more friendly to low power low memory devices. Pages are run-length encoded pixel-pairs (two pixels per byte) and the compression seems to average at about 14% of what the raw (1 byte per pixel) pixmaps would take up. I'm converting to 1404x1872 because that's the size of the Waveshare display I'm waiting on. The first full book I converted was Steven Levy's 'Hackers' - 520 pages that would occupy 1.366 GB as uncompressed pixmaps or 683.3MB as packed pixels, two per byte -- the run-length compression brings this down to ~170MB with no cropping applied - 215MB with nice cropping to nearly eliminate margins. Not too bad - at that ratio I can store nearly 2,500 pages per GB - and some texts compress better - I've been testing on several .PDFs of Creative Commons licensed books, and when I ran the file I have for Charles Stross's book 'Accelerando' (819 pages), it came out at 186MB -- or about 4,400 pages per GB.... fonts matter... text with thicker and softer fonts will compress less than those with thin and sharp edged fonts... and of course images don't compress well (the cover of Hackers, for instance squeezes down to about 34% of the 'raw' size while the pages are about 14%).
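The storage figures above are easy to re-derive, so here's the arithmetic as a short Python snippet. It only uses the numbers quoted in the post (1404x1872 pages, the page counts, and the compressed sizes in MB); the helper names are just for illustration and 1 GB is taken as 1000 MB.

```python
WIDTH, HEIGHT = 1404, 1872


def sizes_mb(pages):
    """Raw (1 byte per pixel) and packed (2 pixels per byte) sizes in MB."""
    raw = pages * WIDTH * HEIGHT
    return raw / 1e6, raw / 2 / 1e6


def pages_per_gb(pages, compressed_mb):
    """Pages that fit in 1 GB (1000 MB) at the measured compressed size."""
    return pages / compressed_mb * 1000


raw_mb, packed_mb = sizes_mb(520)                  # 'Hackers', 520 pages
print(round(raw_mb, 1), round(packed_mb, 1))       # raw and packed MB
print(round(170 / raw_mb, 3), round(215 / raw_mb, 3))   # uncropped / cropped ratio
print(round(pages_per_gb(520, 215)))               # 'Hackers', cropped
print(round(pages_per_gb(819, 186)))               # 'Accelerando'
```

That works out to roughly 2,400 pages per GB for the cropped 'Hackers' and roughly 4,400 for 'Accelerando', with the whole-book ratios landing a little either side of the 14% per-page average.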