Downloading Forum Web Pages

Has anyone successfully downloaded a forum web page? Short topic pages with just a few replies download completely, but much longer topics (many replies) do not download completely most of the time. For example, the page https://forum.arduino.cc/t/sending-multiple-variable-array-over-rf-link-nrf24l01/405978 will not download completely, regardless of the download method (e.g. File > Save Page As with any of the three options) or the browser I use: Firefox (my default), Vivaldi, Chrome and even Edge. By "most of the time", I mean that I have managed to download some long web pages after much playing around, but the above topic remains stubbornly non-downloadable.

If I am at the top of the page when I download, the top of the page is saved, but a few down-arrow presses further on, the remainder of the page is missing (the final footer is saved). On the other hand, scrolling to the very bottom of a topic page and downloading results in only the bottom replies being saved (again, with header and footer). To me it appears that the pages are "invisibly" divided into a number of separate pages. In fact, in one instance only, I scrolled down a page and saw something similar to a "display next page" link, but scrolling a little further immediately forced the next part of the page to display.

I've tried many different download add-ons in the various browsers but they all fail. I have also removed many of my security add-ons but pages downloaded are still truncated. Any suggestions?

I tried Ctrl+P under Windows 10 and printed to PDF. It seems to work in both Chrome and Firefox; I can't remember for sure whether it works 100% reliably, but I did manage to get the topic.

I understand that it's not quite the same as a web page, but it might be an option.

Thanks for the possible solution; I'll see if I can get a PDF version of a printed page. At least I'll be able to save that for later easy reference in my database of "all things Arduino bits and pieces".

Just tried this option on the page I mentioned and the result was the same as attempting to download the page: I got a few pages of the topic, then multiple empty pages.

Damn frustrating.

Apologies, Ctrl+P does work; I used File > Print when I tested the print solution for my previous post. Unfortunately, code boxes aren't expanded when using Ctrl+P. Is there a way to expand them before pressing Ctrl+P?

You can right-click a code block and select Inspect, then in the Styles panel, untick the checkbox in front of max-height: 500px in the pre code rule to disable the height restriction.
Finally, use your browser's Print option to print it to PDF. (Don't use Ctrl+P: the forum seems to hijack that shortcut and open a new window that doesn't have your style change.)
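
If you would rather not repeat the Inspect step on every page, the same height limit can in principle be lifted in a copy that has already been saved to disk. This is a different route from the DevTools tip above and only a minimal sketch: it assumes the saved HTML still has a </head> tag and that the limit comes from the pre code rule mentioned above. The function name and the override style are made up for illustration, and it has not been tested against the forum.

```python
import re

# Hypothetical override: assumes the height limit is set by a "pre code" rule,
# as described in the post above.
OVERRIDE_STYLE = "<style>pre code { max-height: none !important; }</style>"


def lift_code_height_limit(html: str) -> str:
    """Insert the override style just before the first </head> tag, if any."""
    match = re.search(r"</head\s*>", html, flags=re.IGNORECASE)
    if match is None:
        # No head section found; return the document unchanged.
        return html
    return html[:match.start()] + OVERRIDE_STYLE + html[match.start():]
```

To use it, read the saved .html file into a string, pass it through lift_code_height_limit, and write the result back out before opening or printing the file.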

PieterP, thanks for the tip. It certainly expands the code blocks to full size, and the scroll bar disappears. When the page is printed to PDF, however, some lines are split across two pages (a much-discussed issue with web page printing). Pages saved via Ctrl+P don't suffer from this line-splitting issue.

After applying PieterP's tip to a page, I saved it via File > Save Page As, and the saved page displayed fully upon reloading (this also worked with a number of the add-ons that save pages). Unfortunately, my joy was short-lived, as another test page didn't save fully. I tried the tip on several other pages and the same problem as discussed in my original post occurred.

The moral is that it seems impossible to get a clean save of a page with any single method.

I have, however, arrived at a method that requires two different saves of the same page; the two saves can then be used together to read the complete article. This has worked on a range of pages (I can't guarantee it will work on all pages, but so far, so good). First, save the page via the forum's print view. Rather than pressing Ctrl+P, go to the address bar and append "/print" to the end of the page address (e.g. "https://forum.arduino.cc/t/downloading-forum-web-pages/889774/4" becomes "https://forum.arduino.cc/t/downloading-forum-web-pages/889774/4/print"). This immediately brings up the print dialogue. Save the page to a PDF. Once the print dialogue closes, you are left with a copy of the page with all text formatting removed. Now save this page using a browser add-on or with File > Save Page As. The PDF copy keeps the text formatting and is good for reading, but its code blocks are truncated and its hyperlinks are not accessible. The other saved copy has no text formatting, but the scroll bars on the code blocks work, the code can be copied, and all hyperlinks work. The two saves together let you do things that neither can do alone (mainly, find a hyperlink in the PDF save, then follow it to a new page from the second save).
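
For anyone saving more than a handful of topics, building the "/print" address can be scripted. The following is a small sketch of just that step, not of the saving itself; the function name print_url is made up for illustration, and dropping any #fragment from the address is my own simplifying assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def print_url(topic_url: str) -> str:
    """Return the topic address with "/print" appended to its path."""
    parts = urlsplit(topic_url)
    # Avoid a double slash if the address was copied with a trailing "/".
    path = parts.path.rstrip("/")
    if not path.endswith("/print"):
        path += "/print"
    # Assumption: any "#fragment" is dropped; a query string is kept as-is.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))
```

Given the address from the example above, this returns the same address ending in "/889774/4/print", which can then be pasted into the address bar.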

Complicated, but at least you can save a page. The bummer is that the forum software gets upset if you save too many pages in succession and blocks further saves for some prescribed time. I don't know the exact trigger for this block (the number of page prints over some time interval) or how long a timeout you are forced to endure.
