Arduino Uno R4 WiFi Web Client SSL Example - HTML/JS Syntax from Web Server

Hello,

To preface, I'm a beginner when it comes to networking-related topics. I tried out the Web Client SSL example here: https://docs.arduino.cc/tutorials/uno-r4-wifi/wifi-examples/

I am trying to understand the HTML/JS syntax that I'm getting back from Google. If I drop it into VScode and save it as a .html to try to "recreate" the webpage, it shows me 1000+ JavaScript syntax errors, and the HTML preview accordingly reflects some formatting issues.

I don't have much experience with HTML/JS at this point, but I hope to learn more in the near future. Can anyone shed some light on what I'm seeing - am I making any mistaken assumptions for how to approach "recreating" the webpage, or perhaps any constraints of the communication in this example which could yield the syntax errors?

Thank you!

Trying to decipher the Google home page is a bad place to start. It gets served about a billion times a week, so it is squished (minified) as small as possible, making it hard to understand. Plus it has dozens of dependencies, including lots of squished JavaScript.

The Web Client SSL example is more about proving that the R4 can contact a web server using the https protocol, which has complications (and benefits) that the plain http protocol does not. Doing so on an R4 is generally easier than on an ESP32, for example. (Don't be confused by the fact that an R4 has an ESP32 to do WiFi -- the R4 uses a different chip as its main CPU.)

You can learn about HTML and JavaScript without regard to Arduinos, and it is easier to do so that way. In addition to any particulars about https, which you don't have to worry about when using a regular desktop (or mobile) browser, the other thing you might do on an Arduino is to "perform HTTP manually" -- in the example, stuff like

    // Make a HTTP request:
    client.println("GET / HTTP/1.1");
    client.println("Host: www.google.com");
    client.println("Connection: close");
    client.println();

-- which is also something the browser does for you. There are HTTP clients that are layered on top of the WiFi client that will do that for you, but that's another bit of code that takes space and you have to learn to use; sometimes it's not necessary.

Sometimes an Arduino is used as a web server, which is common enough on a desktop (or in the cloud); or as a WiFi access point, with a captive portal, which is not common at all.

Do you have a particular project in mind?

Thanks for the reply!

It's certainly not my objective to decipher Google's home page - I 100% agree that learning HTML and JS is a separate activity from Arduino.

My primary objective is simply to confirm that the example project functioned properly and actually retrieved the web page in totality. I was trying to verify this by extracting the HTML portion of the serial monitor output and putting that into an HTML file to preview, however VScode flagged a high volume of syntax errors. I am hoping to understand whether any characters actually got lost in communication somehow (seems less likely), or otherwise if any other particular step of the example project and/or my post-processing would account for the syntax errors (seems more likely). I'm of course highly confident that Google works properly as-is!

To add to my original post, I have captured some of the syntax errors from VScode in the screenshot below - it is expecting ' at the end of line 8, and it also takes issue with all of the , in the subsequent lines. I tried to compare with Google's HTML on my desktop using Chrome's developer tools, but it looked fundamentally different - if I'm perhaps getting a mobile version of the webpage over the ESP32, I haven't yet found anywhere to view the mobile source on a browser to compare.

Barring any immediate insights from this forum, I will probably be doing more research into how a browser would handle a similar request, for comparison. In the meantime, I'm hoping that someone might have insight into why these syntax errors are present.

I'm not working on a specific project right now - eventually I might try to do some kind of web server application for home automation, but for now I'm just trying to learn some basics about networking and get some experience with the related Arduino libraries.

Thank you!

An update - I removed the code from the Arduino example project which wraps the serial output at 80 columns (adding extra printlns - definitely interfering with the syntax). The HTML preview looks much better now:
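To illustrate why the column wrapping matters, here is a minimal sketch that can be compiled on a desktop (no Arduino needed). It assumes the example's read loop prints an extra newline after every N received characters, as described above; `wrap_every` is a hypothetical helper written for this illustration, not a function from the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: mimics a read loop that emits a newline after every
// `width` characters, regardless of what those characters are.
std::string wrap_every(const std::string& text, std::size_t width) {
    std::string out;
    std::size_t count = 0;
    for (char c : text) {
        out += c;
        ++count;
        if (width != 0 && count % width == 0) {
            out += '\n';  // the break lands wherever the count says, even mid-token
        }
    }
    return out;
}
```

For instance, `wrap_every("var s='hello world';", 10)` puts a line break between `hel` and `lo world`, in the middle of the quoted string. A raw line break inside a single-quoted JavaScript string is a syntax error, which would be consistent with the editor expecting a closing ' at the end of a line.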

There are still however 41 errors logged in VScode, mostly for css identifier/term/property value/{ expected. I may have to do a little more reading to try to understand the impact of those errors (are they "real"?). I'll share the corresponding html and error log info in text files here in case anyone is interested in taking a look.

putty_output2_txtfile.txt (52.9 KB)
ErrorLog_putty_output2.txt (12.9 KB)

putty_output2_txtfile.txt: Line 4, near character 1970

DXImageTransform.Microsoft.Blur(pixelradius=5);*opacity:1;*top:-2px;*left:-5px;*right:5px;*bottom:4px;

*opacity, *top, etc. do not appear to be legal identifiers, but maybe this works on MS browsers. Why google.com returns MS-specific HTML when the HTTP request is from an Arduino is a mystery to me. Unfortunately, big tech servers may return different HTML depending on the requesting browser, OS, source IP, etc.

For example, when connecting from Firefox running on Ubuntu in the US, the Google page returned does not include the Microsoft.Blur stuff. In Firefox, File | Save Page As saves the HTML to a file named Google.html. There are other differences as well.

That's an interesting observation, thanks for pointing it out! From what I can see online, that DXImageTransform.Microsoft is also deprecated as of IE9, hmm...

Some of the other CSS errors don't seem to have any clear browser-specific content in the immediate vicinity, at first glance. I guess I might have some reading to do if there are more specific HTML compatibility scenarios to consider, and whether my error checking in VScode could perhaps not be aligned with any alternate rules that Google could be using.

An easy way to download the HTML response is with telnet (on Linux and Mac, or with WSL on Windows):

    $ telnet www.google.com 80 > google.html

You won't see anything after pressing Enter, because the output is redirected to the file. Type the HTTP request. One trick -- which also works when performing HTTP with Arduino -- is to use HTTP 1.0. You don't have to manually close the connection -- 1.0 does not keep the connection alive unless you ask for it -- and you don't need a Host header. There's also no Transfer-Encoding: chunked for the response, so no chunk sizes. Just the request-line followed by a blank line:

    GET / HTTP/1.0

You should get "Connection closed by foreign host." and the shell prompt. Edit the file and remove the response headers: everything before the <!doctype html> declaration.
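If you would rather not trim the file by hand, the same step can be sketched in a few lines of desktop C++. This assumes the usual HTTP layout, where the headers end at the first blank line (CRLF CRLF); `strip_headers` is a hypothetical helper name, not part of any library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the body of a raw HTTP response, i.e.
// everything after the first blank line (CRLF CRLF) that ends the headers.
// If no blank line is found, the input is returned unchanged.
std::string strip_headers(const std::string& response) {
    const std::string separator = "\r\n\r\n";
    const std::size_t pos = response.find(separator);
    if (pos == std::string::npos) {
        return response;
    }
    return response.substr(pos + separator.size());
}
```

Given the status line and headers followed by a blank line and the page, this returns only the part starting at the doctype. It does not decode a chunked body, so it fits the HTTP 1.0 case above, where the response has no chunk sizes.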

I get 55,070 characters, but only 59 lines. Doing a Format Document in VSCode yields 1,679 lines. All the errors are CSS errors. Property names starting with an asterisk are apparently an IE7 hack. I gave up trying to find an explanation for the slash before semicolons for property values.

It's common to have a single set of HTML and CSS with all the browser-specific hacks. Each browser would honor its own stuff and ignore the others. (Note that the request had no User-Agent header.) Browsers are better now, so don't follow those 20-year-old hacks. Stick to standard stuff.