Hello folks! Good $time.
I’m writing a serial monitor program and my intention is to add options to choose among different encodings.
I’ve spent some time trying to understand what the default Arduino character encoding is. After searching the forum and Google and experimenting with an Arduino, my conclusion is that the information on this matter is generally misleading or imprecise. So I decided to post this topic in the hope of confirming whether some of my conclusions are correct.
Recapping from Arduino pages:
“The char datatype is a signed type, meaning that it encodes numbers from -128 to 127.” https://www.arduino.cc/en/Reference/Char
The Arduino reference pages (Serial.print and others) indicate that the encoding used in sketches and by the Serial Monitor is ASCII. Arduino - ASCIIchart
Sure. From 32 to 126 it really is ASCII, but overall the compiler and Serial Monitor encoding is something else.
If I do
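// int counter: a byte counter wraps at 255 and never reaches 256, so the loop would not end
for (int r = 32; r < 256; r++) Serial.print((char)r);
for (int r = 32; r < 256; r++) Serial.write((byte)r);
for (int r = 32; r < 256; r++) Serial.write((char)r);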
The output in Serial Monitor is strictly what’s expected for cp1252 encoding (Windows-1252).
I think it sounds obvious, but since I didn’t find this information anywhere, my real question is whether the following assumptions are correct:
1- When I cast any byte/int (0 - 255) to (char), what the compiler really does is apply cp1252 encoding.
2- Arduino’s Serial Monitor encoding is configured internally for cp1252.
Are these assumptions correct?
If so, it would be worth updating the reference pages to reflect this.
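For what it’s worth, assumption 1 can be probed without the Serial Monitor at all. The sketch below is plain C++ (not an Arduino sketch), and the function names are made up for illustration. It checks whether casting 0 to 255 to char and back changes any value, and shows how a byte reads when treated as signed, as in the char reference quoted above. If the round trip holds, the cast itself would not be applying any encoding, and the cp1252 look would have to come from whatever displays the bytes.

```cpp
#include <cstdint>

// Hypothetical helper: returns true if every value 0..255 survives a
// cast to char and back to an unsigned byte unchanged.
bool castPreservesBytes() {
    for (int v = 0; v < 256; ++v) {
        char c = static_cast<char>(v);                      // same cast as (char)r
        std::uint8_t back = static_cast<std::uint8_t>(c);   // read the bits back
        if (back != v) return false;
    }
    return true;
}

// Hypothetical helper: the same byte read as a signed 8-bit number,
// which is how a signed char type sees it (range -128 to 127).
int asSignedChar(std::uint8_t b) {
    return static_cast<signed char>(b);
}
```

Both functions could be called from setup() with the results sent through Serial.println, or from main() on a PC.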
All characters in the ASCII range are also represented in ISO-8859-1.
All printable characters in ISO-8859-1 are also represented in cp1252.
“Backward compatible” is not the most correct term, but that is roughly the relationship, and cp1252 is the most complete of the three (in terms of unique character representations).
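For the encoding option in my serial monitor program, the relationship between the three could be sketched roughly as below. This is only a sketch under assumptions: the names Encoding, kCp1252High and decodeByte are made up, and the five bytes cp1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) are passed through as the matching control code, which is one common convention rather than the only one.

```cpp
#include <cstdint>

// Unicode code points for cp1252 bytes 0x80..0x9F. This is the only
// range where cp1252 differs from ISO-8859-1 (which has control codes here).
static const std::uint16_t kCp1252High[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

enum class Encoding { Ascii, Latin1, Cp1252 };

// Decodes one received byte to a Unicode code point.
// Returns U+FFFD (replacement character) for bytes ASCII cannot represent.
std::uint32_t decodeByte(std::uint8_t b, Encoding enc) {
    if (b < 0x80) return b;                       // range shared by all three
    if (enc == Encoding::Ascii) return 0xFFFD;    // ASCII stops at 0x7F
    if (enc == Encoding::Cp1252 && b < 0xA0) {
        return kCp1252High[b - 0x80];             // cp1252-specific block
    }
    return b;  // ISO-8859-1 everywhere; cp1252 agrees for 0xA0..0xFF
}
```

With this layout, byte 0x80 would decode to U+20AC (the euro sign) under cp1252 but to the control code U+0080 under ISO-8859-1, while 0xE9 gives U+00E9 in both.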