
Topic: Printing and understanding international chars

Abfahrt

I'm trying to make a simple sketch that prints international (more specifically Greek) characters to the Serial port. Later I'll modify it to print Greek chars to an LED matrix. Is that even possible from the hardware perspective?

The sketch below doesn't work. Is there any known workaround? Do AVRs only support ASCII?

Code:

void setup() {
  Serial.begin(9600);
}

void loop() {
  // Four Greek letters followed by an ASCII '!'
  Serial.println("γεια!");
}

Coding Badly

Quote
I'm trying to make a simple sketch that prints international (more specifically Greek) characters to the Serial port. Later I'll modify it to print Greek chars to an LED matrix. Is that even possible from the hardware perspective?

Hard to say without knowing more about your LED matrix.  If it's something you've built then you can print anything you'd like on it.

Quote
The sketch below doesn't work.

In what way?

Quote
Do AVRs only support ASCII?

The AVR compiler supports eight-bit characters.  Traditionally, the first 128 values (0 to 127) are ASCII.  In your example, it's up to the receiver to decide how those eight bits are interpreted.

Bear in mind that the "ε" you included in your post is a Unicode character, which does not fit in a single byte.  With a UTF-8 source file, the compiler will store the "ε" in two adjacent bytes.  Serial.println("ε") will send those two bytes to the PC.  Most terminal applications (like Serial Monitor) will display each byte as a single character; eight bytes of "garbage" will be displayed instead of four Greek characters.
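To make the byte count concrete, here is a minimal desktop C++ sketch (not Arduino code) that dumps a string literal byte by byte. The helper name hexDump is made up for illustration, and the Greek letters are written as explicit UTF-8 byte escapes so the result does not depend on how the editor saves the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// The four Greek letters plus '!' written as explicit UTF-8 bytes, so the
// literal is the same no matter how the source file is encoded.
static const char kGreeting[] = "\xCE\xB3\xCE\xB5\xCE\xB9\xCE\xB1!";

// Hypothetical helper: returns the bytes of s as space-separated hex.
std::string hexDump(const char *s) {
  assert(s != nullptr);
  std::string out;
  char buf[4];
  for (std::size_t i = 0; s[i] != '\0'; ++i) {
    std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned char>(s[i]));
    if (!out.empty()) out += ' ';
    out += buf;
  }
  return out;
}
```

Each Greek letter shows up as a pair of bytes starting with CE, which is why a terminal that treats every byte as one character shows eight odd symbols before the '!'.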

robtillaart

#2
Nov 29, 2010, 07:43 pm Last Edit: Nov 29, 2010, 07:53 pm by robtillaart Reason: 1
The serial port of the Arduino can only send bytes (values 0..255) over the line. The receiving application interprets these byte values, typically as ASCII, but you are free to use another interpretation, including Greek characters. The Arduino does not know; it just sends bytes.

It's similar to writing a Word document, selecting all, and changing the font to Greek. The (underlying) bytes won't change, but the interpretation and visualization do.

Several LCD screens have the option to define your own character set.  If there are enough free definable chars in the LCD, you could define the whole Greek alphabet.
If you are sending the data to a PC/Mac, it is up to the receiving app to translate the bytes to Greek.

A workaround I can think of is to place the Unicode (Greek subset) characters in EEPROM and overload the print statement in such a way that instead of [byte] it will send EEPROM[2*byte] and EEPROM[2*byte+1]. However, the receiving app must expect and understand Unicode. As there are 512 bytes in EEPROM, there is just enough room for all 256 two-byte entries.
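The lookup-table idea can be sketched in plain C++ as follows. This is only an illustration: a 512-byte array stands in for the EEPROM, the names gTable, storeEntry and expand are made up, and the choice of what the two bytes contain (here, a big-endian 16-bit code unit in the usage example) is an assumption about what the receiver expects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Stand-in for the 512-byte EEPROM: two bytes per single-byte code.
// On a real board the table would live in EEPROM instead of RAM.
static uint8_t gTable[512];

// Hypothetical setup step: store the two-byte form of a character
// in the slot that belongs to the single-byte value `code`.
void storeEntry(uint8_t code, uint8_t first, uint8_t second) {
  gTable[2 * code] = first;
  gTable[2 * code + 1] = second;
}

// Expands n single-byte codes into the two-byte sequences that would be
// written to the serial port.
std::string expand(const uint8_t *codes, std::size_t n) {
  assert(codes != nullptr || n == 0);
  std::string out;
  for (std::size_t i = 0; i < n; ++i) {
    out += static_cast<char>(gTable[2 * codes[i]]);
    out += static_cast<char>(gTable[2 * codes[i] + 1]);
  }
  return out;
}
```

For example, after storeEntry(0xE1, 0x03, 0xB1), the single byte 0xE1 is sent as 0x03 0xB1 (alpha, U+03B1). Note that every byte doubles in size this way, including plain ASCII.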

Another workaround is to define two special characters to shift back and forth between font sets. This technique was already used in Morse code. If a '>' is sent, the next font set is used, and if a '<' is sent, the previous font set is used. The receiving app must understand this protocol, which it probably will only do if you write it yourself.
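On the receiving side, the shift protocol above could look roughly like this in C++. It is a sketch under assumptions: there are exactly two font sets (0 for ASCII, 1 for Greek), shifting past either end is ignored, and the function name decodeShifted is invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Decodes a byte stream in which '>' selects the next font set and '<'
// the previous one. Returns a (fontSet, byte) pair for every byte that
// should actually be drawn; the shift characters themselves are dropped.
std::vector<std::pair<int, uint8_t>> decodeShifted(const uint8_t *data,
                                                   std::size_t n) {
  assert(data != nullptr || n == 0);
  const int kFontSets = 2;  // assumption: 0 = ASCII, 1 = Greek
  int current = 0;
  std::vector<std::pair<int, uint8_t>> out;
  for (std::size_t i = 0; i < n; ++i) {
    if (data[i] == '>') {
      if (current < kFontSets - 1) ++current;
    } else if (data[i] == '<') {
      if (current > 0) --current;
    } else {
      out.push_back({current, data[i]});
    }
  }
  return out;
}
```

One consequence of this design is that a literal '>' or '<' can no longer be sent as text unless an escape rule is added.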

Hope this helps.

Rob Tillaart

Nederlandse sectie - http://arduino.cc/forum/index.php/board,77.0.html -
(Please do not PM for private consultancy)

Abfahrt

#3
Nov 29, 2010, 07:48 pm Last Edit: Nov 29, 2010, 07:50 pm by giannoug Reason: 1
Quote
Hard to say without knowing more about your LED matrix.  If it's something you've built then you can print anything you'd like on it.


It's a matrix based on the Holtek HT1632. The Arduino controls it. Forgot to mention!

Quote
In what way?

It prints garbage characters, as you said.

Quote
The AVR compiler supports eight-bit characters.  Traditionally, the first 128 values (0 to 127) are ASCII.  In your example, it's up to the receiver to decide how those eight bits are interpreted.


What I want to do is make the Arduino print Greek characters on the matrix. I have a function that reads the "font" from an array and then prints the corresponding character on the display. I want to extend this function to print more characters. The array currently contains only the printable ASCII characters.

This is what came to mind after your reply: I'll make another array containing the Unicode characters and modify the function to use that array when needed. Will that work, considering that Unicode chars use 16 bits?
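A second font array like that could be indexed along these lines. This is a desktop C++ sketch, not the actual display function: it assumes the string arrives as UTF-8, that glyphs 0..94 are printable ASCII (0x20..0x7E), and that a Greek table covering U+0391..U+03C9 follows at index 95. The names glyphIndices, kAsciiGlyphs and kUnknownGlyph are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const int kAsciiGlyphs = 95;   // assumed size of the existing ASCII font
const int kUnknownGlyph = -1;  // returned for anything not covered

// Converts a UTF-8 string into glyph indices. Only one-byte (ASCII) and
// two-byte sequences are handled; every other byte becomes kUnknownGlyph.
std::vector<int> glyphIndices(const char *text) {
  assert(text != nullptr);
  std::vector<int> out;
  for (std::size_t i = 0; text[i] != '\0'; ++i) {
    uint8_t b0 = static_cast<uint8_t>(text[i]);
    uint8_t b1 = static_cast<uint8_t>(text[i + 1]);  // may be the final '\0'
    if (b0 >= 0x20 && b0 <= 0x7E) {
      out.push_back(b0 - 0x20);
    } else if ((b0 & 0xE0) == 0xC0 && (b1 & 0xC0) == 0x80) {
      // Two-byte sequence 110xxxxx 10yyyyyy -> code point xxxxxyyyyyy
      unsigned cp = ((b0 & 0x1Fu) << 6) | (b1 & 0x3Fu);
      ++i;  // skip the continuation byte
      if (cp >= 0x0391 && cp <= 0x03C9) {
        out.push_back(kAsciiGlyphs + static_cast<int>(cp - 0x0391));
      } else {
        out.push_back(kUnknownGlyph);
      }
    } else {
      out.push_back(kUnknownGlyph);
    }
  }
  return out;
}
```

With this layout the display function never sees 16-bit values; it only receives an index into one combined glyph table.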

EDIT:
Quote
The serial port of the Arduino can only send bytes (values 0..255) over the line. The receiving application interprets these byte values, typically as ASCII, but you are free to use another interpretation, including Greek characters. The Arduino does not know; it just sends bytes.

It's similar to writing a Word document, selecting all, and changing the font to Greek. The (underlying) bytes won't change, but the interpretation and visualization do.

Several LCD screens have the option to define your own character set.  If there are enough free definable chars in the LCD, you could define the whole Greek alphabet.
If you are sending the data to a PC/Mac, it is up to the receiving app to translate the bytes to Greek.

Hope this helps.


Thanks! Yes it helped me clear some things :)

VilluV

#4
Nov 29, 2010, 07:58 pm Last Edit: Nov 29, 2010, 07:59 pm by villuv Reason: 1
Unicode, if UTF-8 encoding is used, represents each character using 1..4 bytes.

http://en.wikipedia.org/wiki/UTF-8

But you could try the ISO-8859-7 charset:
http://czyborra.com/charsets/iso8859.html#ISO-8859-7
It uses 8 bits per character and should suit you well if you don't need any other "exotic" language at the same time.
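As a rough illustration of why a single-byte charset is convenient here, the following C++ sketch maps a Unicode code point to its ISO-8859-7 byte. It is deliberately partial: only ASCII and the letters U+0391..U+03CE are covered (they sit at a constant offset of 0x02D0 from their ISO-8859-7 bytes), the punctuation in 0xA0..0xBF is left out, and the function name toIso8859_7 is made up.

```cpp
#include <cassert>
#include <cstdint>

// Maps a Unicode code point to its ISO-8859-7 byte, or returns 0 when
// this sketch does not cover it.
uint8_t toIso8859_7(uint32_t cp) {
  assert(cp <= 0x10FFFF);  // must be a valid Unicode code point
  if (cp < 0x80) return static_cast<uint8_t>(cp);  // ASCII is unchanged
  // U+03A2 is unassigned, and so is the matching byte 0xD2.
  if (cp >= 0x0391 && cp <= 0x03CE && cp != 0x03A2) {
    return static_cast<uint8_t>(cp - 0x02D0);
  }
  return 0;
}
```

With the text stored this way, each character is one byte again, so an existing byte-indexed font function only needs glyphs added for the values 0xC1..0xFE.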

