Topic: openGLCD and extended ASCII Danish letters

skovholm

Hi guys

I'm trying to write a string to the graphic display using the openGLCD library:

char aString[] = "Æblegrød";
for (int i = 0; i < sizeof(aString) - 1; i++){
   unsigned char tegn = aString;
   GLCD.write(tegn);
}

The font I use contains the extended ASCII codes 197, 198 and 216, which are the Danish Å, Æ and Ø...

BUT I cannot figure out how to get the display to show these extended letters???

It seems like an extended letter consists of more than one byte????

Who has a good idea?

Hjalmar

skovholm

Hi

#include <openGLCD.h>

void setup() {
  GLCD.Init();
  GLCD.SelectFont(hjh);
}

void loop() {
  for (int i = 1; i < 256; i++) {
    GLCD.write(i);  /// works perfect ;o)
  }
  delay(1500);

  char aString[] = "Æøå";
  for (int i = 0; i < sizeof(aString) - 1; i++) {
    unsigned char tegn = aString;
    GLCD.write(tegn);  /// make some strange letters
    delay(1500);
  }
}

Here you can see a code snippet / a test program... why does the first loop work but not the second?

Hjalmar

bperrybap

I can't really try it since I don't have your font.
But I'm guessing the problem is here:

Code: [Select]
unsigned char tegn = aString;

This is assigning the lower 8 bits of the aString[] array's base address
to the variable tegn. Then, just below it, you are trying to
print those 8 bits as a character.

What you probably meant to do was index into the array and pull out the character:
Code: [Select]
unsigned char tegn = aString[i];
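
For completeness, here is the whole loop with that one-line fix applied (only the indexing is changed; it still sends the string's raw bytes one at a time):

Code: [Select]
char aString[] = "Æblegrød";
for (int i = 0; i < sizeof(aString) - 1; i++) {
    unsigned char tegn = aString[i]; // take the i'th byte of the string
    GLCD.write(tegn);
}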

But rather than looping and pulling the characters out one by one, why not use the print() function?
Was it not working?
i.e.
Code: [Select]
GLCD.print(aString);
or directly as a literal:
Code: [Select]
GLCD.print("Æøå");
or you can print the individual characters:
Code: [Select]
GLCD.print('Æ');

--- bill


skovholm

Hi Bill

Thanks for your reply.

I will test your suggestions in 5 hours, but I am sure it will not work.

I've attached my font file so you can test ;o)

I really appreciate that you're helping...

Hjalmar

skovholm

Hi Bill

Now tested, and

char aString[] = "Æa";
for (int i = 0; i < sizeof(aString) - 1; i++){
   GLCD.print(aString);
   GLCD.print("Æøå");
   GLCD.print('Æ');
   delay(500);
}

doesn't work at all... none of the letters show up on the display... except the a in the string.

Please try loading the font and running the test yourself... I think it is something in your wonderful library ;o)

Hjalmar

bperrybap

I haven't tried the font yet. I'll try it later today when I have some time.
It looks like there is some sort of character encoding issue.
I'll have to read up on UTF-8 encoding and how it works in gcc to see if there is
something I can do in the library to support non-ASCII character encodings or
whether the compiler alters the data at compile time.

Here are a few things I found by searching:
http://www.forward.com.au/pfod/ArduinoProgramming/Languages/index.html
http://www.visualmicro.com/page/User-Guide.aspx?doc=Non-ASCII.html

I've never worked with non-ASCII encoded characters, so
I'll need to dig much deeper to understand how it works and see
if it is possible to add some code to the library to support it.

--- bill

skovholm

Hi Bill

Sounds great that you're looking into it...

There has to be a way, because GLCD.write(197); works fine...

I do understand that everything above ASCII 128 uses 2 bytes, and that's part of the problem...

UTF-8 will work because it also has the Danish letters ;o)

Hope you can fix it in the library...

I will continue to develop my application using your great library.

Have a great day over there

Hjalmar

bperrybap

The library code always uses a single-byte index to look up the character glyph.
UTF8 encoding actually supports thousands of characters using multi-byte sequences of 8-bit values.
Codes above 0x7f use at least one extra byte, but a sequence can be up to 4 bytes depending on the size
of the code.

I have added support for UTF8 character decoding.
It is a library configuration option and will be disabled by default for full backward compatibility.
(but I might consider the other way around depending on user feedback)

The issue is that you can't really do both UTF8 decoding and support raw character codes above 0x7f.
Also, character codes beyond 255 are not supported by the current font format, and that is a
very big deal to change since it affects so many things.

Since codes larger than 255 can't currently be supported, as a "feature" when UTF8 decoding is enabled
you can still send raw character codes between 0x80 and 0xff.
The only limitation is that you can't send the raw codes
0xc2 and 0xc3, as those are used as UTF8 encoding start markers and will trigger a UTF8 two-byte decode sequence.
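
To make the decode behavior concrete, here is a minimal sketch of the two-byte decode logic as just described -- this is not the library's actual code, only an illustration of the state machine involved:

Code: [Select]
// Sketch only: decode a stream of bytes where 0xc2/0xc3 start a
// two-byte UTF8 sequence and everything else is passed through raw.
static uint8_t utf8_lead = 0;      // saved lead byte, 0 = none pending

int decodeByte(uint8_t b)          // returns char code, or -1 if waiting
{
    if (utf8_lead) {
        // fold lead-byte bits and continuation-byte bits back together:
        // 0xc3 0x86 -> ((0x03 << 6) | 0x06) = 0xc6 ('Æ')
        uint8_t code = ((utf8_lead & 0x1f) << 6) | (b & 0x3f);
        utf8_lead = 0;
        return code;
    }
    if (b == 0xc2 || b == 0xc3) {  // UTF8 start marker for codes 0x80-0xff
        utf8_lead = b;
        return -1;                 // need the continuation byte first
    }
    return b;                      // ASCII or raw 0x80-0xff code
}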

I have it working; however, I have found a font rendering issue that needs to be corrected.
The font you created is a variable-width font that is 15 pixels high, and for some reason the font rendering code gets confused
by this height when the characters land on y pixel values that are not multiples of 8.
The rendering code works for heights of 14 and 16, and for a fixed-width font with a height of 15 as well.
I'll have to track down what is happening with that font.
It may be a few days.

If you want to see the rendering issue, you can try this:
Code: [Select]
GLCD.CursorToXY(1,1);
GLCD.print("Hello");


Normal text row processing seems to render OK, since for a font of this size
it will land on an 8-pixel boundary.

I have committed the code to the openGLCD bitbucket git repo but have not made a downloadable release.
If you want to play with it in the meantime, you can grab the code from the master branch in the repo.


--- bill

bperrybap

I've published a new release that should work for you.
It adds UTF8 decoding and fixes the rendering issue.
You will have to turn on UTF8 decoding in the openGLCD config file.
The repo wiki and the distributions now contain a changelog.

Let me know if you have any issues.

--- bill

skovholm

Hi Bill

Now I installed the new version, but it did not help me...

Did you test with my font file, or did you make a UTF-8 font file? If yes, how did you make it?

Sorry to disturb you so much ;o)

Hjalmar

bperrybap

I used your exact font file and it works for me.
There is one issue with respect to WriteString() but it only relates to alignment.
(glcdfmt_right is broken for utf8 character strings - it will be fixed soon)
This issue will not affect normal printing.

What exactly were you doing and what wasn't working?

Keep in mind that there are several limitations.
You will be able to print literal strings that contain UTF8 characters,
e.g. "Æøå"
You will not be able to print literal UTF8 characters directly,
e.g. 'Æ'
You will not be able to index into a string, or treat it as an array of characters, if it contains
any UTF8 characters.


This is because of the way the compiler and some of the other libraries work.
The compiler will use the multi-byte encoding for UTF8 characters.
The rule for UTF8 is that anything above 0x7f gets multi-byte encoded.
For a string, this results in there being more actual bytes in the string data than characters.
This works as long as the string-processing routines understand UTF8 decoding.
The code I've added to openGLCD will process the UTF8 decoding, so it will print the characters
correctly as long as openGLCD receives all the utf8 data.
Libraries like the Print class ignore the actual data in strings, so openGLCD gets the full string and it all works.
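
As a concrete illustration of the byte counts (these follow from UTF8 itself, not from anything openGLCD does): each of Æ, ø and å encodes as two bytes, so a 3-character string literal occupies 6 bytes plus the terminating NUL.

Code: [Select]
char aString[] = "Æøå";
// sizeof(aString) is 7 here (3 two-byte characters + NUL), not 4
GLCD.print(aString);  // works: Print hands all 6 data bytes to openGLCD,
                      // whose UTF8 decoder reassembles the 3 characters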

For things like a literal character, the compiler creates a 16-bit "character" value as an integer
instead of an 8-bit character value.
This confuses the Print class library, since it assumes nothing but 8-bit characters and
will interpret the 16-bit integer value as a number rather than a character.
So if you call something like
GLCD.print('Æ');
the 'Æ' has a UTF8 value of 0xc386, which is what the compiler passes to the Print class print() function as an integer rather
than a character.
The Print class will assume you are trying to print a 16-bit number and will convert it to an ASCII string of digit characters,
which it hands character by character down to openGLCD.
openGLCD never sees the original 16-bit data.
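
Illustrating the above (the decimal value is just 0xc386 converted; this is gcc's multi-character constant behavior, not anything openGLCD does):

Code: [Select]
GLCD.print('Æ');   // compiler passes the int 0xc386 (50054 decimal), so
                   // the Print class draws the digits "50054"
GLCD.write(0xc6);  // sends the single decoded character code -> shows Æ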

If you use the xxprintf() code as an attempt to skip over the Print class integer processing, by
using xxprintf() to create a string and then handing the string to openGLCD, it doesn't
work either.
So something like
GLCD.Printf("%c", 'Æ');
ends up printing the wrong character, because the xxprintf() code in the AVR library also does not understand
UTF8 encodings; while all 16 bits (0xc386) get handed to the xxprintf() code, it throws away the upper 8 bits and prints
the wrong character code (0x86) instead of the proper decoded character code (0xc6).

You also can't cast the value to try to force it to be treated as a character rather than an integer.
i.e.
GLCD.print((char)'Æ');
While that will cause it to be treated as a character, only the lower 8 bits will be used, and so
0x86 is handed down to openGLCD rather than the true decoded character code of 0xc6.


You also have to be careful if you attempt to assign the literal to a char type like:
char c = 'Æ';
as that will only grab the lower 8 bits of the 16-bit UTF8 code.
So the character value in that case would be 0x86 instead of 0xc6.

There isn't much else I can do in the library other than offer a function or macro that does the
UTF8 character decoding for convenience.

i.e. something like:

char c = GLCDUTF8('Æ');
or
char c = GLCD.utf8CharDecode('Æ');
GLCD.print(c);

You could also then do something like
GLCD.print(GLCDUTF8('Æ'));
Of course you can still send the actual character code value:
GLCD.write(0xc6); // print Æ
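
Neither helper exists in the library yet, but as a sketch of what such a convenience macro could look like for the two-byte range (GLCDUTF8 is only the proposed name from above, not a shipping macro):

Code: [Select]
// Hypothetical sketch: fold a two-byte UTF8 literal (lead byte 0xc2 or
// 0xc3) back into its 8-bit character code; pass anything else through.
#define GLCDUTF8(c) \
    ((((unsigned int)(c) >> 8) == 0xc2 || ((unsigned int)(c) >> 8) == 0xc3) ? \
     (char)(((((unsigned int)(c) >> 8) & 0x1f) << 6) | ((unsigned int)(c) & 0x3f)) : \
     (char)(c))

char c = GLCDUTF8('Æ');  // folds 0xc386 back down to 0xc6
GLCD.print(c);           // prints Æ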


I can't redefine the write() function in openGLCD to take a 16-bit value instead of an 8-bit value,
which would then allow either 8-bit or 16-bit characters so that you could do:
GLCD.write('Æ');
because the write() function is virtual and is declared as taking an 8-bit value
in the Print class that comes with the IDE.

I could add a writeUTF8() function?
GLCD.writeUTF8('Æ');

All the solutions are pretty ugly.

I think there are some gcc options to alter how the compiler treats extended character sets.
I'm guessing those will also be a mixed bag, in that while they may solve a few issues, they will
probably create others.
And then there is still the issue of the interface inside Print using unsigned 8-bit values,
so you can't ever make it fully transparent and "just work" for all cases.


--- bill



bperrybap

So I think this is where I'm going to go:
I'll update the library to add a
writeUTF8() function that can be used to send utf8 characters.
I'll also update the PutChar() function to transparently support utf8 characters as well.

These will be enabled when UTF8 support is turned on in the openGLCD config file.
These will be in addition to supporting UTF8 characters in C strings with the limitations as described earlier.

This will then allow things like:
GLCD.writeUTF8('Æ');
and
GLCD.PutChar('Æ');
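
Treating writeUTF8() and the updated PutChar() as the planned interface described above (not yet the released one), a test sketch could then look something like this:

Code: [Select]
#include <openGLCD.h>

void setup() {
  GLCD.Init();
  GLCD.SelectFont(hjh);   // the user's font containing the Danish glyphs
}

void loop() {
  GLCD.print("Æøå");      // UTF8 string literals already decode
  GLCD.writeUTF8('Æ');    // planned: decode + write a UTF8 char literal
  GLCD.PutChar('Æ');      // planned: PutChar handles UTF8 transparently
  delay(1500);
}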

That is about the best that can be done.

--- bill



Light83

Hello,
thank you for your library.
Can you please help me print 2-byte UTF-8 characters? The characters are located starting from index 1000.
