troubles with special characters

hello everyone !

I need to use some special characters in my program

when I code :

    Serial.println("ᛉ");

it prints ᛉ in serial with no trouble

But if I code :

  char x = "ᛞ";
  Serial.println(x);

it prints a "space"

  char x = 'ᛞ';
  Serial.println(x);

it prints : ⸮

  unsigned char x = 'ᛞ';
  Serial.print(x);

it prints : 158

  unsigned char x = "ᛞ";
  Serial.print(x);

it prints : 18

well... nothing I tried worked...

there must be a way to do it, probably with the proper data type, or something like "HEX", "BIN" after a number...

If anyone has the solution, that would be great !

What is the ASCII value of the character you are trying to print?

Trying to store a string (that's what double quotes create) in a char is pointless.

ok I found how to do it :

  char* x = "ᛞ";
  Serial.println(x);

If I use single quotes instead of double ones it prints ⸮⸮wm⸮⸮8⸮⸮⸮N⸮⸮⸮;⸮⸮⸮O\⸮{⦅⸮⸮F⸮⸮⸮⸮ו⸮⸮⸮w⸮⸮ϻ⸮⸮⸮⸮=⸮⸮=u⸮⸮r⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮i⸮⸮{⸮⸮3⸮"⸮k⸮⸮⸮⸮⸮?۷9⸮k⸮⸮⸮⸮⸮⸮⸮^⸮[u⸮⸮⸮⸮⸮⸮Ou⸮⸮{⸮⸮⸮⸮⸮jQz⸮ߡ⸮⸮⸮yO⸮⸮~⸮⸮⸮߮⸮⸮.⸮e⸮g⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮{⸮߃֒ﯭ⸮⸮⸮߿⸮-Ǫ/⸮{n)⸮⸮x⸮⸮c}y<_⸮⸮⸮o⸮⸮⸮⸮⸮⸮~;~?⸮⸮ʣ⸮⸮Q⸮{⸮⸮⸮^⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮⸮ݙ⸮⸮⸮\W⸮⸮?^e⸮⸮l⸮⸮-⸮⸮⸮⸮q]⸮⸮⸮7⸮⸮⸮⸮j⸮⸮{⸮⸮⸮v⸮]#

So I wouldn't say it's pointless ;p

It works but I have NO IDEA why !!!

Can someone explain this to me ?
What is a "char*"?
I know * is often used as a pointer, something I never really understood… But it is also a multiplication mark, so maybe it’s something else as well.

So if someone understands and can explain… I hate it when I do working stuff I don't understand… maybe even more than when I fail…

Argh, so frustrating !

So I wouldn't say it's pointless ;p

OK. Pointless may be the wrong term. Let me clarify. It is WRONG. You cannot meaningfully assign a string to a char variable.

Try

   char c = "Hello";
   Serial.print(c);

and see what happens.

I know * is often used as a pointer, something I never really understood..

The only other use it has is as the multiplication operator, which clearly is not the case here.

You will, at some point, need to understand pointers (and pointers to pointers and arrays of pointers).

A char can hold a value between -128 and 127. Why char is signed, when there are no negative values in the ASCII table is a real mystery.

An unsigned char can hold values in the range 0 to 255. So, again, I have to ask what the ASCII value of the character you are trying to print is. Is it in the range 0 to 255?

@OP
(1) The computer, the Serial Monitor, and the programming language are rule-bound devices. They deliver exact responses to the inputs for which they are designed.

(2) The Serial Monitor is an ASCII device. This means that the device (the Serial Monitor) will show a character if we submit to it the ASCII code of that character. For example: if we wish to see the character A on the Serial Monitor, we must submit the corresponding ASCII code, which is 0x41 (65). The following Table contains the ASCII codes for the printable characters of the English alphabet.

(3) You wished to print this ᛞ symbol on the Serial Monitor. It is a graphic symbol; it is not found in the Table of Step-2, and yet you managed to show/print it on the Serial Monitor.

(4) char x = 'ᛞ'; and char y ="ᛞ"; look equivalent to me on paper. Are they the same to the compiler? During processing, the compiler will try to assign the ASCII code of ᛞ to the variable x and to the variable y[0]; but there is no ᛞ symbol in the Table of Step-2. Where does the compiler get the values -98 decimal and -31 decimal (as I have seen on my computer) that it assigns to x and y[0] respectively?

(5) @PaulS said pointless, which I understand as meaningless; but you took it as pointer less, and then you discovered the code: char *x ="ᛞ";. It worked for you, and then you wanted to know about pointers/pointer variables. The equivalent code for the previous one is char x = "ᛞ";. However, the former style of declaration has some advantages over the latter style, particularly for storing strings of variable length.

GolamMostafa:
char *x ="ᛞ";. It worked for you, and then you wanted to know about pointers/pointer variables.
The equivalent code for the previous one is char x = "ᛞ";.

No, it is not equivalent, it only prints the same.

char *x = "ᛞ";
char y[] = "ᛞ";

void setup() {
  Serial.begin(250000);
  Serial.print(F("sizeof(x) = "));
  Serial.println(sizeof(x));
  Serial.print(F("sizeof(y) = "));
  Serial.println(sizeof(y));
}
void loop() {}

Output:

sizeof(x) = 2
sizeof(y) = 4

Whandall:
No, it is not equivalent, it only prints the same.

sizeof(x) = 2
sizeof(y) = 4

Interesting and instructive for me! I only learned this because I found the courage to post my understanding***, which the good teacher then showed to be incorrect; otherwise I would have remained ignorant of it. K+.

***I have been influenced to utter the equivalency by the following excerpt taken from: Programming with C by: Byron Gottfried, Third Edition, Reprint-2012, Page-11.18.

"Suppose x is a one-dimensional, 10-element array of integers. It is possible to define x as a pointer variable rather than an array. Thus we can write

int *x;

rather than

int x[10];."

GolamMostafa:
***I have been influenced to utter the equivalency by the following excerpt taken from: Programming with C by: Byron Gottfried, Third Edition, Reprint-2012, Page-11.18.

"Suppose x is a one-dimensional, 10-element array of integers. It is possible to define x as a pointer variable rather than an array. Thus we can write

int *x;

rather than

int x[10];."

Which probably refers to parameters, which are passed as pointers anyway.
You have to give more context.

Added:

Although the name of an array evaluates to a pointer to its first element,
a pointer and an array are still two different things.
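To make the difference concrete, here is a minimal sketch in plain C++ (it should compile both on a desktop and inside an Arduino sketch). The names asPointer, asArray and checkSizes are made up for illustration, and the rune is written as hex escapes so the result does not depend on the editor's encoding. The pointer's size depends on the platform (2 bytes on an 8-bit AVR, usually 4 or 8 on a PC), while the array's size is always the data plus the terminating zero.

```cpp
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Same three data bytes (UTF-8 for the rune) stored two different ways.
const char *asPointer = "\xE1\x9B\x9E";  // a pointer variable holding the literal's address
const char asArray[] = "\xE1\x9B\x9E";   // the bytes themselves, plus the terminating zero

void checkSizes() {
  // sizeof on the array counts the stored bytes: 3 data bytes + 1 terminator.
  assert(sizeof(asArray) == 4);
  // sizeof on the pointer is just the size of an address on this platform.
  assert(sizeof(asPointer) == sizeof(const char *));
  // strlen follows the data in both cases and stops at the zero byte.
  assert(strlen(asPointer) == 3);
  assert(strlen(asArray) == 3);
}
```

So the "2" in the earlier output is the size of an address on the AVR, not the size of the text.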

@GolamMostafa

To give you some insight into the inner workings:

const char *x = "ᛞ";
const char y[] = "ᛞ";

void setup() {
  Serial.begin(250000);
  Serial.print(F("sizeof(x) = "));
  Serial.print(sizeof(x));
  Serial.print(F(", strlen(x) = "));
  Serial.print(strlen(x));
  Serial.print(F(", dump of x = "));
  dump(&x, sizeof(x));
  Serial.print(F("         dump of memory pointed to by x = "));
  dump(x, strlen(x) + 1);
  Serial.print(F("sizeof(y) = "));
  Serial.print(sizeof(y));
  Serial.print(F(", strlen(y) = "));
  Serial.print(strlen(y));
  Serial.print(F(", dump of y = "));
  dump(&y, sizeof(y));
}
void loop() {}

// Hex dump of len bytes starting at adr, 16 bytes per row.
void dump(const void* adr, int len) {
  const byte* ptr = (const byte*) adr;
  byte idx;
  for (; len > 0; len -= 16, ptr += 16) {
    // Address of the current row (ptr, not adr, so later rows get their own address)
    phByte(((uint16_t)ptr) >> 8);
    phByte(((uint16_t)ptr) & 0xFF);
    Serial.print(F(": "));
    // Hex column
    for (idx = 0; idx < 16 && idx < len; idx++) {
      phByte(ptr[idx]);
      Serial.write(' ');
    }
    // Raw column: control characters are shown as '.'
    Serial.write('\'');
    for (idx = 0; (idx < 16) && (idx < len); idx++) {
      byte curr = ptr[idx];
      Serial.write(curr < 0x20 ? '.' : curr);
    }
    Serial.write('\'');
    Serial.println();
  }
}

void phByte(byte byt) {
  if (byt < 16) {
    Serial.write('0');
  }
  Serial.print(byt, HEX);
}

Output:

sizeof(x) = 2, strlen(x) = 3, dump of x = 0204: 1B 02 '..'
         dump of memory pointed to by x = 021B: E1 9B 9E 00 'ᛞ.'
sizeof(y) = 4, strlen(y) = 3, dump of y = 0200: E1 9B 9E 00 'ᛞ.'

It is kind of strange that the three bytes collapse into a single character even in the dump (presumably the Serial Monitor decodes them as UTF-8).

whoa! lots of answers ! thanks guys !

First I have to say that I had never learned coding until a year ago, and I skipped some of what seemed to me hardcore complicated stuff, especially pointers. I know I need to learn it at some point, but I'll do it when I feel a bit more confident with the more basic stuff. If anyone has a newbie-friendly site/book to recommend, that would be great!

And English isn't my language, so some of the complex explanations (more or less always in nerdy English) are a bit tough for me.

So I'm using the famous method of "trial and error", and though it takes time, it works quite well...

@ GolamMostafa

  1) (probably a newbie question) What does @OP mean? I've seen it sometimes but it doesn't mean anything to me (like lmfa, idk, afk, imho, fyi... foreign acronyms are SO annoying ^^!)

  2) The symbols I used are not ASCII but Unicode characters. I copied and pasted them directly from Wikipedia and it works.
    The Unicode for it is (as an example): ᛞ U+16DE (&#5854;)
    But as far as I know, Arduino doesn't simply use Unicode. There must be something somewhere in the compiler that does it for you.

  4) char y ="ᛞ" works fine, as well as char* y ="ᛞ". Thanks for the tip. But since I'm already using a 2D array to store my values, I think I'll keep using char* to avoid a 3D array. Clearer, simpler, lighter, unless there is more to know about it.

  5) "you have taken it as pointer less": I didn't understand what you meant by "pointer less". Is it a type of pointer that I'll learn about when I learn pointers?

@ Whandall

So the char* is lighter than the char, right? (2bytes vs 4?) I should use it then....

Then you lost me...
I don't understand your code (probably due to my lack of pointer knowledge again)

I didn't know you could put nothing in the initialization of a for statement
for (; len > 0; len -= 16, ptr += 16) {

so E1 9B 9E 00 is what is stored when I use ᛞ ? Is that it ?

Thanks again for your answers and your patience towards a newbie !

djapipol:
So the char* is lighter than the char, right? (2bytes vs 4?) I should use it then....

It's a different thing, one is a pointer to an array of chars, the other one is the array of chars itself.

djapipol:
I didn't know you could put nothing in the initialization of a for statement
for (; len > 0; len -= 16, ptr += 16) {

Yes, any of the clauses can be empty.

for (;;) { // is the same as
}
while (true) {
}

djapipol:
so E1 9B 9E 00 is what is stored when I use ᛞ ? Is that it ?

Exactly. A three byte (UTF-8?) representation of that character/symbol.
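For anyone who wants to check where E1 9B 9E comes from: it is what the standard UTF-8 rules give for code point U+16DE. Below is a minimal sketch of those rules in plain C++. The function name encodeUtf8 and the check function are made up for this post, and the sketch does no validation (surrogates and values above U+10FFFF are not rejected).

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: writes the UTF-8 bytes for code point cp into out[]
// (at least 5 bytes), adds a terminating zero, and returns the byte count.
size_t encodeUtf8(uint32_t cp, char out[5]) {
  size_t n;
  if (cp < 0x80) {                       // 0xxxxxxx
    out[0] = (char)cp;
    n = 1;
  } else if (cp < 0x800) {               // 110xxxxx 10xxxxxx
    out[0] = (char)(0xC0 | (cp >> 6));
    out[1] = (char)(0x80 | (cp & 0x3F));
    n = 2;
  } else if (cp < 0x10000) {             // 1110xxxx 10xxxxxx 10xxxxxx
    out[0] = (char)(0xE0 | (cp >> 12));
    out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (char)(0x80 | (cp & 0x3F));
    n = 3;
  } else {                               // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    n = 4;
  }
  out[n] = '\0';
  return n;
}

void checkRune() {
  char buf[5];
  assert(encodeUtf8(0x16DE, buf) == 3);      // U+16DE needs three bytes
  assert((unsigned char)buf[0] == 0xE1);
  assert((unsigned char)buf[1] == 0x9B);
  assert((unsigned char)buf[2] == 0x9E);
  assert(buf[3] == '\0');                    // the 00 at the end of the dump
}
```

A plain ASCII letter such as A (U+0041) falls into the first branch and stays a single byte, which is why ordinary text looks the same in ASCII and UTF-8.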

djapipol:
So the char* is lighter than the char, right? (2bytes vs 4?) I should use it then…

The pointer is two bytes long, but the data it points to is still four bytes long. The data is copied from FLASH to RAM, so the pointer version takes six bytes of RAM (two for the pointer plus four for the data) and four bytes of FLASH. The array takes four bytes of each.
The lowest memory impact is Serial.println(F("ᛞ")); in which case your character uses four bytes of FLASH and no RAM.

johnwasser:
The pointer is two bytes long, but the data it points to is still four bytes long.

(1) Through this declaration: int *p;, we understand that p is a pointer variable which points to an array of data, each member of which is 16 bits wide.

(2) Through this declaration: long *p;, we understand that p is a pointer variable which points to an array of data, each member of which is 32 bits (4 bytes) wide.

(3) Through this declaration: char *p;, we understand that p is a pointer variable which points to an array of data, each member of which is 8 bits wide (the value is confined to the permissible ASCII codes).

(4) Through this definition: char *p = "abcd";, we understand that p is a pointer variable which points to a 4-byte-long array of data, each member of which is 8 bits wide (the value is confined to the permissible ASCII codes).

(5) Now, through this declaration/definition: char *p = "ᛞ";, we should understand that p is a pointer variable which points to a 4-byte-long array of data whose single member is 4 bytes wide. Why? Is it because the symbol ᛞ is a member of the UCS set? The character a is also a member of the UCS set, so can we say that p is pointing to data 4 bytes wide (0x00000041)? If yes, does that not contradict the statement in Step-4?

Through this declaration: int *p;, we understand that p is a pointer variable which points to an array of data, each member of which is 16 bits wide.

No.

We understand that p is a pointer variable that points to a variable of type "int".

The variable may or may not be an array.
The pointer may or may not be valid.

GolamMostafa:
(3) Through this declaration: char *p;, we understand that p is a pointer variable which points to an array of data, each member of which is 8 bits wide (the value is confined to the permissible ASCII codes).

The value is not confined to 7-bit ASCII. The value can be any signed 8-bit value (-128 to +127).
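This also seems to account for the numbers seen earlier in the thread: 158 from the unsigned char, and -98 and -31 in GolamMostafa's test, are all consistent with single bytes of the sequence E1 9B 9E being read as unsigned or signed 8-bit values. A minimal sketch in plain C++ follows. The helper names are made up for illustration, and the narrowing cast assumes the usual two's-complement wrap-around (guaranteed since C++20, implementation-defined but common before that).

```cpp
#include <assert.h>
#include <stdint.h>

// The same 8 bits read as a signed value (range -128..127).
// Assumption: the cast wraps around as on two's-complement machines.
int asSigned(uint8_t b) { return (int8_t)b; }

// The same 8 bits read as an unsigned value (range 0..255).
int asUnsigned(uint8_t b) { return b; }

void checkBytes() {
  assert(asUnsigned(0x9E) == 158);  // last byte of the rune, unsigned
  assert(asSigned(0x9E) == -98);    // last byte of the rune, signed
  assert(asSigned(0xE1) == -31);    // first byte of the rune, signed
  assert(asSigned(0x41) == 65);     // plain ASCII 'A' is the same either way
}
```

Values up to 0x7F (the ASCII range) read the same in both views; only bytes with the top bit set, which is every byte of a multi-byte UTF-8 sequence, come out negative in a signed char.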

GolamMostafa:
(4) Through this definition: char *p = "abcd";, we understand that p is a pointer variable which points to a 4-byte-long array of data, each member of which is 8 bits wide (the value is confined to the permissible ASCII codes).

The pointer is initialized to point to the first character of a five-byte array. The first four bytes are initialized with the four ASCII characters and the value of the fifth byte is zero (the null terminator).

GolamMostafa:
(5) Now, through this declaration/definition: char *p = "ᛞ";, we should understand that p is a pointer variable which points to a 4-byte-long array of data whose single member is 4 bytes wide. Why? Is it because the symbol ᛞ is a member of the UCS set? The character a is also a member of the UCS set, so can we say that p is pointing to data 4 bytes wide (0x00000041)? If yes, does that not contradict the statement in Step-4?

The pointer is initialized to point to the first character of a four-byte array. The first three bytes are initialized to the three-byte UTF-8 encoding of the Unicode character and the value of the fourth byte is zero (the null terminator).
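Put differently, a char is always one byte, and what varies is how many bytes one visible character needs. As a rough illustration in plain C++, the sketch below counts bytes with sizeof/strlen and counts characters by skipping UTF-8 continuation bytes (those of the form 10xxxxxx). countCodePoints is a hypothetical helper written for this post and assumes the input is valid UTF-8. The rune is written as hex escapes so the example does not depend on the source file's encoding.

```cpp
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: counts characters (code points) in a UTF-8 string
// by counting every byte that is NOT a continuation byte (10xxxxxx).
size_t countCodePoints(const char *s) {
  size_t n = 0;
  for (; *s != '\0'; ++s) {
    if (((unsigned char)*s & 0xC0) != 0x80) {
      ++n;
    }
  }
  return n;
}

const char abcd[] = "abcd";          // 4 data bytes + terminator
const char rune[] = "\xE1\x9B\x9E";  // 3 data bytes + terminator

void checkCounts() {
  assert(sizeof(abcd) == 5);           // four letters plus the zero byte
  assert(strlen(abcd) == 4);
  assert(countCodePoints(abcd) == 4);  // ASCII: one byte per character
  assert(sizeof(rune) == 4);           // three bytes plus the zero byte
  assert(strlen(rune) == 3);           // strlen counts bytes, not characters
  assert(countCodePoints(rune) == 1);  // but it is a single character
}
```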

@johnwasser

Jolly! K+.