Northants - UK
Offline
Full Member
Karma: 0
Posts: 109
« on: May 22, 2012, 04:28:30 pm »
I have just been reading about the character variable. Reading between the lines, the description was a roundabout way of saying that it is a one-byte variable that contains the ASCII code of whatever character is put into it. So how can this variable have negative and positive numbers? This does not make sense: the ASCII code goes from 0 to 255, so why do you even need negative numbers in this variable type?
Sydney
Offline
God Member
Karma: 14
Posts: 717
Big things come in large packages
« Reply #1 on: May 22, 2012, 04:36:36 pm »
The char is equivalent to an unsigned byte or uint8_t. They take up the same amount of space (8 bits). Whether something is negative or positive depends on the interpretation of the bits, so 0xff can be 255 or -1. Look up 'two's complement' to find out how this works.
California
Offline
Edison Member
Karma: 38
Posts: 1849
« Reply #2 on: May 22, 2012, 04:37:20 pm »
Quote: I have just been reading about the character variable. Reading between the lines, the description was a roundabout way of saying that it is a one-byte variable that contains the ASCII code of whatever character is put into it. So how can this variable have negative and positive numbers? This does not make sense: the ASCII code goes from 0 to 255, so why do you even need negative numbers in this variable type?
It just depends on the context. Signed and unsigned variables are stored the same way in memory; they are just interpreted differently.
Global Moderator
Melbourne, Australia
Offline
Shannon Member
Karma: 218
Posts: 13896
Lua rocks!
« Reply #3 on: May 22, 2012, 04:50:33 pm »
Quote: The char is equivalent to an unsigned byte or uint8_t.
It's actually a signed byte. It is "byte" that is unsigned.
Quote: This does not make sense: the ASCII code goes from 0 to 255, so why do you even need negative numbers in this variable type?
It's a bit of a historical throwback, I think. It doesn't make sense to have "negative" letters, so really the byte (unsigned char) would have been a better choice. But as the others said, when stuffing ASCII codes into a byte (or char) you don't care about the sign because you won't be doing arithmetic on it.
Sydney
Offline
God Member
Karma: 14
Posts: 717
Big things come in large packages
« Reply #4 on: May 22, 2012, 08:05:34 pm »
Oops!
Actually, a related question from me is how Arduino would support double-byte or Unicode characters now that the interface supports multiple languages. Or is it just not an issue in this environment?
« Last Edit: May 22, 2012, 08:07:39 pm by marco_c »
Global Moderator
Melbourne, Australia
Offline
Shannon Member
Karma: 218
Posts: 13896
Lua rocks!
« Reply #5 on: May 22, 2012, 08:12:10 pm »
I suppose you could put Unicode into an int or long, not sure why you would want to. You would need a suitable display device for there to be much point. Probably I would use UTF-8 if I had to support Unicode - bearing in mind how short we are of RAM.
Sydney
Offline
God Member
Karma: 14
Posts: 717
Big things come in large packages
« Reply #6 on: May 22, 2012, 08:15:15 pm »
My thinking was around serial characters to the console display in the IDE.
LCD displays would have special characters anyway for non-roman characters. Haven't used a screen 'display' as such, so don't know what drives those.
Grand Blanc, MI, USA
Offline
Edison Member
Karma: 43
Posts: 2491
"We're a proud service of the Lost Electricity Reclamation Agency"
« Reply #7 on: May 22, 2012, 08:21:23 pm »
One reason a negative character is desirable is to indicate some condition, error, etc. For example, see the Serial.read() function. When called, it reads the next character available. But what if there are no characters to read? Then it returns -1.
Actually char variables can be declared either as signed or unsigned. In reality they are just a short int. Whether signed or unsigned is the default, and how long a "short" is, is installation dependent.
// declare some chars
unsigned char a;
signed char b;
0
Offline
Tesla Member
Karma: 71
Posts: 6603
Arduino rocks
« Reply #8 on: May 22, 2012, 08:26:11 pm »
Quote: I suppose you could put Unicode into an int or long, not sure why you would want to. You would need a suitable display device for there to be much point. Probably I would use UTF-8 if I had to support Unicode - bearing in mind how short we are of RAM.
If you were using Kanji, that last statement wouldn't make sense: UTF-8 is less efficient than UTF-16 or other 16-bit encodings, because it needs three bytes for each Kanji character where UTF-16 needs two.
Global Moderator
Melbourne, Australia
Offline
Shannon Member
Karma: 218
Posts: 13896
Lua rocks!
« Reply #9 on: May 23, 2012, 01:19:33 am »
Ah well, horses for courses. I am not in fact using Kanji.
Global Moderator
UK
Online
Brattain Member
Karma: 137
Posts: 19016
I don't think you connected the grounds, Dave.
« Reply #10 on: May 23, 2012, 02:02:16 am »
Quote: Actually char variables can be declared either as signed or unsigned. In reality they are just a short int
First part true - in fact, you can tell the compiler whether you want an unqualified "char" to be signed or unsigned. Second part, false - a "short int" is not the same as "char" on the Arduino.
Quote: One reason a negative character is desirable is to indicate some condition, error, etc. For example, see the Serial.read() function.
The return type of "Serial.read" is "int", which is how it deals with returning -1. Any characters received with the sign bit set (0x80 to 0xFF) are not sign-extended, so are returned as "int"s in the range 0x0080 to 0x00FF.
« Last Edit: May 23, 2012, 03:31:04 am by AWOL »
Pete, it's a fool looks for logic in the chambers of the human heart.
Germany
Offline
Edison Member
Karma: 27
Posts: 1487
« Reply #11 on: May 23, 2012, 05:06:46 am »
Quote: tell the compiler whether you want an unqualified "char" to be signed or unsigned
My only attempt (defining a "signed byte" variable) failed, so I imagined there was no "signed" qualifier in Arduino (or avr-gcc), and that char, int and long are signed by default, whether or not that makes sense for a character in a char variable.
I understand the Serial Monitor is a Java application (where chars are 16-bit Unicode by default), so one would have to check carefully (in both directions, and possibly taking OS dependencies into account) how it behaves with non-ASCII characters. There's no such thing as code pages on Arduino (defining the meaning of a character). It is even a rather common convenience to rely on the assumption that a char in the range 1...127 represents an ASCII character.
Global Moderator
UK
Online
Brattain Member
Karma: 137
Posts: 19016
I don't think you connected the grounds, Dave.
« Reply #12 on: May 23, 2012, 05:40:27 am »
Quote: My only try ( defining a "signed byte variable" ) failed,
Not surprising, really, because
typedef uint8_t byte;
However, nothing at all wrong with
signed char variable;
Global Moderator
Melbourne, Australia
Offline
Shannon Member
Karma: 218
Posts: 13896
Lua rocks!
« Reply #13 on: May 23, 2012, 06:19:40 am »
Quote: so I imagined there is no "signed" qualifier in Arduino ( or avr-gcc )
Try again:
void setup ()
{
  signed int foo;
  signed long bar;
  signed char fubar;
}
void loop () {}
Gives:
Binary sketch size: 466 bytes (of a 32256 byte maximum)
Grand Blanc, MI, USA
Offline
Edison Member
Karma: 43
Posts: 2491
"We're a proud service of the Lost Electricity Reclamation Agency"
« Reply #14 on: May 23, 2012, 08:18:13 am »
Quote: Second part, false - a "short int" is not the same as "char" on the Arduino.
Thanks for that, not sure why I thought that. I was actually only referring to the number of bits, but was still wrong: short ints and ints are both 16 bits. This is of course consistent with the standard, which I believe says that an int only needs to have at least as many bits as a short int.
Quote: The return type of "Serial.read" is "int", which is how it deals with returning -1. Any characters received with the sign bit set (0x80 to 0xFF) are not sign-extended, so are returned as "int"s in the range 0x0080 to 0x00FF.
That is as I understood. I wasn't thinking specifically of Arduino though, more back in the day when ASCII only had 128 characters and Pluto was still a planet ;-)