Author Topic: The char variable  (Read 1450 times)
Northants - UK
Offline Offline
Sr. Member
****
Karma: 1
Posts: 251

I have just been reading about the char variable type. Reading between the lines, it was a roundabout way of saying that it is a one-byte variable that holds the ASCII code of whatever character is put into it. So how can this variable hold both negative and positive numbers? This does not make sense: the ASCII code goes from 0 to 255, so why do you even need negative numbers in this variable type?
Logged


Sydney, Australia
Offline Offline
Edison Member
*
Karma: 33
Posts: 1249
Big things come in large packages

The char is equivalent to an unsigned byte or uint8_t. They take up the same amount of space (8 bits). Whether something is negative or positive depends on the interpretation of the bits, so 0xff can be 255 or -1. Look up 'two's complement' to find out how this works.
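A minimal sketch of that "interpretation of the bits" point, written as plain C++ so it can be tried off the board as well. The helper names `asUnsigned` and `asSigned` are made up for this illustration; nothing here is an Arduino API.

```cpp
#include <stdint.h>
#include <string.h>

// Hypothetical helpers, for illustration only.

// Read one raw byte as an unsigned quantity (0..255).
int asUnsigned(uint8_t raw)
{
  return raw;
}

// Read the very same bit pattern as a signed quantity (-128..127).
// memcpy copies the bits unchanged; only the type used to read them differs.
int asSigned(uint8_t raw)
{
  int8_t value;
  memcpy(&value, &raw, sizeof value);
  return value;
}
```

With these, `asUnsigned(0xFF)` gives 255 while `asSigned(0xFF)` gives -1. Bit patterns up to 0x7F read the same either way.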
Logged

Arduino libraries http://arduinocode.codeplex.com
Parola hardware & library http://parola.codeplex.com

California
Offline Offline
Faraday Member
**
Karma: 88
Posts: 3350

Quote
I have just been reading about the char variable type. Reading between the lines, it was a roundabout way of saying that it is a one-byte variable that holds the ASCII code of whatever character is put into it. So how can this variable hold both negative and positive numbers? This does not make sense: the ASCII code goes from 0 to 255, so why do you even need negative numbers in this variable type?
Just depends on the context. Unsigned and signed variables are just interpreted differently, but they are stored the same in memory.
Logged

Global Moderator
Offline Offline
Brattain Member
*****
Karma: 473
Posts: 18695
Lua rocks!

Quote
The char is equivalent to an unsigned byte or uint8_t.

On the Arduino (avr-gcc) it's actually a signed byte by default. It is "byte" that is unsigned.

Quote
This does not make sense: the ASCII code goes from 0 to 255, so why do you even need negative numbers in this variable type?

It's a bit of a historical throwback, I think. It doesn't make sense to have "negative" letters, so really the byte (unsigned char) would have been a better choice. But as the others said, when stuffing ASCII codes into a byte (or char) you don't care about the sign because you won't be doing arithmetic on it.
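To illustrate why the sign rarely matters for text: ASCII proper only covers 0 to 127, and those values read the same whether the byte is treated as signed or unsigned. A small sketch in plain C++; `codeOf` and `inAsciiRange` are hypothetical names, not Arduino functions.

```cpp
// Hypothetical helpers, for illustration only.

// Numeric code of a character, read as unsigned (0..255) regardless of
// whether plain char happens to be signed on this compiler.
int codeOf(char c)
{
  return static_cast<unsigned char>(c);
}

// True for the 7-bit ASCII range, where signed and unsigned readings agree.
bool inAsciiRange(char c)
{
  return codeOf(c) < 128;
}
```

Only bytes from 0x80 upwards come out differently as signed and unsigned, so going through `unsigned char` first, as `codeOf` does, is a reasonable habit when a char is used as a number or an array index.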
Logged

Sydney, Australia
Offline Offline
Edison Member
*
Karma: 33
Posts: 1249
Big things come in large packages

Oops!

Actually, a related question from me: how would Arduino support double-byte or Unicode characters now that the interface supports multiple languages, or is it just not an issue in this environment?
« Last Edit: May 22, 2012, 08:07:39 pm by marco_c » Logged


Global Moderator
Offline Offline
Brattain Member
*****
Karma: 473
Posts: 18695
Lua rocks!

I suppose you could put Unicode into an int or long, not sure why you would want to. You would need a suitable display device for there to be much point. Probably I would use UTF-8 if I had to support Unicode - bearing in mind how short we are of RAM.
Logged

Sydney, Australia
Offline Offline
Edison Member
*
Karma: 33
Posts: 1249
Big things come in large packages

My thinking was around serial characters to the console display in the IDE.

LCD displays would have their own special characters for non-Roman scripts anyway. I haven't used a screen 'display' as such, so I don't know what drives those.
Logged


Grand Blanc, MI, USA
Offline Offline
Faraday Member
**
Karma: 92
Posts: 3940
CODE is a mass noun and should not be used in the plural or with an indefinite article.

One reason a negative character is desirable is to indicate some condition, error, etc. For example, see the Serial.read() function. When called, it reads the next character available. But what if there are no characters to read? Then it returns -1.

Actually char variables can be declared either as signed or unsigned. In reality they are just a short int. Whether signed or unsigned is the default, and how long a "short" is, is installation dependent.

Code:
//declare some chars
unsigned char a;
signed char b;
Logged

MCP79411/12 RTC ... "One Million Ohms" ATtiny kit ... available at http://www.tindie.com/stores/JChristensen/

0
Offline Offline
Shannon Member
****
Karma: 199
Posts: 11639
Arduino rocks

Quote
I suppose you could put Unicode into an int or long, not sure why you would want to. You would need a suitable display device for there to be much point. Probably I would use UTF-8 if I had to support Unicode - bearing in mind how short we are of RAM.

If you were using Kanji that last statement wouldn't make sense - UTF-8 is less efficient than UTF-16 or other 16 bit encodings.
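The trade-off can be made concrete by counting bytes per code point. A rough sketch; `utf8Length` and `utf16Length` are hypothetical helpers and do not validate their input (surrogate values and code points above U+10FFFF are not rejected).

```cpp
#include <stdint.h>

// Bytes needed to encode one Unicode code point in UTF-8.
int utf8Length(uint32_t codePoint)
{
  if (codePoint < 0x80)    return 1;  // ASCII
  if (codePoint < 0x800)   return 2;  // e.g. accented Latin, Greek, Cyrillic
  if (codePoint < 0x10000) return 3;  // rest of the Basic Multilingual Plane, incl. most Kanji
  return 4;                           // supplementary planes
}

// Bytes needed to encode one Unicode code point in UTF-16.
int utf16Length(uint32_t codePoint)
{
  return (codePoint < 0x10000) ? 2 : 4;  // one 16-bit unit, or a surrogate pair
}
```

For 'A' (U+0041) that is 1 byte in UTF-8 against 2 in UTF-16. For a Kanji such as U+6F22 it is 3 against 2. So mostly-ASCII text is smaller in UTF-8 and mostly-Kanji text is smaller in UTF-16.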
Logged

[ I won't respond to messages, use the forum please ]

Global Moderator
Offline Offline
Brattain Member
*****
Karma: 473
Posts: 18695
Lua rocks!

Ah well, horses for courses. I am not in fact using Kanji.
Logged

Global Moderator
UK
Offline Offline
Brattain Member
*****
Karma: 285
Posts: 25630
I don't think you connected the grounds, Dave.

Quote
Actually char variables can be declared either as signed or unsigned. In reality they are just a short int
First part true - in fact, you can tell the compiler whether you want an unqualified "char" to be signed or unsigned.
Second part, false - a "short int" is not the same as "char" on the Arduino.
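For anyone wanting to check what an unqualified "char" currently is, a one-line test is enough. This is only a sketch and `plainCharIsSigned` is a made-up name. With GCC the default can be switched with the `-fsigned-char` and `-funsigned-char` options; how to pass such options through the Arduino IDE is not covered here.

```cpp
#include <limits.h>

// True when an unqualified "char" is signed with this compiler and these flags.
// CHAR_MIN is 0 for an unsigned plain char and negative for a signed one.
bool plainCharIsSigned()
{
  return CHAR_MIN < 0;
}
```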

Quote
One reason a negative character is desirable is to indicate some condition, error, etc. For example, see the Serial.read() function.
The return type of "Serial.read" is "int", which is how it deals with returning -1.
Any characters received with the sign bit set (0x80 to 0xFF) are not sign-extended, so are returned as "int"s in the range 0x0080 to 0x00FF.
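A sketch of why the wider return type matters, written as plain C++ rather than against the real Serial object. `FakeSerial` and `looksEmptyWhenNarrowed` are hypothetical; the struct only mimics the behaviour described above (-1 when nothing is waiting, otherwise the byte as a non-negative int).

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical stand-in for a serial receive buffer; not the Arduino class.
struct FakeSerial
{
  const uint8_t *data;  // bytes "received" so far
  size_t length;        // how many bytes are in data
  size_t next;          // index of the next unread byte

  // Returns -1 if nothing is left, otherwise the next byte as 0..255.
  // uint8_t promotes to int without sign extension, so 0xFF becomes 255,
  // which stays distinct from the -1 "nothing available" marker.
  int read()
  {
    if (next >= length) return -1;
    return data[next++];
  }
};

// Narrowing the result to a signed 8-bit variable loses that distinction:
// the byte 0xFF and the marker -1 both end up as -1.
bool looksEmptyWhenNarrowed(int readResult)
{
  int8_t narrowed = static_cast<int8_t>(readResult);
  return narrowed == -1;
}
```

So the safe pattern is to keep the result in an int until it has been compared with -1, and only then convert it to a char.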
« Last Edit: May 23, 2012, 03:31:04 am by AWOL » Logged

"Pete, it's a fool looks for logic in the chambers of the human heart." Ulysses Everett McGill.
Do not send technical questions via personal messaging - they will be ignored.

Germany
Offline Offline
Faraday Member
**
Karma: 56
Posts: 2973

Quote
tell the compiler whether you want an unqualified "char" to be signed or unsigned
My only attempt ( defining a "signed byte variable" ) failed,
so I imagined there is no "signed" qualifier in Arduino ( or avr-gcc ), and that char, int and long are signed by default, whether or not that makes sense for a character in a char variable.

I understand Serial Monitor is a Java application ( where chars are 16-bit Unicode by default ), so one would have to check carefully ( in both directions, and possibly allowing for OS dependencies ) how it behaves with non-ASCII characters.
There's no such thing as code pages on Arduino ( defining the meaning of a character ).
It is even a rather common convenience to rely on the assumption that a char in the range 1...127 represents an ASCII character.
Logged

Global Moderator
UK
Offline Offline
Brattain Member
*****
Karma: 285
Posts: 25630
I don't think you connected the grounds, Dave.

Quote
My only try ( defining a "signed byte variable" ) failed,
Not surprising, really, because
Code:
typedef uint8_t byte;

However, nothing at all wrong with
Code:
signed char variable;
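
A sketch of the reasoning: "signed" and "unsigned" can only be combined with the built-in integer keywords, not with a typedef name, which is why "signed byte" is rejected. The typedef below has the same definition as the one quoted above but uses a made-up name (`Byte`) so it cannot clash with the real one; the two functions are likewise hypothetical.

```cpp
#include <stdint.h>

// Same definition as the Arduino typedef, under a hypothetical name.
typedef uint8_t Byte;

// signed Byte x;   // does not compile: "signed" cannot modify a typedef name

// Store -1 in each type and report what value comes back out.
int minusOneAsByte()
{
  Byte b = static_cast<Byte>(-1);  // wraps to 255, because Byte is unsigned
  return b;
}

int minusOneAsSignedChar()
{
  signed char c = -1;              // stays -1
  return c;
}
```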
Logged

"Pete, it's a fool looks for logic in the chambers of the human heart." Ulysses Everett McGill.
Do not send technical questions via personal messaging - they will be ignored.

Global Moderator
Offline Offline
Brattain Member
*****
Karma: 473
Posts: 18695
Lua rocks!

Quote
so I imagined there is no "signed" qualifier in Arduino ( or avr-gcc )

Try again:

Code:
void setup ()
{
  // "signed" is accepted on int, long and char alike
  signed int foo;
  signed long bar;
  signed char fubar;
}

void loop () {}

Gives:

Code:
Binary sketch size: 466 bytes (of a 32256 byte maximum)
Logged

Grand Blanc, MI, USA
Offline Offline
Faraday Member
**
Karma: 92
Posts: 3940
CODE is a mass noun and should not be used in the plural or with an indefinite article.

Quote
Second part, false - a "short int" is not the same as "char" on the Arduino.

Thanks for that, not sure why I thought so. I was actually only referring to the number of bits, but was still wrong: on the Arduino, short ints and ints are both 16 bits. This is of course consistent with the standard, which I believe says that an int only needs to have at least as many bits as a short int.
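
The sizes are easy to check on whatever compiler is at hand. A sketch in plain C++ (it needs C++11 for static_assert); `bitsIn` is a hypothetical helper. The standard only fixes the ordering and the minimums (char at least 8 bits, short and int at least 16), so the actual numbers depend on the platform.

```cpp
#include <limits.h>

// Width of a type in bits on the current compiler (hypothetical helper).
template <typename T>
int bitsIn()
{
  return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// Guarantees that hold everywhere, checked at compile time.
static_assert(sizeof(char) == 1, "char is one byte by definition");
static_assert(sizeof(short) <= sizeof(int), "int is at least as wide as short");
static_assert(sizeof(int) <= sizeof(long), "long is at least as wide as int");
```

On an AVR-based Arduino `bitsIn<short>()` and `bitsIn<int>()` both come out as 16, while a typical desktop compiler reports 32 for int.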

Quote
The return type of "Serial.read" is "int", which is how it deals with returning -1.
Any characters received with the sign bit set (0x80 to 0xFF) are not sign-extended, so are returned as "int"s in the range 0x0080 to 0x00FF.

That is as I understood. I wasn't thinking specifically Arduino though, more back in the day when ASCII only had 128 characters and Pluto was still a planet ;-)
Logged

