Topic: "Signed byte" data type?

tim7

Which data type do you recommend to store signed 8-bit values?  Right now I'm using chars, but I have to remember to do a cast whenever passing this variable type to Serial.print, and I think C does not guarantee that the char type is signed.

AWOL

Quote
and I think C does not guarantee that the char type is signed.

For a given compilation it is fixed whether the type is signed or not.

ahelion

Hi,
   In C, you have signed char == char; only if you want it to be unsigned do you have to prefix it with unsigned.

AWOL

Quote
In C, you have signed char == char; only if you want it to be unsigned do you have to prefix it with unsigned.

That is not correct - you can instruct the compiler to treat all "char" as unsigned if you wish.
Arduino does not do this.
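
Whether plain char is signed in a given build can be read from limits.h. A minimal sketch in plain C, assuming a standard limits.h is available; plain_char_is_signed is a hypothetical helper name, not something from the Arduino core:

```c
#include <limits.h>
#include <stdbool.h>

/* Reports whether plain char is signed in this particular build.
   CHAR_MIN is 0 when char is unsigned and negative when it is signed. */
bool plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Building the same file with and without -funsigned-char should flip the result, since that option changes what plain char means.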

tim7

K&R say, "Whether plain chars are signed or unsigned is machine-dependent."  Would an int8_t be what I need?  I'm not terribly familiar with fixed bit-length data types, and find myself needing to save ram.

Looking through the Arduino and AVR libraries I couldn't find any type-definitions except for byte, boolean and word.  Does anyone know where the standard data types are set?  

AWOL

Quote
Would an int8_t be what I need?

That would make it clear to everyone.

Coding Badly

Quote
Would an int8_t be what I need?


That's what I use.
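
For illustration, a small sketch of storing a signed 8-bit value in an int8_t and printing it as a number. This is plain hosted C rather than an Arduino sketch, and format_int8 is a hypothetical helper name; PRId8 comes from inttypes.h:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Writes the decimal form of a signed 8-bit value into buf.
   Returns the number of characters written, as snprintf does. */
int format_int8(char *buf, size_t size, int8_t value)
{
    return snprintf(buf, size, "%" PRId8, value);
}
```

The point is only that the type states the intent (a small signed number, not a character), so the formatting code does not have to guess.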

nickgammon


Quote
Looking through the Arduino and AVR libraries I couldn't find any type-definitions except for byte, boolean and word.  Does anyone know where the standard data types are set?  


stdint.h

Inside that:

Code: [Select]
/** \ingroup avr_stdint
    8-bit signed type. */

typedef signed char int8_t;

/** \ingroup avr_stdint
    8-bit unsigned type. */

typedef unsigned char uint8_t;
Please post technical questions on the forum, not by personal message. Thanks!

More info: http://www.gammon.com.au/electronics

bperrybap

The C standard recognizes 3 distinct "char" types and they are treated differently.
char
unsigned char
signed char

sizeof(char) is always 1, but a char isn't always limited to 8 bits.
(its width in bits is CHAR_BIT from limits.h, which can be larger than 8 on some platforms)
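
A short sketch of how the width of a char relates to sizeof, assuming a standard limits.h; the function names are hypothetical:

```c
#include <limits.h>
#include <stddef.h>

/* Width of a plain char in bits on this platform (at least 8). */
unsigned char_width_bits(void)
{
    return CHAR_BIT;
}

/* sizeof counts in units of char, so the width of any object in bits
   is its sizeof multiplied by CHAR_BIT. */
size_t object_width_bits(size_t size_in_chars)
{
    return size_in_chars * CHAR_BIT;
}
```

So sizeof(char) being 1 says nothing about the number of bits; CHAR_BIT does.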

I recently found a case where the value of char is allowed to exceed 0xff.

For example if you do something like:

Code: [Select]
void bar(int x);
void foo(void)
{
    char c = 0;
    while(1)
    {
        c++;
        bar(c);
    }
}


and turn on the optimizer,
bar will receive values starting at 0 marching up to 0xff and then keep going
0x100, ....... 0x8000, all the way up to 0xffff.

But yet if you were to use
unsigned char or signed char or tell the compiler that chars are unsigned by using -funsigned-char
or use c +=1 instead of c++
or make c a global variable, it will wrap at 0xff as is expected.

I just had a big argument about this with the guys over on avrfreaks and
since the C standard says that "results on overflow are undefined" it is not
considered a bug.

What's really going on is the optimizer is seeing the ++ operator
and because it has to promote the 8 bit value to 16 bits for the increment
and also has to promote the 8 bit value to 16 bits for the bar() function, it
assumes that it can just leave it as 16 bits after the increment and pass it
directly to bar().

So this shows that the safest thing to do is to always use
uint8_t or int8_t when you want small numbers and only use char for true characters
because chars can be treated differently than both a signed char and an unsigned char
because they are a separate type to the compiler.
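
If the wrap from 127 to -128 matters to the program, one option is to spell it out instead of relying on what the compiler does at the limit. A minimal sketch in plain C; wrapping_increment is a hypothetical helper name, not something from the Arduino or avr-libc headers:

```c
#include <stdint.h>

/* Increments a signed 8-bit value, wrapping from INT8_MAX to INT8_MIN
   explicitly so the result never depends on overflow behaviour. */
int8_t wrapping_increment(int8_t value)
{
    if (value == INT8_MAX) {
        return INT8_MIN;
    }
    /* value + 1 is at most 127 here, so the cast cannot change it. */
    return (int8_t)(value + 1);
}
```

In the loop above this would be used as c = wrapping_increment(c); in place of c++, at the cost of one extra comparison.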

--- bill

nickgammon


Quote
So this shows that the safest thing to do is to always use
uint8_t or int8_t when you want small numbers and only use char for true characters
because chars can be treated differently than both a signed char and an unsigned char
because they are a separate type to the compiler.


Well, that's bizarre!

I am gobsmacked that the compiler lets an 8-bit variable exceed 255. However your assertion that:

Quote
... the safest thing to do is to always use uint8_t or int8_t ...


isn't correct.

Consider this sketch:

Code: [Select]
void bar(int x)
  {
  Serial.println (x, DEC);
  }
 
void foo(void)
{
int8_t c = 0;
    while(1)
    {
        c++;
        bar(c);
    }
}

void setup ()
{
  Serial.begin (115200);
  foo ();
}

void loop () {}


That uses int8_t. And its output, in part, is:

Code: [Select]

233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274


So basically "char" and "signed char" are the same (as I would have thought they are). And this compiler, ah, "feature" applies to both types.

Let's get real. It's a bug. Changing optimization levels shouldn't change behaviour. Not on something fundamental like adding 1 to a variable.

Quote
... it has to promote the 8 bit value to 16 bits for the increment ...


No, it doesn't. Any more than I have to convert an 8 bit field to a 16-bit field to add 1 to it in assembler. Or promote an int to a long. Or a long to a long long.

bperrybap

Sep 23, 2011, 11:32 am
I agree; I believe it is a case of over-aggressive optimization.
You can change the variable from an automatic to a static and it behaves "normally" by
treating it as an 8 bit value again.
The guys on avrfreaks just kept hiding behind the C standard's very explicit example of "undefined behavior"
being that of integer overflow.
(unsigned values are required to wrap silently)
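
A minimal sketch of the unsigned case, where the wrap is defined by the standard; increment_u8 is a hypothetical helper name:

```c
#include <stdint.h>

/* Increments an unsigned 8-bit value. The sum is computed in a wider
   type, and converting it back to uint8_t reduces it modulo 256,
   so 255 wraps to 0. */
uint8_t increment_u8(uint8_t value)
{
    return (uint8_t)(value + 1u);
}
```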

Feel free to join in the discussion over on avrfreaks:
http://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&t=111837

As far as 8 bits promoted to 16 bits, yes that is a requirement. The requirement is that
when a mathematical operation/expression is to be performed/evaluated the type must be changed to an "int" to perform
the operation. The AVR gcc implementation uses 16 bits for an int so anytime there is a mathematical
operation a promotion to int happens. Just like when calling a function. The value is promoted to an int.
It is then converted back when put into the object.
This is an odd case where the object is allowed to overflow its defined size because the
resulting value is never assigned back to the object.
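
The promotion described here can be observed directly. A small sketch in plain C with hypothetical function names; it only shows that the expression c + 1 has type int, not how any particular optimizer handles c++:

```c
#include <stddef.h>

/* The operand is promoted to int before the addition, so the sum is an
   int and can hold 128 even though an 8-bit signed char cannot. */
int promoted_sum(signed char c)
{
    return c + 1;
}

/* Size of the promoted expression: the same as sizeof(int),
   not sizeof(signed char). The operand of sizeof is not evaluated. */
size_t promoted_size(void)
{
    signed char c = 0;
    return sizeof(c + 1);
}
```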

BTW, this happens on the intel gcc as well. So it is not unique to AVR gcc.

What strikes me as really odd is that ++ is supposed to be the same as += 1
yet it isn't for signed char objects (objects that are smaller than an int).
But then there are very explicit rules for how
to do assignments, that I'm guessing got overlooked with doing the ++ operator on chars.

What will make your head hurt even more is that a char can be wider than 8 bits
on some platforms, but the standard requires sizeof(char) to always be 1.

One thing that the Arduino guys could do is use -funsigned-char and then these strange overflow
"errors" would go away for at least char types.

--- bill


nickgammon

Code: [Select]
char c = 0; 
...
c++;  // point A

bar(c);  // point B


According to limits.h:

Code: [Select]

#define UCHAR_MAX 255 /* max value for an unsigned char */
#define CHAR_MAX 127 /* max value for a char */
#define CHAR_MIN (-128) /* min value for a char */


Thus at the conclusion of the c++ line (point A) the contents of c (being a signed char) must be in the range -128 to +127. Whether or not overflow on addition is defined, the contents of c must be in that range. An 8-bit variable can only contain those values, it doesn't have the capability to contain NaN or Inf or anything else.

Now when you call "bar(c)" bar therefore must be passed a number in the range -128 to +127. I can't agree that "the optimizer" has the option to pass bar other values.
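
That expectation can be written down as a check on the receiving side. A sketch assuming a standard limits.h; fits_in_signed_char is a hypothetical helper that bar() could call on its argument:

```c
#include <limits.h>

/* Returns 1 if value lies in the range a signed char can represent
   (SCHAR_MIN to SCHAR_MAX, i.e. -128 to 127 with 8-bit chars), else 0. */
int fits_in_signed_char(int value)
{
    return value >= SCHAR_MIN && value <= SCHAR_MAX;
}
```

A call such as fits_in_signed_char(x) at the top of bar() would flag the values above 127 that the earlier output shows.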

Having said all that however, I note this:

http://en.wikipedia.org/wiki/Undefined_behavior

And from that page:

Quote
This specifically frees the compiler to do whatever is easiest or most efficient, should such a program be submitted. In general, any behavior afterwards is also undefined. In particular, it is never required that the compiler diagnose undefined behavior -- therefore, programs invoking undefined behavior may appear to compile and even run without errors at first, only to fail on another system, or even on another date. When an instance of undefined behavior occurs, so far as the language specification is concerned anything could happen, maybe nothing at all.


So I suppose that the compiler-writers have an "out" in that, according to this, once you have added 1 to a signed char variable, which currently contains 127, then the entire rest of the program's behaviour (the behavior afterwards) is now, acceptably, undefined.

I submit that this is surprising behaviour, and not really acceptable. For one thing, on a small platform like an Arduino, I would expect that adding 1 to a char variable would involve the compiler generating byte-addition, and not promote the addition to 2-byte addition, which is somewhat more expensive of both time and program space.

Quote
The guys on avrfreaks just kept hiding behind the C standard's very explicit example of "undefined behavior" being that of integer overflow.


It would seem that technically they are right. However I can't help thinking that if you were flying in a Jumbo Jet, and if in some small subsystem of the plane's avionics, an integer overflowed, it wouldn't be acceptable for all subsequent behaviour of that aeroplane to be undefined.

Quote
The AVR gcc implementation uses 16 bits for an int so anytime there is a mathematical operation a promotion to int happens. Just like when calling a function.


In this particular case, the function is being passed a char. There is no basis for a promotion here. Sure, the function receives an int, but the variable being passed is a char.

tim7

Hmm, good to know using ++ vs +=1 affects overflow behaviour.  That would make for some awkward bug-hunting.
