char[] is initialized as long[]

A project I’ve been working on using the Mega 2560 and the SCP1000 SPI sensor has presented an unusual problem.

A portion of the code, followed by an explanation:

void SPI_Interface::write( char *p_Data, int size )
{
	char *i_Data;
	i_Data = p_Data;
	for( int i = 0; i < size; ++i)
	{
		Serial.print(i); //Debug
 		Serial.print(">"); //Debug
 		Serial.print(*i_Data, HEX); //Debug
 		Serial.print("|"); //Debug
		*i_Data = SPI.transfer(*i_Data);
		++i_Data;
	}
}

I’m currently using a pointer for iteration, to try and eliminate the array as the issue. Very simply, each char value in the array is being cast as a long on the Mega 2560. Simply creating the array and then outputting it over Serial.print() produces the same problem, so it appears that when the char value is set, it is immediately a 32bit value where the leading bit is used to fill in the remaining 24bits. I have yet to confirm whether this is the case with unsigned char or byte, but since the leading bit differs between unsigned and signed (it's 0), I believe the behavior is identical (Serial.print() does not print leading 0’s). Has anyone else experienced this issue? Is it present only with the ATmega2560? Any possible solutions? I’ll be converting it back to a std::string from the AVR-STL library, but a char is greatly preferred for resource reasons.

Very simply, each char value in the array is being cast as a long on the Mega 2560.

Where? Why?

Simply creating the array and then outputting it over Serial.print() produces the same problem, so it appears that when the char value is set, it is immediately a 32bit value where the leading bit is used to fill in the remaining 24bits.

No. chars are not 32 bits. They are 8 bits. If you are expecting a char to be 32 bits, you are getting 24 bits of garbage from somewhere else.

Is it present only with the ATmega2560?

No. It is only present in your code.

Any possible solutions?

Proper coding generally works.

Try explaining what you think the problem is, and posting all of your code.

Where? Why?

That's why I'm here asking. I don't know.

No. chars are not 32 bits. They are 8 bits. If you are expecting a char to be 32 bits, you are getting 24 bits of garbage from somewhere else.

I am aware of the fact that a char is 8 bits. That's why the unexplained behavior is odd. It's not garbage. The value I'm working with extends beyond the boundaries of a signed char, wrapping it to a negative. In binary:

|1|0|0|0|0|1|0|0|
 ^First bit for sign

When the ATmega2560 uP converts it to a DWord, it uses the MSB to fill in the remainder of the data creating:

|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|1|0|0|0|0|1|0|0|

This is the issue. Upon creating the char[], the value is returned on serial as a 32bit DWord with leading 1's filled in, producing an incorrect value. When I try to send this data over SPI to read from the 0x21 register, the sensor receives the wrong data. The alternative is that only Serial converts the data to a long, in which case it's a library issue, because the UART communicates in frames of 5-9 bits (8 in this case).

No. It is only present in your code.

Sure, that may be the case, I will test a character array in a separate binary.

Try explaining what you think the problem is, and posting all of your code.

Is my new explanation better? And I am not posting all the code unless you want an extremely long post. If you really need it, I'll use pastebin.

And I am not posting all the code unless you want an extremely long post. If you really need it, I'll use pastebin.

Can you post the code as an attachment? Without it, we can't really help you.

Use "byte" not "char" - that's sign-extension.
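For anyone following along, here is a minimal host-side sketch of what sign extension does to the 0x84 value from the diagram above. It assumes a 32-bit long as on AVR, modelled here with int32_t, and the helper names are hypothetical (they are not part of any Arduino library). The point is that only the widened copy gains the leading 1's; the low 8 bits are still 0x84.

```cpp
#include <cstdint>

// Widen a signed 8-bit value to 32 bits: the sign bit is copied into the upper 24 bits.
std::uint32_t widenSigned(std::int8_t c) {
    return static_cast<std::uint32_t>(static_cast<std::int32_t>(c));
}

// Widen an unsigned 8-bit value to 32 bits: the upper 24 bits are filled with zeros.
std::uint32_t widenUnsigned(std::uint8_t b) {
    return static_cast<std::uint32_t>(b);
}

// Keep only the low 8 bits of a widened value, recovering the original byte.
std::uint8_t lowEightBits(std::uint32_t v) {
    return static_cast<std::uint8_t>(v & 0xFFu);
}
```

With these, widenSigned(-124) gives 0xFFFFFF84 and widenUnsigned(0x84) gives 0x84, while lowEightBits() of either result is 0x84 again.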

Forget the previous code, here is the real issue:

Code:

char x;

void setup() {
  x = 0x80;
  Serial.begin(9600);

}

void loop() {
  Serial.print("Decimal: ");
  Serial.print(x);
  Serial.print(" | Hexidecimal: ");
  Serial.println(x, HEX);
}

Output:

Decimal: € | Hexidecimal: FFFFFF80
Decimal: € | Hexidecimal: FFFFFF80
Decimal: € | Hexidecimal: FFFFFF80
Decimal: € | Hexidecimal: FFFFFF80

When simply passing in x (decimal), the value is printed as an unprintable character (I assume in order to get it to actually print "-128" x would have to be an int, not a char...just the way the Serial library works).

What is interesting, however, is that when printed in hexadecimal, x (which was declared as a char) is interpreted as a 32-bit signed number, while it should be a signed 8-bit number. Is this simply the Serial.print() function upcasting, or am I just completely missing something?

Does this print out what you would expect? :

unsigned char x;

void setup() {
  x = 0x80;
  Serial.begin(9600);

}

void loop() {
  Serial.print("Decimal: ");
  Serial.print(x, DEC);
  Serial.print(" | Hexidecimal: ");
  Serial.println(x, HEX);
}

Decimal: 128 | Hexidecimal: 80 Decimal: 128 | Hexidecimal: 80 Decimal: 128 | Hexidecimal: 80 Decimal: 128 | Hexidecimal: 80 Decimal: 128 | Hexidecimal: 80 Decimal: 128 | Hexidecimal: 80 Decimal: 128 | Hexidecimal: 80 Decimal: 128 | Hexidecimal: 80 Decimal: 128 | Hexidecimal: 80 Decimal: 128 | Hexidecimal: 80 Decimal: 128 | Hexidecimal: 80

Yes, it does print what I expect, but really all it did was hide the problem, not solve it. The point is that x is being printed as if it were a 32 bit number instead of an 8 bit number.

By declaring x as unsigned as you've done in your example, I do in fact see it printed as 0x80, but what I am not seeing are the leading 0's that are likely there. By leaving x as signed, it interprets 0x80, in decimal, as -128 (unsigned it is +128, of course). Thus, I can see that the print function is expanding it to a 32 bit number, because negative numbers have to have leading F's.

I guess what I've found here is that the print() function seems to automatically upcast 8 bit numbers to 32 bit numbers, which bothers me. The reason it bothers me is because I am trying to use SPI.transfer(), and I would like to see, byte by byte, what I am transferring. This upcasting makes it difficult to be sure what is going on when I am debugging my code. It also makes me wonder if the transfer() function does this upcasting too!
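One way to see exactly one byte at a time while debugging is to format it yourself as two hex digits. The following is only a host-side sketch; hexByte is a hypothetical helper, not an Arduino function. It drops the sign before any widening can happen and keeps the leading zero that Serial.print() omits.

```cpp
#include <string>

// Format one byte as exactly two uppercase hex digits, regardless of
// whether plain char is signed on the target.
std::string hexByte(char c) {
    static const char digits[] = "0123456789ABCDEF";
    const unsigned char b = static_cast<unsigned char>(c); // drop the sign before widening
    std::string out;
    out += digits[b >> 4];    // high nibble
    out += digits[b & 0x0F];  // low nibble
    return out;
}
```

On the board itself, the shorter route is probably a cast at the call site, such as Serial.print((byte)x, HEX), which should print 80 rather than FFFFFF80. As far as I know, SPI.transfer() in the Arduino SPI library takes and returns a single byte, so no widening should be involved there.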

The print() method casts char type to long. The compiler is using sign-extension. There is a separate method signature for unsigned char. Look in \hardware\arduino\cores\arduino\Print.cpp

void Print::print(char c, int base)
{
  print((long) c, base);
}

void Print::print(unsigned char b, int base)
{
  print((unsigned long) b, base);
}
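To connect these two overloads to the output reported earlier, here is a rough host-side imitation. It assumes AVR's 32-bit long (modelled with int32_t and uint32_t) and assumes that for HEX the signed value's bits are printed as-is, which is consistent with the FFFFFF80 output above. The names printHex and toHex are made up for this sketch.

```cpp
#include <cstdint>
#include <string>

// Stand-ins for the 32-bit long / unsigned long on AVR (an assumption of this sketch).
using Long = std::int32_t;
using ULong = std::uint32_t;

// Render an unsigned value in hexadecimal without leading zeros.
std::string toHex(ULong n) {
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    do {
        out.insert(out.begin(), digits[n & 0xF]);
        n >>= 4;
    } while (n != 0);
    return out;
}

// Signed path: the cast to Long sign-extends, then the raw bits are rendered.
std::string printHex(signed char c) {
    return toHex(static_cast<ULong>(static_cast<Long>(c)));
}

// Unsigned path: the cast to ULong zero-extends.
std::string printHex(unsigned char b) {
    return toHex(static_cast<ULong>(b));
}
```

Passing a signed char holding -128 takes the first overload and yields FFFFFF80; passing an unsigned char holding 0x80 takes the second and yields 80.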

Thank you! That answers my question.

See reply #4 ("byte" == "unsigned char")

AWOL:
See reply #4

Reply 4 answers what is happening, and how to fix it, but not where the sign extension is happening.

I was curious. I had assumed that print() had a long and unsigned long signature only. I was surprised that it had all those other type signatures. Since all they do is cast to a long, it would be just as easy to let the compiler handle that on the calling side and save the bother of defining all the extra signatures. I’m not a C++ guy though. Maybe there’s an issue with type promotion in C++ that I don’t know about.

Personally I find the print()/println() stuff a failure. Gimme a working printf() and I will be happy. When I was debugging a rather nasty protocol on Arduino, I wound up having to actually debug most of it on a real computer and needed a compatible debug print function. I wound up doing something like:

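#ifdef ARDUINO
#include <stdarg.h>
#include <stdio.h>         // vsnprintf_P (avr-libc)
#include <avr/pgmspace.h>  // PSTR() for format strings kept in flash

// printf-style debug output; the format string s must be in program memory.
void debug_P(const char *s, ...)
{
  va_list ap;
  char buf[40];      // longer output is truncated to fit
  va_start(ap, s);   // va_list first, then the last named parameter
  vsnprintf_P(buf, sizeof(buf), s, ap);
  va_end(ap);
  Serial.print(buf);
}
#endif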

Worked well and was portable to my real computer via #define debug_P printf.

Since all they do is cast to a long

That's not all they do. (it may be all they do with a following ", HEX", but that's not the only case.) I don't know whether C++ will automatically promote types when using an overloaded method. That seems like it would be dangerous. And it would be unfriendly to require the Arduino users to cast the argument...
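On the promotion question, here is a small host-side experiment. The function which() is made up purely for illustration. Under standard C++ overload resolution, an integral promotion (char to int) is preferred over an integral conversion (char to long), so the compiler does pick an overload on its own when one of them is reachable by promotion.

```cpp
#include <string>

// Three overloads, similar in spirit to a print() family.
std::string which(int)           { return "int"; }
std::string which(long)          { return "long"; }
std::string which(unsigned long) { return "unsigned long"; }
```

Calling which('a') or which((unsigned char)0x80) selects the int overload, while which(5L) and which(5UL) select their exact matches. If the int overload were removed, which('a') would not compile: char to long and char to unsigned long are conversions of equal rank, so the call is ambiguous. That is one reason a print() with only long and unsigned long signatures would force callers to cast.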