Error in String.getBytes() ?

Hello
I am a Java developer, so I can't brag about my C++ experience.

I am using Arduino IDE 1.0.1 for Windows

Example code:

void setup(){
  Serial.begin(9600);
  delay(100);
  Serial.println("Serial started");
}

void loop(){
  String message = "Hello World";
  byte bytes[message.length()];
  message.getBytes(bytes, message.length());
  Serial.print(message+" length="+message.length());
  Serial.print(" (");
  for (int i = 0; i < message.length(); i++){
    if (i > 0){
      Serial.print(" "); 
    }
    Serial.print(bytes[i],DEC);
  }
  Serial.println(")");
}

This gives the following output:
Hello World length=11 (72 101 108 108 111 32 87 111 114 108 0)

As you can see, the last byte is always 0

If I am doing something wrong, please let me know

Best regards
Per-Jarle

I have the same problem also with toCharArray() method

Best regards
Per-Jarle

http://arduino.cc/en/Reference/String

Edit: sorry I'm wrong No I wasn't :smiley:

So it should be:

byte bytes[message.length() + 1];
message.getBytes(bytes, message.length() + 1);

length()
Returns the length of the String, in characters. (Note that this doesn't include a trailing null character.)

Since the length excludes the terminating null byte, your array is one element too short. Since getBytes() is smart enough to truncate the string if necessary to fit in the buffer you supply, and to null-terminate it even if it was truncated, what you're seeing is the string minus the last character.

You can also do like this (don't know if it will be helpful to you):

String message = "Hello World";
Serial.print(message+" length="+message.length());
Serial.print(" (");
for (int i = 0; i < message.length(); i++){
  if (i > 0){
    Serial.print(" "); 
  }
  Serial.print(message[i],DEC);
}
Serial.println(")");

Same result, less functions call, less memory used... :slight_smile:

Maybe this changed.....but I have to send a String over an XBEE connection. The way it is send means I have to convert the String to uint8_t[]. I played around with the getBytes and arrays a bit, as I seemed to loose my last character.

String length = 43 characters (as in, counted on screen on println, so probably excluding the trailing zero)
Code:

uint8_t payload[message.length()]; 
message.getBytes(payload, message.length());

Result: last character lost

As suggested above:

uint8_t payload[message.length()+1]; 
message.getBytes(payload, message.length()+1);

Result: A blank space after my message, so no good

Used code:

uint8_t payload[message.length()]; 
message.getBytes(payload, message.length()+1);

Result: String is send, no errors/missing chars/blanks etc

So it seems length() does give the right size array, but somehow replaces the last char with the trailing zero in the getBytes() method???

Result: last character lost

Because your array is not large enough.

Result: A blank space after my message, so no good

Because you are not defining the correct length in the call to getBytes().

Result: String is send, no errors/missing chars/blanks etc

You've gotten lucky.

The array size needs to be based on length()+1. The value in the call to getBytes() needs to be length(). The getBytes function needs to know how many characters it can write to the array.

You really should not be defining the buffer this way, though. The buffer should be statically defined, with a set size. Then, you should use the toCharArray() method to get data out of the String, if you feel that you must abuse the Arduino by using the String class.

Result: String is send, no errors/missing chars/blanks etc

You've gotten lucky.

Hmmm ok, dont wanna argue, but the way I did it consistently yields the correct result (varying String size/chars), so lucky?? I dont know.

You really should not be defining the buffer this way, though. The buffer should be statically defined, with a set size. Then, you should use the toCharArray() method to get data out of the String, if you feel that you must abuse the Arduino by using the String class.

This will be rather difficult, as the message is intended to be a collection of address and measurements from various sensors, which will vary in length. I could take the trouble of formatting every number to a fixed length, but my program is rapidly approaching the memory limit of the Arduino FIO.

Besides, what precisely is the problem/danger in letting the array length depend on String length dynamically, as opposed to static length? And I take it from your reply using String is a bad idea?? Its just that it makes compiling the message easier, as it is composed of different inputs (chars/ints/floats)...

EDIT: PaulS, I tested youre recommendation (which was the last possible combination), and Im sorry to say you're wrong.

With array length+1 and geBytes length, the last character of the string is dropped. The extra byte in the array is filled with random ASCII chars....

perjarle:
I am using Arduino IDE 1.0.1 for Windows

Please note that in versions of the IDE up to and including 1.0.3, the String library has bugs as discussed here and here.

In particular, the dynamic memory allocation used by the String class may fail and cause random crashes.

I recommend reworking your code to manage without String. Use C-style strings instead (strcpy, strcat, strcmp, etc.), as described here for example.

Alternatively, install the fix described here: Fixing String Crashes

Preferably upgrade your IDE to version 1.0.4 or above at: http://arduino.cc/en/Main/Software

Im thinking that maybe instead of explaining how im doing things, I should explain WHAT im trying to do.

Task
-Remote wireless sensor modules take periodic (10 min) measurements, which are send over XBEE radios, eventually going into SQL databases/GUI's etc.
-Transmitted data consists of ints/string/floats of varying sizes. Total payload size is roughly 40-50 bytes (/10 min)
-The base (PC) of the network runs a JAVA program for receiving/decoding the XBEE packets (API mode for those interested, using the XBEE API on both Arduino as well as PC side)

Current solution
Current solution is to convert and concat all data to a string, perform getBytes() and fill the XBEE payload (uint8_t[]) with these bytes.

Why use String?
There are a number of reasons for me to use String. Note that although experienced with Arduino, Im no wizard or something :relaxed:
-Easy to add info (in String format) and easy to concat the arrays
-Easy to extract the bytes and populate payload
Because the payload is of type uint8_t type, it can be filled simply by the String.getBytes() method. Manually inserting ints in the array is even pointless, Id have to convert them to individual bytes, and decode/convert everything on the other side again.
-Easy to send and decode the float measurements.
In my JAVA programming experience, I've had the "pleasure" of reading out sensor registers which stored floats in their original 4 bytes, and had to decode them manually. This was quite frustrating, with the endian and signed/unsigned nature of the bytes involved. Somehow the built in methods didnt work....

Now with Arduino, instead of adding the four bytes to the payload, I just convert the float to String, sdend over the char, and decoding becomes easy at the JAVA side, or even unnecessary (as SQL querys are String as well).

Additionally, the dtostrf(float,dec,min.width,string); method gives the opportunity to trim the floats and specify decimals.

Alternative?
Im sure there are alternatives, but I wouldnt know them (yet), and as this concerns a prototype, im happy it works the way I want it to. If you guys have any suggestions, you're welcome. Ive learned to program mainly in JAVA (LabView before that :sweat_smile:), so quite high level. Although Im confident I can program what I want and need on Arduino, the really lower stuff like malloc...I know what it does and could work with it given some time, but I dont fully understand all the possibilities and limitations.

So lets say that, going by previous comments, I fix the payload array length. Knowing I send over 40-50 bytes only once every 10 minutes, what changes are really worth it, if at all?

I appreciate your thoughts.

jzuidema:
Hmmm ok, dont wanna argue, but the way I did it consistently yields the correct result (varying String size/chars)

Except it doesn't, does it? You said that one of your attempts truncated the string, and the other one appended a spurious space. Both of these are direct consequences of the bugs in the way you declare the array and copy the received bytes in it.

I suggest you drop all use of the String class and use plain old c-strings (null-terminated char arrays). It's no harder than using Strings, avoids the possibility of memory corruption introduced by the String class, and avoids the messy copying between Strings and c-strings which is causing you the problem here.

PeterH:

jzuidema:
Hmmm ok, dont wanna argue, but the way I did it consistently yields the correct result (varying String size/chars)

Except it doesn't, does it? You said that one of your attempts truncated the string, and the other one appended a spurious space. Both of these are direct consequences of the bugs in the way you declare the array and copy the received bytes in it.

I suggest you drop all use of the String class and use plain old c-strings (null-terminated char arrays). It's no harder than using Strings, avoids the possibility of memory corruption introduced by the String class, and avoids the messy copying between Strings and c-strings which is causing you the problem here.

You didnt understand me than. I told of various combinations of using length() and length()+1 in declaring the array and filling it, with varying results. I also told that my combination (array length(), getBytes() length+1) worked well consistently, and still does.

What didnt work was the advice given by PaulS, which resulted in empty trailing bytes with gibberish (array length()+1, getBytes() length()). So I still dont understand why I should not be using it this/my way, when the adviced ways, which apparently should result in the proper behavior, dont work. Note Im using 1.04, so the String errors should be fixed right?

As for using char arrays, Ill look into that in a later stage. Aside from memory use, I dont really see, even with given advice in this thread and elsewhere, why I should switch perse?! Anyways Ill look into ways of replacing the methods I use for String (getBytes()/concat etc) for methods using char arrays if you guys insist... :slight_smile:

The bug may be gone but there is such a thing as memory fragmentation. With only 2 Kb of RAM this isn't too good a thing to be courting.

With a lot of use of String, including concatenation, you may well end up with fragmentation. It isn't a bug and can't be fixed as such.

Parsing serial input is quite straightforward. Try reading this:

jzuidema:
You didnt understand me than. I told of various combinations of using length() and length()+1 in declaring the array and filling it, with varying results. I also told that my combination (array length(), getBytes() length+1) worked well consistently, and still does.

What didnt work was the advice given by PaulS, which resulted in empty trailing bytes with gibberish (array length()+1, getBytes() length()). So I still dont understand why I should not be using it this/my way, when the adviced ways, which apparently should result in the proper behavior, dont work. Note Im using 1.04, so the String errors should be fixed right?

It's unfortunate that the people who wrote the documentation for the Arduino runtime environment didn't take the trouble to document the API properly so we need to look at the source code to find out what these methods actually do.

When you declare the char array to hold the string content you call the String::length() method. This returns the length excluding the null terminator. (This is the same behaviour as strlen(); it's not unreasonable, but note that your char array needs to be one char longer that this in order to hold the null terminator. Your payload array should be declared as:

uint8_t payload[message.length()+1];

When you copy the characters out of the message, you should supply the actual length of the char array. String::getBytes will populate the first n-1 bytes with the string content and then append the null terminator. If you supply a length which is less than message.length() + 1 then the string would be truncated. If you supply a length which is greater than the actual length of the array then getBytes() may trample over memory outside the bounds of your payload array.

All of these tedious issues would disappear completely if you simply get rid of all uses of String and use c-strings in the first place. Unless you come from a Java background and want an object oriented security blanket to hide you from those scary char arrays, there is no reason for the String class to exist as far as I see. (I am very familiar with Java, and I can see the lengths the Arduino developers have gone to to make C++ behave more like Java, and their reasons for doing so are IMO completely unsound. This sort of stuff is introducing more problems than it solves.)