Problem with comparing Strings from Serial

system · September 23, 2014, 5:53am

Hi!

I have a Serial connection from which I read characters. After reading 4 characters, I want to compare them to a String.
The message I'm sending is "", but because of start and end characters (explained in the code) the message I want for Arduino to receive after the reading method is "blue", however, it doesn't seem to work.

int i=Serial.read(); // checks the first character
if...// some if statements

else if (i == 60) { // the messages have a "start" character and an "end" character, so the message sent looks like "<blue>",  and   this else if checks whether or not the first character is '<'
    char input[4]; // the message is always 4 characters long

 while(j<4){
    	if(Serial.available()>0){ // reads the message into the buffer
    		input[j]=Serial.read();
    		j++;
    	}
    }
    i = softSerial.read(); // gets rid of the end character, redundant for now

    String s(input); 
    Serial.println(s);
    Serial.println(s.length());

    if(s.equals("blue")){
    	Serial.println("ITS BLUE!");
    }
    if(strcmp(input, "blue")==0){
    	Serial.println("ITS BLUE!");
    }
  }

The serial output from the commands before the if statements is the following:

blue

4

... but none of If statements are "true" unfortunately, even though the message received seems to be correct. How can I fix this?

BulldogLowell · September 23, 2014, 6:50am

to compare strings with a small "s" you could use strncmp(string1, string2, num)

if (strncmp(s, "blue", 4) == 0) { // returns zero if both are equal...

where num is the number of positions to compare

system · September 23, 2014, 7:25am

Isn't the default behaviour to compare the whole string if a num argument isn't passed?

nickgammon · September 23, 2014, 7:49am

    char input[4]; // the message is always 4 characters long

In which case you need one more byte for a null terminator, and to put it there after 4 bytes are received.

system · September 23, 2014, 8:03am

A string must be terminated by a zero. I am not sure yours are.

Sorry I got irritated by,
int i = Serial.read(); and did not go much further.

Serial.read returns a char, so you should assign it to a char, to avoid introducing bugs which are difficult to see.

char ch = Serial.input();

You can then write conditions which are easy to read, rather than the sprawling comments to explain them.

if (ch == '<' ) startChar = true;

And...Why not use the delimiters in the comparison?

if ( strcmp(input, "<blue>") == 0) gotBlue();

But, strings still have to be terminated by a zero.

system · September 23, 2014, 8:11am

Serial.read returns a char, so you should assign it to a char, to avoid introducing bugs which are difficult to see.

It does NOT. It returns an int, so that it can tell you that there was nothing to read, when you try to read without checking, as OP does. The code in OP's post IS correct. Yours is not.

And...Why not use the delimiters in the comparison?

if ( strcmp(input, "") == 0) gotBlue();

Because the start and end markers are not being stored.

But, strings still have to be terminated by a zero.

And THAT is why OP's tests fail.

system · September 23, 2014, 10:12am

PaulS:

Serial.read returns a char, so you should assign it to a char, to avoid introducing bugs which are difficult to see.

It does NOT. It returns an int, so that it can tell you that there was nothing to read, when you try to read without checking, as OP does. The code in OP's post IS correct. Yours is not.

How many bytes does Serial.read() return, one or two?

And...Why not use the delimiters in the comparison?

if ( strcmp(input, "") == 0) gotBlue();

Because the start and end markers are not being stored.

Well I could just keep responding 'why' but it would not be very constructive.

The delimiters have to be read, so they could be stored. A tried and trusted technique for processing an asynchronous comms stream, is to buffer at least as much of the stream as it takes to hold an envelope and the message it contains. By looking for the envelope, you infer the message inside is complete, before sending it to the next block of logic. In this case the envelope is being created by the delimiters.

But, strings still have to be terminated by a zero.

And THAT is why OP's tests fail.

Yes but it is far from being the only issue with his code and coding style.

system · September 23, 2014, 10:28am

How many bytes does Serial.read() return, one or two?

If there is data to read, the data is in the low order byte of a two byte variable.

Well I could just keep responding 'why' but it would not be very constructive.

They are not stored because there is no reason to store them. When the end one arrives, it makes little difference if that was stored includes them or not. Since the reason for sending packeted data was to use the value on a PWM pin, the array needs to be converted to an int. If the array contains "128", this is easy. If it contains "<128>", is it more difficult. On a scale of 1 to 10, it's only about a 1.1, but since not storing the start and end of packet markers is so easy, why make life more difficult than needed?

as it takes to hold an envelope and the message it contains. By looking for the envelope, you infer the message inside is complete, before sending it to the next block of logic. In this case the envelope is being created by the delimiters.

If you KNOW what defines the envelope, there is no reason to store it.

Now, if the envelope was defined by and , you'd have a good point that it was important to store the envelope data, so you could also determine the envelope type. But, when there is only one envelope type, your argument looses starch.

system · September 23, 2014, 1:22pm

PaulS:

How many bytes does Serial.read() return, one or two?

If there is data to read, the data is in the low order byte of a two byte variable.

Am I supposed to laugh at that?

The documentation says, Serial is derived from stream, which reads characters,

Indeed, Serial.read() will return you 8 bits, which is a byte, which is type equivalent to a char. Pretty much the same behaviour you would expect from every other arbitrary stream class you might come across, in whatever programming environment you came across it.

For a moment there I thought the Arduino might be doing something horribly idiosyncratic but no, it's just your explanation which is out of kilter. The Arduino returning -1 (0xFF) to indicate an empty buffer is a bit odd, as it merely duplicates Serial.available() and creates a need to further encode messages containing 0xFF.

They are not stored because there is no reason to store them.

Do you mean you can construct an argument in which there is no reason to store them? Yes, I agree you can do that.

Since the reason for sending packeted data

I have never heard anyone who works in data comms say "packeted." Packetised, or packetized if you must.

If the array contains "128", this is easy. If it contains "<128>", is it more difficult. On a scale of 1 to 10, it's only about a 1.1, but since not storing the start and end of packet markers is so easy, why make life more difficult than needed?

A nicely constructed argument. However, the stream contains "" and making the array contain "blue" will cost you conditional operations. I think your way is 1.4 more difficult.

But, when there is only one envelope type, your argument looses starch.

I did not respond to the original post, to try to score points in an argument.

1.4 is just a number I made up, by the way.

system · September 23, 2014, 1:50pm

In HardwareSerial.h:

    virtual int read(void);

So, Serial.read() returns an int. It has to, because EVERY value in a byte could be a legal value, and the read() method needs some way to tell you that the function couldn't read anything because there was nothing to read.

Indeed, Serial.read() will return you 8 bits,

This is where you are flat assed wrong. It returns 16 bits, so it can encode an error value in the high order byte. If it worked, the high order byte is 0, and the data of interest is in the low order byte. If it failed, the error value is in the high order byte, and the low order is junk.

it's just your explanation which is out of kilter.

No. It is your understanding that is out of kilter.

The Arduino returning -1 (0xFF) to indicate an empty buffer is a bit odd, as it merely duplicates Serial.available() and creates a need to further encode messages containing 0xFF.

So, what is it supposed to do if the user doesn't use Serial.available()? Pretend that all is right with the world? Fat chance of that being the right answer.

nickgammon · September 23, 2014, 8:44pm

MattS-UK:
Am I supposed to laugh at that?

The documentation says, Serial is derived from stream, which reads characters,
Serial.read() - Arduino Reference
Stream.read() - Arduino Reference

It's always wise to go back to the source before making claims like that:

class HardwareSerial : public Stream
{
...
    virtual int peek(void);

Never mind the documentation, that's been known to be vague or wrong.

PaulS is perfectly correct. In HardwareSerial, if a read cannot read anything it returns -1 (0xFFFF in an int) which distinguishes it from reading 0xFF which is a perfectly valid piece of data to read:

int HardwareSerial::read(void)
{
  // if the head isn't ahead of the tail, we don't have any characters
  if (_rx_buffer->head == _rx_buffer->tail) {
    return -1;
  } else {

In fact the documentation you quoted at Serial.read() - Arduino Reference says:

Parameters

None
Returns

the first byte of incoming serial data available (or -1 if no data is available) - int

So the documentation correctly reports that read returns an int.

mrburnette · September 24, 2014, 12:52am

I pretty much do the same in my serial calculator:
http://forum.arduino.cc/index.php?topic=147550.0

Ray

zoomkat · September 24, 2014, 1:01am

Some simple serial String comparison code.

// zoomkat 8-6-10 serial I/O string test
// type a string in serial monitor. then send or enter
// for IDE 0019 and later

int ledPin = 13;
String readString;

void setup() {
  Serial.begin(9600);
  pinMode(ledPin, OUTPUT); 
  Serial.println("serial on/off test 0021"); // so I can keep track
}

void loop() {

  while (Serial.available()) {
    delay(3);  
    char c = Serial.read();
    readString += c; 
  }

  if (readString.length() >0) {
    Serial.println(readString);

    if (readString == "on")     
    {
      digitalWrite(ledPin, HIGH);
    }
    if (readString == "off")
    {
      digitalWrite(ledPin, LOW);
    }

    readString="";
  } 
}

system · September 24, 2014, 11:32pm

True.

HardwareSerial.cpp:
// ...We're
// using a ring buffer (I think), <--just LOL.

It would be wise for me to be less flippant at times too but hey, I am not at work here.

I identified the OPs issue on the first line of my first post on this thread, by the way.

PaulS is perfectly correct.

I do have to laugh at the response my half dozen lines, to a beginner's question, got me from PaulS, who I willingly concede knows more about C than I do. PaulS is literally correct but practically, I think he is correct, in the same way someone repeating a poor bit of advice word for word, is perfectly correct.

To answer PaulS question directly, not checking Serial.available() should not be an acceptable answer. You might recall the Heart-Bleed exploit was caused by a developer neglecting to check the length of a buffer.

In HardwareSerial, if a read cannot read anything it returns -1 (0xFFFF in an int) which distinguishes it from reading 0xFF which is a perfectly valid piece of data to read:[/quote]

I prefer to think of it the same as reading any other sort of array.  Valid data is distinguished by it's presence within the boundaries of the array.  Attempting to read beyond the extents of an array or a buffer, creates an error condition which produces  garbage.

The salient function in it's entirety.
[code]
int HardwareSerial::read(void)
{
  // if the head isn't ahead of the tail, we don't have any characters
  if (_rx_buffer->head == _rx_buffer->tail) {
    return -1;
  } else {
    unsigned char c = _rx_buffer->buffer[_rx_buffer->tail];
    _rx_buffer->tail = (unsigned int)(_rx_buffer->tail + 1) % SERIAL_BUFFER_SIZE;
    return c;
  }
}

The only time the int is of any significant is after an error has occurred. Attempting to read beyond the extents of the buffer, creates an error condition and we end up with garbage corrupting the data stream.

So the documentation correctly reports that read returns an int.

Fair enough, I am wrong but am I wrong? I am going to stick with reading streams as they were written, in the order they were written and never more than was written. I remain irritated by int i = byteStream.read() which, in my opinion, breaks the stream paradigm. I think we are well beyond helping beginners at this point though.
[/code]

system · September 25, 2014, 11:20am

To answer PaulS question directly, not checking Serial.available() should not be an acceptable answer.

I never said it was. All I did was point out that Serial.read() has no way of knowing whether the caller checked that there was something to read, so it MUST have the ability to say "Hey, stupid, there was nothing to read".

The only time the int is of any significant is after an error has occurred. Attempting to read beyond the extents of the buffer, creates an error condition and we end up with garbage corrupting the data stream.

Reading beyond the end of the buffer is not possible. The read() function doesn't permit that. Even if it were possible, doing so does not corrupt the data. It simply returns garbage to the caller.

Fair enough, I am wrong but am I wrong?

Yes. You are failing to consider that the read() method can not know that the user is intelligent (and called read() only if available() returned a positive value) or is an idiot (and didn't bother checking available()).

nickgammon · September 25, 2014, 11:27am

MattS-UK:
I remain irritated by int i = byteStream.read() which, in my opinion, breaks the stream paradigm.

I don't know what is so irritating about that, because fgetc from the Standard C library does the same thing:

http://www.cplusplus.com/reference/cstdio/fgetc/

The return type is int to accommodate for the special value EOF, which indicates failure:

Woe betide the programmer who doesn't read the documentation, nor understand why the return type is int rather than char.

system · September 27, 2014, 1:29pm

PaulS:
All I did was point out that Serial.read() has no way of knowing whether the caller checked that there was something to read, so it MUST have the ability to say "Hey, stupid, there was nothing to read".

There is no MUST about it. You are rationalising one particular behaviour, in one particular way. It is not the only way. In a different programming environment a similar failure mode might produce an access violation, or it might return 0 and set an error flag in a stream reading object.

I regard the int -1 as (just) garbage, so the behaviour of a stream remains conceptually consistent, wherever the stream happens to be read from, whichever programming environment I happen to be reading it with. The value of garbage is never significant.

The only time the int is of any significant is after an error has occurred. Attempting to read beyond the extents of the buffer, creates an error condition and we end up with garbage corrupting the data stream.

Reading beyond the end of the buffer is not possible.

I said it was the 'attempt' to read, which creates the failure mode, which produces garbage; the distinction is quite important. The stream was produced by writing byte values. Any int value appearing in the stream when it is read is, (pretty obviously,) a corruption of the data stream.

The read() function only gets away with assigning an unsigned char to an int because C is so loosely typed. However, because C is so loosely typed, I can still presume the read() function returns a byte and over running the buffer produces garbage.

Fair enough, I am wrong but am I wrong?

Yes. You are failing to consider that the read() method can not know that the user is intelligent (and called read() only if available() returned a positive value) or is an idiot (and didn't bother checking available()).

I am not failing to consider anything. What you are saying here makes very little sense though. The intelligence needed to check for an int being returned is about the same as what's needed to check serial.available() What differs is how we, as developers, visualise the problem and the solution. I find the concept of a stream being a stream, too useful to want to break it.

Have a read of this

Your way of looking at things is not going to be compatible with the 2nd paragraph.

I am happy to agree we disagree, at this point.

system · September 27, 2014, 2:04pm

In a different programming environment

When you have a different programming environment for the Arduino, we can discuss what Serial.read() might do in that environment.

or it might return 0

0 is a perfectly valid value. It can not be used to indicate an invalid value.

I am not failing to consider anything.

Sure you are. You are failing to consider to limitations of the Arduino (and other systems that do not have exception handling capabilities).

I am happy to agree we disagree, at this point.

Fine.

nickgammon · September 27, 2014, 9:35pm

MattS-UK:
I regard the int -1 as (just) garbage, so the behaviour of a stream remains conceptually consistent, wherever the stream happens to be read from, whichever programming environment I happen to be reading it with. The value of garbage is never significant.

It's not garbage, because the function is defined to return -1 if there is no data. Thus it is very significant. You could argue that -2 is garbage, because the function is not defined to return that.

In my copy of the K&R book (2nd edition, page 151) they mention "a text stream" (so we are talking about streams here) and then goes on to say that "getchar returns the next input character each time it is called, or EOF when it encounters end of file."

Now of course, serial input on the Arduino does not have an "end of file" as such, but the concept of a stream of characters being returned (ie. 0x00 to 0xFF) along with an "out of band" character, namely EOF or -1, seems to be established in computing since 1978.

The read() function only gets away with assigning an unsigned char to an int because C is so loosely typed.

It is absolutely nothing to do with that. There is no objection to assigning a char to an int. The values 0 to 255 can fit into a variable which can hold -32768 to +32767. What is the problem?

Topic		Replies	Views
Read and compare string form serial port Programming	40	61875	May 5, 2021
Reading text from serial and compare the input always fails Programming	17	219	August 30, 2025
Simple char array wierdness Programming	22	229	June 3, 2025
Char is not signed - no reference in the documentation Nano ESP32	69	303	September 3, 2024
Moving from Serial.readstring() to Serial.read() for strings Programming	37	2955	February 14, 2023

Problem with comparing Strings from Serial

Related topics