How to slice/split a string by delimiter

Hello, I'm building a web server and I want to get just the first line of the header my client sends (the part with the GET URL information).

I'm trying to split the header at the first \n and discard everything after it, but I cannot figure out how.

In normal C++ I could do this using

std::string mystr = header.substr(0, header.find("\n", 0));

but that does not work in the Arduino IDE.

Please help.
Thank you

If you want to use the Arduino version of the String class, look at String() - Arduino Reference

If you want to use the Arduino version of the String class, you should have your head examined, and, if you persist in the fantasy that the Arduino has oodles of memory, look at String() - Arduino Reference

There, I fixed that for you.

PaulS:

if you want to use the Arduino version of the String class, you should have your head examined, and, if you persist in the fantasy that the Arduino has oodles of memory, look at String() - Arduino Reference

There, I fixed that for you.

I beg to differ. This advice was OK when most users were using an Arduino say of the AVR class with 1 or 2 k of RAM. Now the Arduino IDE supports devices with around 80k of RAM like ESP or ARM class processors. This will only improve.

Anyway, the Arduino core web server classes for the ESP8266 use the String class, so you are stuck with it even if you don't like it.

I know that some people have difficulty accepting that all their old knowledge is now of no use. All that desperate fiddling about with character arrays to try to extract a few bytes of data etc. I've seen this phenomenon in industry when I still had to work with people clinging to their old technologies like Cobol, IBM mainframes, Lotus Notes etc. etc. instead of accepting that times have moved on and learning something new.

Anyway, let's face it, the Arduino environment was designed to support even people who just want to light a few leds and basic things like that and why should they be forced to use tortuous methods to achieve simple things?

6v6gt:
If you want to use the Arduino version of the String class...

Wow. No thanks.

There is no better way to shoot yourself in the foot than by using String. Not only that, you'll wonder where the shot came from. It comes at random times. Sometimes, it manifests as a different injury. Fun.

benskylinegodzilla:
split the header at the first \n and discard everything else

I could answer this specific question with code like this:

#include <string.h>   // strchr

const size_t MAX_HEADER_SIZE = 512;  // assumed size, pick one that fits your requests
char header[MAX_HEADER_SIZE];

void something()
{
  // Find a specific char in the header array
  char *newlinePtr = strchr( header, '\n' );

  if (newlinePtr != nullptr) {
    // It's pointing at the newline char

    *newlinePtr = '\0'; // terminate the string here

    // Later characters are still in the array, but
    //   this C string (NOT the String class) has been
    //   terminated at the first newline char.
    // Other C string functions will stop here,
    //   ignoring the remaining contents.
  }
}

However, this is probably the wrong question.

In this environment, it is very unlikely that you can save an entire "thing" and then process it all at once. Because there is limited RAM, you will almost certainly have to process "things" as they are received, gradually accumulating only a few key pieces of information. BONUS: this approach is also faster.

So I have to ask:

  1. What is in the header that you really want?

  2. Do you have some code that is reading this stream of bytes?

Just use strtok (STRing TOKen).
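A minimal sketch of what that could look like, assuming the whole header is already in a writable char array. The requestPath name is made up for illustration. Note that strtok modifies the array (it writes '\0' over the delimiters) and keeps hidden state between calls.

```cpp
#include <string.h>

// Hypothetical helper: cuts the header at the first line ending and
// returns the path part of "GET /path HTTP/1.1", or nullptr if absent.
// The header array is modified in place.
char *requestPath(char *header)
{
  // First token up to '\r' or '\n' is the request line
  char *line = strtok(header, "\r\n");
  if (line == nullptr)
    return nullptr;

  // Restart tokenizing on that line, this time splitting at spaces
  char *method = strtok(line, " ");
  if (method == nullptr)
    return nullptr;

  // The token after the method ("GET") is the path
  return strtok(nullptr, " ");
}
```

With the request shown in this thread, the returned pointer would point at "/HourRed/ColVal=200" inside the original array, so no extra storage is needed.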

6v6gt:
I beg to differ. This advice was OK when most users were using an Arduino say of the AVR class with 1 or 2 k of RAM. Now the Arduino IDE supports devices with around 80k of RAM like ESP or ARM class processors. This will only improve.

Then you would be wrong. This is not just my opinion, this is an industry position. You've been here long enough to have seen this information. Microsoft says "No, not even on servers with GB of RAM." If nothing else, read The Evils of Arduino Strings.

Still think it's ok to use the String class? Willful Ignorance.

I know that some people have difficulty accepting that all their old knowledge is now of no use.

This is not an attempt to preserve "old ways". If you read the information above, you'll see that it all says to "move on" from random-size dynamic memory allocations that are used by String.

This is an attempt to help new users avoid the frustration of going down the wrong path.

the Arduino environment was designed to support even people who just want to light a few leds and basic things like that

Implementing a web server is quite a bit more complicated than lighting a few LEDs. The required libraries use significant portions of RAM, leaving too little room for String to maneuver.

With more information from the OP, we can suggest any number of "modern" strategies: Finite-state Machines, memory pools, string classes with deterministic behavior, streaming operators, etc. Until then, recommending a basic approach is all we can do.

Cheers,
/dev

Microsoft says "No, not even on servers with GB of RAM."

I don't believe that this is a valid argument. I doubt if Microsoft has cribbed the Arduino implementation of the String class to use in its library functions.

Implementing a web server is quite a bit more complicated than lighting a few LEDs. The required libraries use significant portions of RAM, leaving too little room for String to maneuver.

The esp8266 Arduino core web server class for example does use the String class. See here.

Anyway, the same arguments apply to anything which makes the run time unpredictable which would rule out recursion, automatic variables, the new operator etc. etc. and as the amount of available RAM increases with new processor models, so these arguments become less relevant.

But I see this all as a religious argument and of course I've seen most of those links. It's like listening to people complaining about new design rules which prevent them creating huge suites of IBM 360 Assembler code with all the arguments about efficiency and that a high level language compiler can't be trusted to produce optimal code etc. etc.

One thing I am certain about is that similar discussions will continue to rage on until the older generation of programmers has died out to be replaced with a younger generation who want nothing more complicated than dragging and dropping icons to create code.

One thing I am certain about is that similar discussions will continue to rage on until the older generation of programmers has died out to be replaced with a younger generation who want nothing more complicated than dragging and dropping icons to create code.

It is unfortunate that all that that generation WILL be able to do is drag-and-drop existing code blocks. Forget about creating anything new, then.

I don't believe that this is a valid argument. I doubt if Microsoft has cribbed the Arduino implementation of the String class to use in its library functions.

Then you haven't read the article. It is referring to the underlying dynamic memory, which is used by the String class. Belief and doubt have no place in rational discussion. Unless....

I see this all as a religious argument

Oh. I thought this was a factual discussion, with supporting material from a variety of sources.

the same arguments apply to anything which makes the run time unpredictable which would rule out recursion, automatic variables, the new operator etc.

Then you really don't understand determinism. Recursion, auto variables and the new operator can all be used in a deterministic way. Random-sized heap allocation and unlimited growth is not deterministic.

until... replaced with a younger generation who want nothing more complicated than dragging and dropping icons to create code.

Then why complain when we offer something more complicated, yet more stable and efficient? Fight the future!

@benskylinegodzilla, it is not a good idea to use the String (capital S) class on an Arduino as it can cause memory corruption in the small memory on an Arduino. Just use cstrings - char arrays terminated with 0.

The parse example in Serial Input Basics should get you started.

...R

-dev:
In this environment, it is very unlikely that you can save an entire “thing” and then process it all at once. Because there is limited RAM, you will almost certainly have to process “things” as they are received, gradually accumulating only a few key pieces of information. BONUS: this approach is also faster.

So I have to ask:

  1. What is in the header that you really want?

  2. Do you have some code that is reading this stream of bytes?

First, I'm only just starting with the ESP8266, so I am using whatever references I can find, and they were using String to store the information from the stream.

1) What I want is to get the HTTP request information sent when I click any of the buttons.
The value in the HTTP request is created by combining values from sliders on the page.

GET /HourRed/ColVal=200 HTTP/1.1
Host: 192.168.1.50
Connection: keep-alive
User-Agent: Mozilla/5.0 (Linux; Android 8.0.0; SM-G950F Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.109 Mobile Safari/537.36
Save-Data: on
Accept: image/webp,image/apng,image/*,*/*;q=0.8
Referer: http://192.168.1.50/HourRed/ColVal=200
Accept-Encoding: gzip, deflate
Accept-Language: en-NZ,en-GB;q=0.9,en-US;q=0.8,en;q=0.7

is what is stored in the header string (or character array, if I work out how to use that with the ESP). I just want the part
GET /HourRed/ColVal=200 HTTP/1.1

where 200 could be any number from 0 to 255.

2) Yes, there is a stream that gets the information.
The client reads the HTTP request byte by byte.

I'll provide a snippet of the server update function that runs while a client is connected, because I'm still learning how some of it works and don't know which parts will be useful. It will also explain what is going on better than I could.

void ClockServer::serverUpdate()
{
  WiFiClient client = server.available();   // Listen for incoming clients

  if (client)   // If a new client connects,
  {
    Serial.println("New Client.");          // print a message out in the serial port
    String currentLine = "";              // make a String to hold incoming data from the client
    String header;
    while (client.connected())              // loop while the client's connected
    {
      if (client.available())               // if there's bytes to read from the client,
      {
        char c = client.read();             // read a byte, then
        Serial.write(c);                    // print it out the serial monitor
        header += c;
        if (c == '\n')                      // if the byte is a newline character
        
        {
          // if the current line is blank, you got two newline characters in a row.
          // that's the end of the client HTTP request, so send a response:
          if (currentLine.length() == 0)
          {
            // HTTP headers always start with a response code (e.g. HTTP/1.1 200 OK)
            // and a content-type so the client knows what's coming, then a blank line:

            Serial.println(header); //This is where I see what is in the header

           client.println("HTTP/1.1 200 OK");
           client.println("Content-type:text/html");
           client.println("Connection: close");
           client.println();
          


          //HTML body Code goes here
          client.println("<!DOCTYPE html>");
          client.println("<html>");
          // ...
          // This is where all the HTML code is.
          // There is one line that sends the HTTP request:
          client.println("    <a href=\"/HourRed/\" onmouseup=\"location.href=this.href+'ColVal='+sliderHourRed.value;return false;\">");
          // ...

          //END OF HTML
          // The HTTP response ends with another blank line
          client.println();
          
         // Break out of the while loop
          break;
          
          } else // if you got a newline, then clear currentLine
          {
            currentLine = "";
          }

        } else if (c != '\r')     // if you got anything else but a carriage return character,
        {
          currentLine += c;      // add it to the end of the currentLine
        }
      }
      Serial.println("Client stillConnected");
    }
  // Clear the header variable
  header = "";
  // Close the connection
  client.stop();
  Serial.println("Client disconnected.");
  Serial.println("");
  }
}

By the way, I don't know what everyone is going on about with Arduino Strings and running out of memory, but I have changed my Strings to char*.
Is that any better?

I guess the easiest is to use the arg() method of the ESP8266WebServer class (if you are already using that class - it’s not clear from your code extract)
This parses a standard formatted query string list.
You’ll have to change this line:

client.println("    <a href=\"/HourRed/\" onmouseup=\"location.href=this.href+'ColVal='+sliderHourRed.value;return false;\">");

to yield a URL part in the format:
/HourRed?ColVal=200

Then you can use the sample code here:

There are of course many other ways of doing this.
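One such other way, sketched under assumptions: if the request line is available as a C string, a named value can be pulled out of the query part by hand. The queryValue helper below is hypothetical and is not how the ESP8266WebServer library does it. It only handles plain decimal values and returns -1 when the name is missing.

```cpp
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: returns the integer value of "name=value" in the
// query part (after '?') of a request line, or -1 if it is not present.
int queryValue(const char *requestLine, const char *name)
{
  const char *query = strchr(requestLine, '?');
  if (query == nullptr)
    return -1;                       // no query string at all

  size_t      nameLen = strlen(name);
  const char *p       = query + 1;   // start of the first key=value pair

  // The query ends at the space before "HTTP/1.1" (or at end of string)
  while (*p != '\0' && *p != ' ') {
    if (strncmp(p, name, nameLen) == 0 && p[nameLen] == '=')
      return atoi(p + nameLen + 1);  // atoi stops at the first non-digit

    // Not this pair, skip to the next one
    while (*p != '\0' && *p != ' ' && *p != '&')
      p++;
    if (*p == '&')
      p++;
  }
  return -1;
}
```

For the URL format suggested above, queryValue("GET /HourRed?ColVal=200 HTTP/1.1", "ColVal") gives 200.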

It is unfortunate for you that your post got caught up in the eternal discussion here about the String class.

Edit:
Incidentally, the arg() method returns a value of type String.

Next I guess, we'll be audience to a discussion of why all C code should be replaced by C++. Ruby On Rails was the next greatest thing too, wasn't it? The new generation doesn't know how to develop programs, they are taught boiler-plate coding, which is essentially what all the latest and greatest languages are. There are still only 6 that compile and when you can fit all that 'free memory' on the head of a pin I'll start using it. Even the M0, this week's heart-throb, only gives you 32K flash and 8K RAM. Not enough for memory-hungry applications.

Getting back to the OP’s issue, the attached is a String handler that I used some time ago for parsing CSV GPS data. You could tweak it to deal with other separators.

It presumes

byte commaCount;
String msgField[25]; // large enough to handle the guessed-at number of fields in a full message
String fullMsg = "";

and that whatever you wish to parse is in fullMsg.

//===============================================
void getCSVfields()
{
    byte sentencePos =0;
    commaCount=0;
    msgField[commaCount]="";
    while (sentencePos < fullMsg.length())
    {
        if(fullMsg.charAt(sentencePos) == ',')
        {
            commaCount++;
            msgField[commaCount]="";
            sentencePos++;
        }
        else
        {
            msgField[commaCount] += fullMsg.charAt(sentencePos);
            sentencePos++;
        }
    }
}
//===============================================

DKWatson:
Ruby On Rails was the next greatest thing too, wasn't it?

Isn't it still?

To be honest I reckon they could have stopped at the version just before V2 and it would have been good enough and much less complex.

...R

DKWatson:
String msgField[25];
String fullMsg = "";

Not just String, not just arrays of String, but unsafe array usage. Sigh. If only there were a better way to parse GPS data…

@benskylinegodzilla, here are two ways to watch for the magic ColVal in the incoming stream of bytes. Both sketches use Serial for the client so you can simply paste the HTTP request into the Serial Monitor window and press Send.


Line-oriented processing with C string functions

One line is accumulated at a time. Then the line is searched with strstr for the special match string. If a match is found, it steps over the match string to the value characters and uses atoi to convert the digits into an int.

Once the match string and color value are found, it skips the rest of the request until the two newlines are found.

#define client Serial

void setup()
{
  client.begin( 9600 );
}

const char     match[]       = "GET /HourRed/ColVal="; // watch for this
      uint8_t  redColorValue = 0;



void loop()
{
      if (client.available())
      {
        char c = client.read();
        Serial.write(c); // echo for testing

        if (parseRequest( c )) {
          sendResponse();
        }
      }

}


//  Some variables to receive a line of characters
const size_t   MAX_CHARS          = 64;
      char     line[ MAX_CHARS ];
      size_t   count              = 0;
      bool     previousWasNewline = false;
      bool     colorReceived      = false;

bool parseRequest( char c )
{
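  // HTTP lines end with "\r\n". Skip the '\r' so that a blank line
  //   still shows up as two '\n' characters in a row.
  if (c == '\r')
    return false;

  bool isNewline = (c == '\n');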

  //  Only pay attention until a color value is received.  Then ignore.
  if (not colorReceived) {

    if (not isNewline) {
      // Only save the printable characters, if there's room
      if ((' ' <= c) and (count < MAX_CHARS-1)) {
        line[ count++ ] = c;
      }

    } else {
      //  The newline is here, line completely received
      line[count] = '\0'; // terminate the string
      count       = 0;    // reset for next time

      // See if it is the special string
      char *found = strstr( line, match );
      if (found != nullptr) {

        //  The value is after the match string.  Start at the
        //     found position and step by the match string length.
        char *colorValuePtr = &found[ sizeof(match)-1 ];

        // Convert the next characters to an int value.
        //    atoi stops at the first non-digit.
        redColorValue = atoi( colorValuePtr );
        colorReceived = true;

        // Do something with the value now?  Look at it later?
        Serial.print( "\nRed Color Value = " );
        Serial.println( redColorValue );
      }
    }
  }

  bool done = (isNewline and previousWasNewline) and colorReceived;

  if (done) {
    // Reset a few things
    colorReceived      = false;
    previousWasNewline = false;
  } else {
    previousWasNewline = isNewline; // remember for next time
  }

  return done;

} // parseRequest

void sendResponse()
{
  // HTTP headers always start with a response code (e.g. HTTP/1.1 200 OK)
  // and a content-type so the client knows what's coming, then a blank line:

  client.println("HTTP/1.1 200 OK\n"
                 "Content-type:text/html\n"
                 "Connection: close\n"); // one print!

  //HTML body Code goes here
  client.println("<!DOCTYPE html>");
  client.println("<html>");

    // This is where all the HTML code is
    // there is one line that says
    client.print("    <a href=\"/HourRed/\" onmouseup=\"location.href=this.href+'ColVal='+sliderHourRed.value;return false;\">");

  client.println( "</html>" );
  //END OF HTML
  // The HTTP response ends with another blank line
  client.println();

}

Finite-State Machine

Instead of accumulating an entire line, it compares each character as it arrives to a “match” string. As the characters continue to match, it increments a counter. When the entire string has matched, it knows to start watching for the color value.

Again, instead of saving the complete color value string, it accumulates the color value as the digits arrive. When a non-digit character arrives, it knows that the value is complete. The rest of the response can be ignored, until two newline characters arrive.

Each character affects the FSM as it arrives. There is no reason to hold on to the entire HTTP request string.

#define client Serial

void setup()
{
  client.begin( 9600 );
}

const char    match[]              = "GET /HourRed/ColVal="; // watch for this
      size_t  count                = 0;
      bool    previousWasNewline   = false;
      uint8_t redColorValue        = 0;

enum parsingState_t { WAITING_FOR_MATCH, GETTING_VALUE, VALUE_READY };
parsingState_t state = WAITING_FOR_MATCH;


void loop()
{
      if (client.available())
      {
        char c = client.read();
        Serial.write(c); // echo for testing

        if (parseRequest( c )) {
          sendResponse();
        }
      }

}


bool parseRequest( char c )
{
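  // HTTP lines end with "\r\n". Skip the '\r' so that a blank line
  //   still shows up as two '\n' characters in a row.
  if (c == '\r')
    return false;

  bool isNewline     = (c == '\n');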

  switch (state) {

    case WAITING_FOR_MATCH:

      if (c == match[ count ]) {

        count++;
        if (count == sizeof(match)-1) {
          // FOUND the match string
          count         = 0;   // reset for next time
          redColorValue = 0;   // initial value
          state         = GETTING_VALUE;
        }

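      } else {
        // Didn't match, start over.  The mismatching character may
        //   itself be the first character of the match string.
        count = (c == match[0]) ? 1 : 0;
      }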
      break;


    case GETTING_VALUE:

      if (isdigit(c)) {
        // Accumulate the red color value
        uint8_t digit = c - '0';
        redColorValue = redColorValue * 10 + digit;

      } else {
        //  It wasn't a digit, so we must have all the
        //     color digits.  The value is ready to use.
        state = VALUE_READY;

        // Do something with the value now?  Look at it later?
        Serial.print( "\nRed Color Value = " );
        Serial.println( redColorValue );
      }
      break;

    case VALUE_READY:
      // Ignore the rest of the request?

      break;
  }

  bool colorReceived;

  if (isNewline and previousWasNewline) {
    // End of request!
    colorReceived      = (state == VALUE_READY);

    // reset FSM for next request
    state              = WAITING_FOR_MATCH; 
    count              = 0;
    previousWasNewline = false;
  } else {
    colorReceived      = false;
    previousWasNewline = isNewline; // remember for next time
  }

  return colorReceived;

} // parseRequest


void sendResponse()
{
  // HTTP headers always start with a response code (e.g. HTTP/1.1 200 OK)
  // and a content-type so the client knows what's coming, then a blank line:

  client.println("HTTP/1.1 200 OK\n"
                 "Content-type:text/html\n"
                 "Connection: close\n"); // one print!

  //HTML body Code goes here
  client.println("<!DOCTYPE html>");
  client.println("<html>");

    // This is where all the HTML code is
    // there is one line that says
    client.print("    <a href=\"/HourRed/\" onmouseup=\"location.href=this.href+'ColVal='+sliderHourRed.value;return false;\">");

  client.println( "</html>" );
  //END OF HTML
  // The HTTP response ends with another blank line
  client.println();

}

In either case, watching for two newline characters is actually a simple FSM. If a non-newline character is received, a flag is cleared. If a newline is received, a flag is set. If the flag was already set, you know that two newlines were received in a row. This flag is a “state” variable.

Cheers,
/dev

Actually (as I said, I once used the above), GPS data has become easy to deal with, as the fields are all fixed depending on the message. The length may vary, but by not more than one or two characters for some fields, so allowing for that, it is pretty simple to set up fixed length character arrays.

I've removed String from my keyboard, had to borrow one to type this message.

DKWatson:
pretty simple to set up fixed length character arrays.

Not that I think CSV field parsing is the OP’s issue… but you don’t need extra field storage, either.

I’ve removed String from my keyboard, had to borrow one to type this message.

LOL, here’s a version you can type:

      char    fullMsg[ 120 ];
const uint8_t MAX_FIELD_COUNT = 25;
      char   *field[ MAX_FIELD_COUNT ];
      uint8_t fieldCount;

void getFields( const char *delim = "," )
{
  fieldCount = 0;
  field[ fieldCount ] = strtok( fullMsg, delim );

  while (field[ fieldCount ]) {
    fieldCount++;
    if (fieldCount >= MAX_FIELD_COUNT)
      break;

    field[ fieldCount ] = strtok( nullptr, delim );
  }
}

It’s short, it’s completely safe, and it doesn’t require additional storage for the fields.

Oh yes… it’s ten times faster, saves about 1500 bytes of program space, and the sketch will run forever.

Cheers,
/dev


P.S. A complete sketch for your testing pleasure:

      char    fullMsg[ 120 ];
const uint8_t MAX_FIELD_COUNT = 25;
      char   *field[ MAX_FIELD_COUNT ];
      uint8_t fieldCount;

void getFields( const char *delim = "," )
{
  fieldCount = 0;
  field[ fieldCount ] = strtok( fullMsg, delim );

  while (field[ fieldCount ]) {
    fieldCount++;
    if (fieldCount >= MAX_FIELD_COUNT)
      break;

    field[ fieldCount ] = strtok( nullptr, delim );
  }
}

void setup()
{
  Serial.begin( 9600 );
}

void loop()
{
  if (lineReady()) {
    getFields();
    printFields();
  }
}

void printFields()
{
  for (auto i = 0; i < fieldCount; i++) {
    Serial.print( F("field[ ") );
    Serial.print( i );
    Serial.print( F(" ] = '") );
    Serial.print( field[i] );
    Serial.print( F("', as int: ") );
    Serial.print( atoi( field[i] ) );
    Serial.print( F(", or as float: ") );
    Serial.println( atof( field[i] ) );
  }
  Serial.println();
}


const size_t   MAX_CHARS = sizeof(fullMsg);
      size_t   msgLen    = 0;

bool lineReady()
{
  bool          ready     = false;
  const char    endMarker = '\n';

  while (Serial.available()) {

    char c = Serial.read();

    if (c != endMarker) {
      // Only save the printable characters, if there's room
      if ((' ' <= c) and (msgLen < MAX_CHARS-1)) {
        fullMsg[ msgLen++ ] = c;
      }
    } else {
      //  It's the end marker, line is completely received
      fullMsg[msgLen] = '\0'; // terminate the string
      msgLen       = 0;    // reset for next time
      ready        = true;
      break;
    }
  }

  return ready;

} // lineReady