Parse string and replace character

I’m tinkering with the sketch below and need to replace each “-” in the GET request string with “#”, then extract the desired data string from the revised string before sending it out the serial port to a servo controller.

change (c) “GET /?-0p1555-1p500t1000xxx HTTP/1.1”
into “#0p1555#1p500t1000xxx”

The link below documents the replace function, but I’m currently clueless about how to use it in C code, or how to parse the string. Any help appreciated.

http://www.arduino.cc/en/Tutorial/TextString

void replace(char thisChar, char thatChar)

#include <WString.h>
#include <Ethernet.h>

byte mac[] = { 0xDE, 0xAD, 0xBE, 0xEF, 0xFE, 0xED }; //physical mac address
byte ip[] = { 192, 168, 1, 102 }; // ip in lan
byte gateway[] = { 192, 168, 1, 1 }; // internet access via router
byte subnet[] = { 255, 255, 255, 0 }; //subnet mask
Server server(84); //server port
byte sampledata=50; //some sample data - outputs 2 (ascii = 50 DEC)
int ledPin =  13; // LED pin
char link[]="http://www.scienceprog.com/"; //link data
String readString = String(30); //string for fetching data from address
boolean LEDON = false; //LED status flag
void setup(){

//start Ethernet
Ethernet.begin(mac, ip, gateway, subnet);

//Set the LED pin (13) to output
pinMode(ledPin, OUTPUT); 

//enable serial data print
Serial.begin(9600); }
void loop(){

// Create a client connection
Client client = server.available();
if (client) {
while (client.connected()) {
if (client.available()) {
char c = client.read();

//read char by char HTTP request
if (readString.length() < 30) {

//store characters to string 
readString.append(c); } 

//output chars to serial port
Serial.print(c);

//if HTTP request has ended
if (c == '\n') {

//lets check if LED should be lighted
if(readString.contains("L=1")) {

//led has to be turned ON
digitalWrite(ledPin, HIGH); // set the LED on
LEDON = true;
}else{

//led has to be turned OFF
digitalWrite(ledPin, LOW); // set the LED OFF
LEDON = false; }

// now output HTML data starting with standard header
client.println("HTTP/1.1 200 OK");
client.println("Content-Type: text/html");
client.println();

//set background to yellow
client.print("<body style=background-color:yellow>");

//send first heading
client.println("<font color='red'><h1>HTTP test routines</font></h1>");
client.println("<hr />");
client.println("<hr />");

//output some sample data to browser
client.println("<font color='blue' size='5'>Sample data: ");
client.print(sampledata);//lets output some data
client.println("<br />");//some space between lines
client.println("<hr />");

//drawing simple table
client.println("<font color='green'>Simple table: ");
client.println("<br />");
client.println("<table border=1><tr><td>row 1, cell 1</td><td>row 1, cell 2</td></tr>");
client.println("<tr><td>row 2, cell 1</td><td>row 2, cell 2</td></tr></table>"); 
client.println("<br />");
client.println("<hr />");

//controlling led via checkbox
client.println("<h1>LED control</h1>");

//address will look like http://192.168.1.110/?L=1 when submitted
client.println("<form method=get name=LED><input type=checkbox name=L value=1>LED<br /><input type=submit value=submit></form>");
client.println("<br />");

//printing LED status
client.print("<font size='5'>LED status: ");
if (LEDON)
client.println("<font color='green' size='5'>ON"); 
else
client.println("<font color='grey' size='5'>OFF"); 
client.println("<hr />");
client.println("<hr />");
client.println("</body></html>");

//clearing string for next read
readString="";

//stopping client
client.stop();
}}}}}

You can use the indexOf() method to locate the character to be changed (the -). Then, use the setCharAt() method to put a different character (the #) in that position.

Or, you can use the replace() method to replace all occurrences of - with #.
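
A minimal sketch of both approaches, written in plain desktop C++ with std::string standing in for the Arduino String class (an assumption made so the example can run off the board). Here find() plays the role of indexOf() and operator[] the role of setCharAt(); the function names replaceFirst and replaceAll are hypothetical.

```cpp
#include <algorithm>
#include <string>

// Replace only the first occurrence of `from`,
// mirroring the indexOf() + setCharAt() approach.
std::string replaceFirst(std::string s, char from, char to) {
    std::string::size_type pos = s.find(from);   // like indexOf(from)
    if (pos != std::string::npos) {
        s[pos] = to;                             // like setCharAt(pos, to)
    }
    return s;
}

// Replace every occurrence of `from`, mirroring replace(from, to).
std::string replaceAll(std::string s, char from, char to) {
    std::replace(s.begin(), s.end(), from, to);
    return s;
}
```

Since the request in the first post contains more than one "-", the replace-all form is probably the one that fits here.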

We need more information to help you better. Is the pattern for the input and output consistent (that is, are the string lengths always the same, and are the positions of the changed characters fixed)? If not, how does it vary?

In general, I would:
use a loop to copy the string from one place to another. If the prolog was constant length, you can index the initial string with an offset to skip it. While you copy you can inspect the character being copied for the ‘-’ and store a ‘#’ instead. It may be that the space between the xxx and the HTTP is always there, and you can use that to terminate the loop.

so, if indeed, the “GET /?” is always exactly that, then the loop looks something like:

i=0;
while (originalString[i+6] != ' ') {
  if (originalString[i+6] == '-') finalString[i]='#';
  else finalString[i] = originalString[i+6];
  i++;
}
finalString[i]=0;
Warning: I typed this fast, did not try it, and often commit some syntax error, but you get the idea.

Below is what may come in a GET request. Probably the first thing to do is test for the "-" in the string at position 6. If it is present, then the rest of the string would be processed up to the next space. I've googled for working code with the replace() method, but haven't found any yet. I'm a little bit familiar with parsing strings in other types of programming, but currently unfamiliar with the C methods. Checked the two local book stores for C programming books but came up dry. Seems 90% of the books are for Microsoft products. Guess I'll have to travel to the city stores (or Amazon).

GET /?-01500t1000 HTTP/1.1
GET /favicon.ico HTTP/1.1
GET /?L=1 HTTP/1.1
GET /?L=1 HTTP/1.1
GET /?L=1 HTTP/1.1
GET / HTTP/1.1
GET /?L=1 HTTP/1.1
GET /?L=1 HTTP/1.1
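
The test described above (check for "-" at position 6, then process up to the next space) could be sketched roughly as follows. This is plain C++ over char arrays rather than the String library, and it assumes the request always begins with exactly "GET /?". The name extractServoCommand and the caller-supplied output buffer are hypothetical choices, not something from the original sketch.

```cpp
#include <cstddef>
#include <cstring>

// Copy the servo command out of a request line such as
// "GET /?-01500t1000 HTTP/1.1", turning each '-' into '#'.
// Returns false (and leaves `out` empty) when the line is not a
// servo request or the command does not fit in `out`.
bool extractServoCommand(const char* request, char* out, std::size_t outSize) {
    static const char prefix[] = "GET /?-";  // assumed prologue plus the first '-'
    const std::size_t skip = 6;              // length of "GET /?"
    if (outSize == 0) return false;
    out[0] = '\0';
    if (std::strncmp(request, prefix, sizeof(prefix) - 1) != 0) return false;

    std::size_t i = 0;
    while (request[skip + i] != ' ' && request[skip + i] != '\0') {
        if (i + 1 >= outSize) {              // keep room for the terminator
            out[0] = '\0';
            return false;
        }
        char c = request[skip + i];
        out[i] = (c == '-') ? '#' : c;
        ++i;
    }
    out[i] = '\0';
    return true;
}
```

With the request lines listed above, only the first one would yield a command; the favicon and L=1 requests are rejected by the prefix check.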

Why don't you just write a test script:

#include "WString.h"

void setup()
{
  Serial.begin(9600);

  String url[40];
  url.append("GET /?-01500t1000 HTTP/1.1");
  Serial.print("url before replace: [");
  Serial.print(url);  Serial.print("]\n");

  url.replace('-', '#');
  Serial.print("url after replace: [");
  Serial.print(url);  Serial.print("]\n");
}

void loop() {}

Cheaper than buying a book. Faster than shipping the book. And, you'd learn more.

Making progress. I've been looking at some working examples in the string.zip files to study the code layout. Your test sketch had an error, so I copied the string setup from an example, pasted it into the code, and it works. The next task is to capture the modified center portion of the url string. Also, does the "Serial.print("]\n");" append a carriage return/line feed to the printed string? The servo controller looks for the carriage return byte as the end-of-command marker, so it will be needed in the string sent to the serial port. The ("]\n") may also fix some other broken code I'm tinkering with (does the ] escape the \, or is it just a representation of a non-printable character?).

http://www.arduino.cc/en/Tutorial/TextString http://arduino.cc/en/uploads/Tutorial/String.zip

#include "WString.h"

void setup()
{
  Serial.begin(9600);

 // String url[40];
  
  #define maxLength 40
String url = String(maxLength);       

  url.append("GET /?-01500t1000 HTTP/1.1");
  Serial.print("url before replace: [");
  Serial.print(url);  Serial.print("]\n");

  url.replace('-', '#');
  Serial.print("url after replace: [");
  Serial.print(url);  Serial.print("]\n");
}

void loop() {}

Your test sketch had an error

It did, huh? Well, it was more to give you an idea how the replace function was to be used, rather than a functioning sketch.

Also, does the "Serial.print("]\n");" append a carriage return/line feed to the printed string?

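The \n is a newline (line feed, ASCII 10), not a carriage return; a carriage return is \r (ASCII 13), so that is the byte to send if the servo controller needs one. The ] is a printable character and does not escape anything. I often print [ at the front and ] at the end of strings, so I can see if the string contains trailing newlines or spaces that might cause problems.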
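
For the servo controller's end-of-command marker, a rough desktop C++ illustration (std::string is used as a stand-in for the Arduino String, and withCarriageReturn is a hypothetical helper name) of appending the carriage return byte explicitly:

```cpp
#include <string>

// Build the bytes to send to the servo controller: the command followed by
// a single carriage return ('\r', ASCII 13) as the end-of-command marker.
// Note that '\n' is a different byte (line feed, ASCII 10).
std::string withCarriageReturn(const std::string& command) {
    return command + '\r';
}
```

On the board, the equivalent would presumably be Serial.print(command) followed by Serial.print('\r').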

It did, huh? Well, it was more to give you an idea how the replace function was to be used, rather than a functioning sketch.

I tried it in the sketch > verify/compile debugger and it had the error. Being C clueless, I currently have to start with code that works and experiment with it until I get something workable.

Still stumped. The function below, from the TextString page, seems ideal for capturing the modified portion of the url string. The WString.cpp file seems to indicate that a string is returned from the function. I think in other programming languages the string would be returned as a variable with the same name as the function. How and where is the returned string captured? Also, I'm not sure how the integers for the start and end positions are handled per the info below. Unfortunately the author does not provide an example of this function. Any ideas?

"String substring(int beginning, int ending) - Returns a substring that begins at beginning position and ends at ending"

Below is a non functioning example of many trial and error attempts to get a satisfactory result.

#include "WString.h"

void setup()
{
  Serial.begin(9600);

 #define maxLength 80
 String url = String(maxLength);   
 String substring = String(maxLength);

  url.append("GET /?-0p1500t1000-1p2200s1000 HTTP/1.1");
  Serial.print("url before replace: ");
  Serial.print(url);
  Serial.print("\n");

  url.replace('-', '#');
  Serial.print("url after replace: ");
  Serial.print(url);
  Serial.print("\n");
  
  url.substring(6, 29);
  Serial.print(substring);
  Serial.print("\n");

}

void loop() {}

You need to have a peek at the documentation for the various functions. Notice whether they modify the instance (like replace) or return a new instance (like substring).

String substring = url.substring(6, 29);
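
The distinction can be illustrated in desktop C++, with std::string standing in for the library's String (an assumption; the helper names substringRange and servoPart are hypothetical). std::replace changes the string in place, while the substring call hands back a new string that has to be assigned to something. Note that std::string::substr takes a position and a length, so the helper converts from the begin/end form used in this thread, where the end index is not included.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// substring(begin, end) in the style used in this thread: characters from
// `begin` up to, but not including, `end`. Out-of-range indices are clamped.
std::string substringRange(const std::string& s, std::size_t begin, std::size_t end) {
    if (begin > s.size()) begin = s.size();
    if (end > s.size()) end = s.size();
    if (end < begin) end = begin;
    return s.substr(begin, end - begin);   // substr takes (position, length)
}

std::string servoPart(std::string url) {
    std::replace(url.begin(), url.end(), '-', '#');  // modifies url itself
    std::string part = substringRange(url, 6, 29);   // result must be captured
    return part;
}
```

Calling substringRange(url, 6, 29) without assigning the result would compile but discard the new string, which matches the behaviour of the non-working sketch.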

Still no go. The below will compile, but the desired string is not returned.

#include "WString.h"

void setup()
{
  Serial.begin(9600);

 #define maxLength 80
 String url = String(maxLength);   
 String substring = String(maxLength);
 String newstring = String(maxLength);

  url.append("GET /?-0p1500t1000-1p2200s1000 HTTP/1.1");
  Serial.print("url before replace: ");
  Serial.print(url);
  Serial.print("\n");

  url.replace('-', '#');
  Serial.print("url after replace: ");
  Serial.print(url);
  Serial.print("\n");
  
  substring = url.substring(6, 29);
  newstring = url.substring(6, 29);
  
  Serial.print(substring);
  Serial.print("\n");
  Serial.print(newstring);
  Serial.print("\n");

}
void loop() {}

After adding some additional Serial.prints to your code, it looks like this:

#include "WString.h"

void setup()
{
  Serial.begin(9600);

 #define maxLength 80
 String url = String(maxLength);
 String substring = String(maxLength);
 String newstring = String(maxLength);

  url.append("GET /?-0p1500t1000-1p2200s1000 HTTP/1.1");
  Serial.print("url before replace: ");
  Serial.print(url);
  Serial.print("\n");

  url.replace('-', '#');
  Serial.print("url after replace: ");
  Serial.print(url);
  Serial.print("\n");

  Serial.print("URL length: ");
  Serial.println(url.length());
  
  substring = url.substring(6, 29);
  newstring = url.substring(6, 29);

  Serial.print("substring: [");
  Serial.print(substring);  Serial.print("]\n");
  Serial.print("newstring: [");
  Serial.print(newstring);  Serial.print("]\n");

}
void loop() {}

When I run it, I get this:

url before replace: GET /?-0p1500t1000-1p2200s1000 HTTP/1.1
url after replace: GET /?#0p1500t1000#1p2200s1000 HTTP/1.1
URL length: 39
substring: [#0p1500t1000#1p2200s100]
newstring: [#0p1500t1000#1p2200s100]

The reason that I get different results than you do is related to this thread:
http://www.arduino.cc/cgi-bin/yabb2/YaBB.pl?num=1274203956/5

The clear() method is called in several places in the String class. It resets the _length data member that the setArray method reads to determine how many characters to copy. I’ve fixed my copy of the string library. You’ll need to fix yours, too.

The changes that need to be made:

String::String(const char* bytes)
{
  if(bytes == NULL)
    bytes= "";
  _length = strlen(bytes);
  //if (_capacity < _length) {
  _capacity = _length;
  //  free(_array);
  _array = (char*)malloc(_length+1);
  //}
     
  clear();  
  _length = strlen(bytes);  // added: restore _length after clear()
  setArray(bytes);
}

String::String(const String &str)
{
  _length = _capacity = str._length;
  _array = (char*)malloc(_length + 1);
  clear();
  _length = str._length;  // added: restore _length after clear()
  setArray(str._array);
}

const String & String::operator=( const String &rightStr )
{
  if ( this == &rightStr )
    return *this;

  if ( rightStr._length > _capacity )
  {
    free(_array);
    _capacity = rightStr._length;
    _array = (char*)malloc(_capacity+1);
  }

  clear();
  _length = rightStr._length;  // added: restore _length after clear()
  setArray( rightStr._array );

  return *this;
}

const String & String::operator=(const char* bytes) {
  //return *this = String(bytes);
  if(bytes == NULL)
    bytes = ""; 
  _length = strlen(bytes);
  if (_length > _capacity) {
    _capacity = _length;
    free(_array);
    _array = (char*)malloc(_length+1);
  }
  clear();
  _length = strlen(bytes);  // added: restore _length after clear()
  setArray(bytes);
  
  return *this;
}
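
To see why the order of the calls matters, here is a toy model of the problem. It is not the WString source: ToyString, its fixed 32-byte buffer, and the two assign functions are hypothetical, and they only mimic the behaviour described above (clear() resets the length, setArray() copies that many characters). Inputs are assumed to be shorter than 32 characters.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Toy stand-in for the library's String; only the pieces named in the thread.
struct ToyString {
    char buffer[32];
    std::size_t length;

    void clear() {                        // empties the text and resets length
        buffer[0] = '\0';
        length = 0;
    }
    void setArray(const char* bytes) {    // copies `length` characters
        std::memcpy(buffer, bytes, length);
        buffer[length] = '\0';
    }
};

// Unpatched order: length is set, then clear() wipes it before the copy.
std::string assignUnpatched(const char* bytes) {
    ToyString s;
    s.length = std::strlen(bytes);
    s.clear();
    s.setArray(bytes);                    // copies 0 characters
    return s.buffer;
}

// Patched order: length is restored after clear(), as in the added lines.
std::string assignPatched(const char* bytes) {
    ToyString s;
    s.clear();
    s.length = std::strlen(bytes);
    s.setArray(bytes);
    return s.buffer;
}
```

In this model the unpatched order always produces an empty string, which would explain a substring() result that prints as nothing.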

Success! I copied/pasted the code section into the WString.cpp file and now have the desired results. The next step is to determine the end point of the desired string section (which is actually 30 in this case) by detecting the second " " blank space in the string using the below function. There is a working example provided, so I'll tinker with that to see what I get working for my setup. Thanks!

int indexOf(char thisChar) - Returns the position of the first occurrence of thisChar

You could use indexOf to find the 1st occurrence of a character. Then, extract a substring that is the rest of the URL.

Then, use indexOf again, to find the next occurrence of the delimiter (the space). Then, set the character at that position to NULL ('\0') so the string ends there. The substring at that point is the interesting portion of the original string.
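
The two-step search might look roughly like this in desktop C++, with std::string::find standing in for indexOf and erase() standing in for writing a terminator at the found position (both substitutions are assumptions, as is the helper name betweenMarkerAndSpace):

```cpp
#include <string>

// Return the text from the first `marker` up to the next space.
// Returns an empty string when the marker is not present.
std::string betweenMarkerAndSpace(const std::string& url, char marker) {
    std::string::size_type start = url.find(marker);    // first indexOf
    if (start == std::string::npos) return "";
    std::string rest = url.substr(start);               // rest of the URL
    std::string::size_type space = rest.find(' ');      // second indexOf
    if (space != std::string::npos) rest.erase(space);  // cut at the space
    return rest;
}
```

Because the end is found by searching rather than by counting, this keeps working when the command portion changes length.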

Well, I may cheat and take a shortcut in the short term. For my use the end of the string will almost always be " HTTP/1.1", which occupies 9 characters. So maybe the simple code addition below will get the job done for web testing purposes. I'll look at the other possible solutions too.

#include "WString.h"

void setup()
{
  Serial.begin(9600);

 #define maxLength 80
 String url = String(maxLength);
 String substring = String(maxLength);
 String newstring = String(maxLength);
 int pos = 0;

  url.append("GET /?-0p1500t1000-1p2200s1000-2p1500 HTTP/1.1");
  Serial.print("url before replace: ");
  Serial.print(url);
  Serial.print("\n");

  url.replace('-', '#');
  Serial.print("url after replace: ");
  Serial.print(url);
  Serial.print("\n");

  Serial.print("URL length: ");
  Serial.println(url.length());
  
  pos = url.length();
  pos = (pos - 9);
  
  substring = url.substring(6, pos);
  newstring = url.substring(6, 30);

  Serial.print("substring: [");
  Serial.print(substring);  Serial.print("]\n");
  Serial.print("newstring: [");
  Serial.print(newstring);  Serial.print("]\n");

}
void loop() {}
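
If the length-minus-9 shortcut is kept, one possible safeguard is to confirm that the string really ends with " HTTP/1.1" before trimming. A sketch in desktop C++ (std::string instead of the Arduino String; stripHttpSuffix is a hypothetical name, and returning an empty string on a mismatch is just one way to signal failure):

```cpp
#include <cstddef>
#include <string>

// Return the part of `url` between index `start` and a trailing " HTTP/1.1".
// Returns an empty string if the url is too short or ends with something else.
std::string stripHttpSuffix(const std::string& url, std::size_t start) {
    const std::string suffix = " HTTP/1.1";   // 9 characters
    if (url.size() < start + suffix.size()) return "";
    if (url.compare(url.size() - suffix.size(), suffix.size(), suffix) != 0) {
        return "";
    }
    return url.substr(start, url.size() - suffix.size() - start);
}
```

A request ending in " HTTP/1.0", for example, would be rejected rather than sent to the servo controller with a stray character attached.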

Final tweak for now. Time to go work out. Edit: Skip the workout, it's Miller time!

#include "WString.h"

void setup()
{
  Serial.begin(9600);

 #define maxLength 100
 String url = String(maxLength);
 String teststring = String(maxLength);
 String finalstring = String(maxLength);

 int ind1 = 0;
 int ind2 = 0;
 int pos = 0;

  url.append("GET /?-0p1500t1000-1p2200s1000-2p1500-3p500 HTTP/1.1");
  Serial.print("url before replace: ");
  Serial.print(url);
  Serial.print("\n");

  url.replace('-', '#');
  Serial.print("url after replace: ");
  Serial.print(url);
  Serial.print("\n");

  Serial.print("URL length: ");
  Serial.println(url.length());
  
  pos = url.length();
  
  ind1 = url.indexOf('#');
  Serial.print("location of first #: ");
  Serial.print(ind1); Serial.print("\n");
  
  teststring = url.substring(ind1, pos);
  Serial.print("intermediate teststring: "); Serial.print("\n");
  Serial.print(teststring);  Serial.print("\n");
  
  ind2 = teststring.indexOf(' ');
  Serial.print("location of space: ");
  Serial.print(ind2); Serial.print("\n");

  finalstring = url.substring(ind1, ind2+ind1);

  Serial.print("finalstring: ");
  Serial.print(finalstring);  Serial.print("\n");

}
void loop() {}

The latest version of the String library (0.9) fixes the clear bug, so you shouldn't need to patch it anymore.