char to String to char conversion

Hi there.

I've been doing some string manipulation using the Ethernet shield and TextFinder for output onto an 8x8 RGB Matrix.

I can reliably pull an RSS news item like this:

  if (client.connected()) {   

    strcpy_P(buffer, (char*)pgm_read_word(&(string_table[14]))); // "title data="

    finder.find(buffer);

    for (int i = 0; i < randomInt; i++)  {
      if (finder.getString(buffer,"\"", newsString, 84))  {
      }
    }

The feed displays just fine on the matrix.

But the news item is not clean text. It occasionally has "smart quotes" in it, which show up as "&amp;#39;" in the displayed feed.

I want to replace the ugliness with a simple quote, '.

So, after searching high and low for a reliable replace function, I stumbled on the one right under my nose: String.replace.

However, the feed is dumped into char newsString[]. In order to use .replace, I need a String, so:

  // Make a new string.
  String d = String(85);

  //Replace the HTML nastiness with a single '
  d = String(newsString).replace("&amp;amp39;", "\'");

But then in order to display the information, I need to have a char again. So:

    d.toCharArray(newsString, 84);

Kaboom. The board freezes or resets itself.

Inserting debug statements shows that the replace is actually reached.

It feels like I'm misusing or incorrectly calling the .replace or handling the array badly.

Can anyone point me in the right direction?

Cheers,

Sounds like you are running out of memory to me,
please post the whole code so it can be confirmed.

To resolve your problem, we'd need to see more of your code. How is newsString defined? Is it a char array with 85 or more elements? How is buffer defined?

String d = String(85);

This creates a String object, d, equal to the String object that contains a character representation of the value 85, not one that contains room for 85 characters.

PaulS:
To resolve your problem, we'd need to see more of your code. How is newsString defined? Is it a char array with 85 or more elements? How is buffer defined?

They're both globals.

char buffer[85]; // make sure this is large enough for the largest string it must hold
char newsString[85]; // make sure this is large enough for the largest string it must hold

String d = String(85);
This creates a String object, d, equal to the String object that contains a character representation of the value 85, not one that contains room for 85 characters.

Aha. That'll be the problem.

So, what is the correct way to create a String that is capable of holding the contents of newsString[]?

So, what is the correct way to create a String that is capable of holding the contents of newsString[]?

I doubt you will be able to capture the total feed to a string unless it is very small. You can capture the feed characters in a string and then send the string to the serial monitor to see just how much of the feed was captured.

zoomkat:

So, what is the correct way to create a String that is capable of holding the contents of newsString[]?

I doubt you will be able to capture the total feed to a string unless it is very small. You can capture the feed characters in a string and then send the string to the serial monitor to see just how much of the feed was captured.

I think this may be a distraction. The existing code, prior to the .replace call works reliably and has done for a couple of weeks. It's the .replace call that's the issue here.

As previously quoted, this bit of code:

if (finder.getString(buffer,"\"", newsString, 84))  {

uses the TextFinder getString call to copy up to 84 characters of the current feed into newsString[], starting after the text contained in buffer[] and stopping at the closing quote mark.

In brief, the existing code works.

The issue now is that I want to pass the resulting char newsString[] to String.replace, replace the string, then convert it back to a char[] and I'm clearly not calling String correctly in the process.

  d = String(newsString).replace("&amp;amp39;", "\'");

You could be running out of memory. String(newsString) makes a copy of the data in newsString. So, now, you have this unnamed copy, newsString, and buffer all containing the same up-to-84 characters, while the finder object also contains a copy of the same data.

With the Client object, the Ethernet object, etc. also using memory, you may be using more than there is. Since you can't/won't post all of your code, search for "FreeMemory" on the forum. There is a function you can add to your sketch that will tell you, at certain points in your code, how much SRAM you still have available. I'm guessing it's not much/enough.

Paul,

You wrote:

String d = String(85);

This creates a String object, d, equal to the String object that contains a character representation of the value 85, not one that contains room for 85 characters.

Aha. That'll be the problem.

So, what is the correct way to create a String that is capable of holding the contents of newsString[]?

It's definitely possible that I'm running out of RAM. I'm already using techniques to move the static strings into program memory.

However, given that I'm creating d incorrectly, not giving it 85 characters, but rather assigning it the value 85, isn't it more likely I'm running off the end of d? And if that's the case, how do I create d with enough space for newsString[85]? The String reference page shows many cases, none of which allow you to create an empty string with 85 characters. Am I really left with doing this?

String d = String("                                                                                                                                               ");

However, given that I'm creating d incorrectly, not giving it 85 characters, but rather assigning it the value 85, isn't it more likely I'm running off the end of d? And if that's the case, how do I create d with enough space for newsString[85]? The String reference page shows many cases, none of which allow you to create an empty string with 85 characters. Am I really left with doing this?

You are not defining d incorrectly. You are simply not defining it the way that you thought you were.

You should probably spend some time looking at the String source code. If you do, you will see that the copy operator and the assignment operator are defined for the String class, so one can assign a character array to a String object, and the String object will be automatically sized to hold the character array, whatever size the character array is, provided there is memory available to hold the String object that is to be created.

PaulS:
You are not defining d incorrectly. You are simply not defining it the way that you thought you were.

You should probably spend some time looking at the String source code. If you do, you will see that the copy operator and the assignment operator are defined for the String class, so one can assign a character array to a String object, and the String object will be automatically sized to hold the character array, whatever size the character array is, provided there is memory available to hold the String object that is to be created.

Aha. With that in mind, this revision is probably a good idea:

  //Define d to be however big it needs to be in order to hold newsString and do a find and replace at the same time.
   String d = String(newsString).replace("&amp;amp39;", "\'"); 

   //Convert d back into newsString.
    d.toCharArray(newsString, 84);

If this looks right to you, and if I'm not walking off the end of the string, then that leaves your suggestion, that I'm running out of memory, as the only plausible explanation.

So, is the code itself correct?
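
On the size argument specifically: as far as I understand the String source, toCharArray(buf, bufsize) treats bufsize as the size of the whole destination, copies at most bufsize - 1 characters and then adds the terminator, so sizeof(newsString) would be the usual thing to pass. The sketch below mirrors that behaviour in plain C++ (no Arduino headers, so it can be tried on a desktop compiler). The name copyBounded is made up for illustration; it is not part of any library.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy src into dst without ever writing more than
// dstSize bytes, always leaving dst null-terminated (truncating if needed).
// Returns the number of characters copied, not counting the terminator.
std::size_t copyBounded(char* dst, std::size_t dstSize, const char* src)
{
  if (dstSize == 0) return 0;            // no room even for the terminator
  std::size_t n = std::strlen(src);
  if (n > dstSize - 1) n = dstSize - 1;  // keep one byte for '\0'
  std::memcpy(dst, src, n);
  dst[n] = '\0';
  return n;
}
```

With char newsString[85], a call such as copyBounded(newsString, sizeof(newsString), text) can never write past the array, whatever the length of text.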

I think you'd be better off using an in-place replace function for char arrays.

The replace function below replaces every occurrence of "from" with "to" in the source string, but only if the "to" string is no longer than the "from" string.

(not tested thoroughly)

// 
//    FILE: replace.pde
//  AUTHOR: Rob Tillaart
// VERSION: 0.1.00
// PURPOSE: in place replace in a char string.
//
// HISTORY: 
// 0.1.00 - 2011-05-13  initial version
// 
// Released to the public domain
//

char in[128] = "String(newsString).replace(&amp;amp39;with something else&amp;amp39;);";

void replace(char* source, const char* from, const char* to)
{
  uint8_t f = strlen(from);
  uint8_t t = strlen(to);
  char *p = source;

  if (t> f) return;
  while (*p != '\0')
  {
    if (strncmp(p, from, f) == 0)
    {
      strncpy(p, to, t);
      p += t;
      memmove(p, p+f-t, strlen(p+f-t) + 1);  // source and destination overlap, so memmove rather than strcpy
    }
    else p++;
  }
}

void setup()
{
  Serial.begin(115200);
  Serial.println("Start");
  Serial.println(in);

  unsigned long t1 = micros();
  replace(in, "&amp;amp39;", "\'");
  Serial.println(micros() - t1);

  Serial.println(in);
  replace(in, "ing", "ong");
  Serial.println(in);
}

void loop(){}

You can do it without the string library.

I made a regular expression finder recently, and with a bit of work you can make it replace as well. Basically you memmove stuff around to make room for the replacement, and then copy the replacement in.

It's all documented here:

If you don't want to follow that link, this is my example code that replaces a couple of "html entities" with something else:

#include <Regexp.h>

void setup ()
{
  Serial.begin (115200);

  // what we are searching
  char buf [100] = "I do like to be &lt;&lt; beside &gt;&gt; the seaside";

  MatchState ms;
  // set address of string to be searched
  ms.Target (buf);

  // what we will replace it with
  char replacement [20];
 
  unsigned int index = 0;
      
  while (ms.Match ("&%a+;", index) > 0)
    {
    // increment start point ready for next time
    index = ms.MatchStart + ms.MatchLength;    
   
    // see if we want to change it
    if (memcmp (&buf [ms.MatchStart], "&lt;", ms.MatchLength) == 0)
      strcpy (replacement, "<");
    else if (memcmp (&buf [ms.MatchStart], "&gt;", ms.MatchLength) == 0)
      strcpy (replacement, ">");
    else
      continue;  // nope, move along
    
    // see how much memory we need to move
    int lengthDiff = ms.MatchLength - strlen (replacement);
  
    // copy the rest of the buffer backwards/forwards to allow for the length difference
    // the +1 is to copy the null terminator
    memmove (&buf [index - lengthDiff], &buf [index], strlen (buf) - index + 1);
  
    // copy in the replacement
    memmove (&buf [ms.MatchStart], replacement, strlen (replacement));
 
    // adjust the index for the next search
    index -= lengthDiff;  
    
    } // end of while

  Serial.println (buf);
  
}  // end of setup  

void loop () {}

Since this uses "in-place" replacements, the memory requirements should not go over whatever you need to hold the original string in the first place.
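
For anyone who wants the same memmove idea without the regular expression library, here is a minimal sketch in plain C++ (no Arduino headers, so it can be tried on a desktop compiler). It is not the Regexp library's implementation. The function name replaceInPlace and its cap parameter (the total size of the buffer, terminator included) are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: replace every occurrence of `from` with `to` inside
// buf, which has room for `cap` bytes in total (terminator included).
// Stops and returns false if the next replacement would not fit; any
// replacements already made stay in place.
bool replaceInPlace(char* buf, std::size_t cap, const char* from, const char* to)
{
  const std::size_t f = std::strlen(from);
  const std::size_t t = std::strlen(to);
  if (f == 0) return true;                     // nothing to search for

  std::size_t len = std::strlen(buf);
  char* p = buf;
  while ((p = std::strstr(p, from)) != nullptr)
  {
    if (len - f + t + 1 > cap) return false;   // would overflow the buffer

    const std::size_t offset = static_cast<std::size_t>(p - buf);
    // Shift the tail (including '\0') left or right. The regions overlap,
    // so this has to be memmove rather than memcpy or strcpy.
    std::memmove(p + t, p + f, len - offset - f + 1);
    std::memcpy(p, to, t);                     // drop the replacement into the gap
    len = len - f + t;
    p += t;                                    // carry on after the replacement
  }
  return true;
}
```

A call along the lines of replaceInPlace(newsString, sizeof(newsString), "&#39;", "'") would then shrink each entity to a single quote without allocating anything.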

@Nick
Had a quick look at your regex solution; a nice feature is that it can replace with longer strings too. I'll have to re-engineer my code ;)

Rob

Thanks! Tip: memmove is your friend, compared to memcpy. That was what let me make the strings larger (I tested that but didn't demonstrate it).

Nick,

This is brilliant. I've grabbed the updated library in order to use GlobalReplace. So now, I'm doing this:

    // for matching regular expressions  
    MatchState ms (newsString);
    
    // replace the &amp with a single quote '
    ms.GlobalReplace ("&amp", "\'");

But executing the GlobalReplace causes the sketch to crash.

  1. Does the GlobalReplace method use the memcpy techniques from your suggested code above?

  2. Do I need to escape the single quote as I've done above?

Cheers,

An &amp; is normally an & (ampersand), not a quote, but I guess you are just testing.

What you have looks OK. You can escape the single quote, but you don't need to.

It shouldn't crash, this test doesn't:

#include <Regexp.h>

void setup ()
{
  Serial.begin (115200);
  Serial.println ();

  // what we are searching (the target)
  char newsString [100] = "The quick &amp; fox jumps over &amp; lazy wolf";

  // match state object
  MatchState ms (newsString);

  ms.GlobalReplace ("&amp;", "'");

  // show results
  Serial.print ("Converted string: ");
  Serial.println (newsString);

}  // end of setup  

void loop () {}

To comment further I would have to see how you have newsString defined.

It's still as I originally posted.

char newsString[85]; // make sure this is large enough for the largest string it must hold

And yes, the "&amp" should properly be "&#39;"

Cheers,

Can I see the whole thing please? Mine works, yours doesn't. So it is something else.

And yes, the "&amp" should properly be "&#39;"

Just "&" - that is what you are searching for.

Oh, and how much data is in newsString? Can you display it before the conversion so we can see?
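
One thing that might be worth ruling out at the same time (this is a guess, nothing in the posted code confirms it): if newsString is ever left without a terminating '\0' inside its 85 bytes, any string function will run off the end of the array. A small check, sketched in plain C++ with a made-up helper name, terminatedLength:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: return the length of the string in buf if a '\0'
// is found within the first cap bytes, or -1 if the buffer is unterminated.
long terminatedLength(const char* buf, std::size_t cap)
{
  const void* end = std::memchr(buf, '\0', cap);
  if (end == nullptr) return -1;
  return static_cast<long>(static_cast<const char*>(end) - buf);
}
```

Printing terminatedLength(newsString, sizeof(newsString)) just before the conversion would show both whether the buffer is terminated and how much data it holds.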

newsString contains a news headline, selected at random, from an XML feed.

  if (client.connected()) {   
    strcpy_P(buffer, (char*)pgm_read_word(&(string_table[14]))); // "title data="
    finder.find(buffer);
    for (int i = 0; i < randomInt; i++)  {
      if (finder.getString(buffer,"\"", newsString, 84))  {
      }
    }

That's immediately followed by your example code.

Interestingly, I tried the approach suggested by robtillaart, prior to your regex solution:

void replace(char* source, const char* from, const char* to)
{
  uint8_t f = strlen(from);
  uint8_t t = strlen(to);
  char *p = source;

  if (t> f) return;
  while (*p != '\0')
  {
    if (strncmp(p, from, f) == 0)
    {
      strncpy(p, to, t);
      p += t;
      memmove(p, p+f-t, strlen(p+f-t) + 1);  // source and destination overlap, so memmove rather than strcpy
    }
    else p++;
  }
}

and it works like a charm, substituting at will. I'd still like to get the regex method working as it's a lot more versatile, I suspect.