Extracting substring of variable length

Greetings,

I am looking for a way to reliably extract a phone number of variable length from a String.

The String looks like this; the part I need to extract is +8613920001234:

+CMGR: "REC UNREAD","+8613920001234","","12/12/13,15:08:55+50"

What I have tried so far is this rather nasty piece of code, which sort of works, but only when I specify the length of the number I need:

(sms.substring(sms.indexOf("READ\",\"")+7,sms.indexOf("READ\",\"")+21)).toCharArray(phonenumber_new,20);

I guess what's needed is to extract everything between READ"," and the following ",. I would be most grateful for any insights or approaches.

Peace and have a good day.

Does this help http://www.cplusplus.com/reference/string/string/find/

Does this help

Not when using a String. Which OP should not be.

Easy:

  1. zero the target string;
  2. search for '+' until the end of the source string;
  3. from there, copy up to the next '"' or the end of the source string;
  4. done.
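The steps above could be sketched roughly as follows on a plain char array. The helper name extractNumber and its parameters are made up for illustration. One caveat: the sample line itself begins with '+' (in "+CMGR"), so this sketch assumes the number is quoted and in international format, and looks for a '+' that directly follows an opening quote rather than for the first '+' in the line.

```cpp
#include <string.h>
#include <stddef.h>

// Hypothetical helper (not from any library): copies the quoted field that
// starts with '+' from src into dst, writing at most dstSize-1 characters.
// Returns the number of characters copied (0 if nothing was found).
size_t extractNumber(const char *src, char *dst, size_t dstSize)
{
  if (dstSize == 0) return 0;
  memset(dst, 0, dstSize);               // 1. zero the target string
  const char *p = strstr(src, "\"+");    // 2. find a '+' right after an opening quote
  if (p == NULL) return 0;               //    not found: dst stays empty
  p++;                                   //    skip the quote, keep the '+'
  size_t n = 0;
  while (*p != '\0' && *p != '"' && n < dstSize - 1) {
    dst[n++] = *p++;                     // 3. copy up to '"' or end of source
  }
  return n;                              // 4. done; dst is already terminated
}
```

Called with the sample line and a 20-byte buffer, this should leave +8613920001234 in the buffer, whatever the length of the number, as long as it fits.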

PaulS:

Does this help

Not when using a String. Which OP should not be.

Ah, so String is based on string. I assumed they were different, as people have picked up on OPs' posts and asked whether they meant String or string when they wrote string.

Hi there - the current source string is a String (capital S), the one I should not be using according to PaulS :(
I'll try to follow what dhenry said, but the "search for '+' til the end of the source string;" step in particular is something I am not sure how to do yet.

Ah, so String is based on string.

A String object wraps a NUL-terminated char array. That is not quite the same thing as being based on a string.

Typically, when we ask if the OP means string or String, it is because they say one and then post some code that uses the other, or they don't post any/adequate code to make it possible to determine.

The correct way to parse those lines is to parse them!

That means you need to count ',' separators (but ignore them when inside string quotes). This will get you to the correct column, and you can extract the string-quote delimited string from there.

Unfortunately proper parsing is often quite complex (string quotes can be escaped inside a string, for instance), so on a tiny little microcontroller it's tempting to take shortcuts... That can come back to bite you later, though.
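A minimal sketch of that comma-counting idea, assuming a NUL-terminated char array as input. The helper name getField and its parameters are hypothetical, and escaped quotes inside a field are deliberately not handled, so this is still one of the shortcuts mentioned above.

```cpp
#include <string.h>
#include <stddef.h>

// Hypothetical helper: copies comma-separated field number `index` (0-based)
// from `line` into `out`, dropping the quote characters.
// Commas inside quotes are not counted as separators.
// Escaped quotes are NOT handled. Returns true if the field exists.
bool getField(const char *line, int index, char *out, size_t outSize)
{
  if (outSize == 0) return false;
  bool inQuotes = false;
  int field = 0;
  size_t n = 0;
  for (const char *p = line; *p != '\0'; p++) {
    if (*p == '"') {
      inQuotes = !inQuotes;            // toggle on every quote; quotes are not copied
    } else if (*p == ',' && !inQuotes) {
      if (field == index) break;       // reached the end of the requested field
      field++;                         // separator: move on to the next field
    } else if (field == index && n < outSize - 1) {
      out[n++] = *p;                   // character belongs to the requested field
    }
  }
  out[n] = '\0';
  return field == index;
}
```

With the sample line, field 1 would be the phone number and field 3 the timestamp, including the comma between date and time because that comma sits inside quotes.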

A quick hack that extracts the field with two pointers: one to find the start and one to find the end of the string.

char s[] = "+CMGR: \"REC UNREAD\",\"+8613920001234\",\"\",\"12/12/13,15:08:55+50\""; // note: each " inside the literal is escaped as \"
char t[20];

void setup()
{
  int len = strip(s, t);
  
  Serial.begin(9600);
  Serial.println("start");
  Serial.println(len);
  Serial.print(t);
}

void loop()
{}


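// dedicated function to extract the second field; not so elegant ...
// returns the field length, or -1 if there is no second field
// the caller must make sure out is large enough for the field plus terminator
int strip(char * in, char * out)
{
  char *p = in;
  // find start of second field: advance to the first comma (or end of string)
  while (*p && ',' != *p) p++;
  if (*p == '\0') return -1;  // no comma found
  p++;                        // step past the comma
  if (*p == '\"') p++;        // skip the opening quote
  // find endpoint: the closing quote, or the end of the string
  char *q = p;
  while (*q && '\"' != *q) q++;
  int len = q - p;  // pointer difference = number of chars in the field
  strncpy(out, p, len);
  out[len] = '\0';  // strncpy does not add the terminator here
  return len;
}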

The out variable is never used... 8)

oops, copied too fast .... will fix it ..

... done ...

Is there any way to do what I wanted when using a String at all? I mean extracting a substring of variable length from a String, not from a char array.

Is there any way to do what I wanted when using a String at all? I mean extracting a substring of variable length from a String, not from a char array.

Yes. The methodology is exactly the same. Find the start position. Find the end position. Extract the data in between.

I recommend steel toed shoes while you shoot yourself in the foot. Less damage to your foot that way.
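For illustration, the three steps (find the start, find the end, extract what is in between) might look like the sketch below. It uses std::string so that it compiles on a desktop; the function name numberFromSms is made up, and the marker READ"," is simply the one the original poster searched for, so this assumes the number always follows it directly.

```cpp
#include <string>

// Hypothetical helper: returns the text between READ"," and the next '"'.
// Returns an empty string if either mark is missing.
std::string numberFromSms(const std::string &sms)
{
  const std::string marker = "READ\",\"";        // matches both READ and UNREAD
  // Find the start position: just past the marker.
  std::string::size_type start = sms.find(marker);
  if (start == std::string::npos) return "";
  start += marker.length();
  // Find the end position: the closing quote after the number.
  std::string::size_type end = sms.find('"', start);
  if (end == std::string::npos) return "";
  // Extract the data in between (substr takes a start and a LENGTH).
  return sms.substr(start, end - start);
}
```

With the Arduino String class the same steps map onto indexOf(marker), indexOf('"', start) and substring(start, end). Note that substring takes an end index rather than a length.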

Below has the various methods to slice/dice a String. Which method you use depends on how the string is constructed and how consistent the data is inside the String. The number starts at the second + in the string, so that might be a start locator for capturing the number string. There are probably several interesting ways to solve the issue. The substring function might be of interest.

+CMGR: "REC UNREAD","+8613920001234","","12/12/13,15:08:55+50"

readString1 = (readString.substring(21,35));

As part of a larger project I'm writing a "string" object (note "object" not "class", it's all C not C++) and one of the methods extracts a field of variable length at a variable location. You just supply the delimiters.

So I thought I'd plug your string into it and see if it worked.

	string * str;
	char my_char_array [100];
	char * nmea  = "+CMGR: \"REC UNREAD\",\"+8613920001234\",\"\",\"12/12/13,15:08:55+50\"";

	str = stringCreate(100);
	stringLoadFromArray(str, nmea);
	stringGetField(str, 2, "+\"", my_char_array);
	printf("%s\n", my_char_array);					// prints 8613920001234

It does.

All of what I'm doing is for an ARM but that said this sort of function is just C so should run on anything and I plan to port it across to the Arduino one day.


Rob

	printf("%s\n", s);					// prints 8613920001234

Where the heck did s come from?

Oops, my bad. Tried to clean the snippet up and only did half the job.

Fixed.


Rob

Simple extraction of the phone number from the string below (copy the string, paste it into the serial monitor and send it to the Arduino to test).

+CMGR: "REC UNREAD","+8613920001234","","12/12/13,15:08:55+50"

// zoomkat 7-30-11 serial I/O string test
// type a string in serial monitor. then send or enter
// for IDE 0019 and later

String readString;
String finalstring;

void setup() {
  Serial.begin(9600);
  Serial.println("serial test 0021"); // so I can keep track of what is loaded
}

void loop() {

  while (Serial.available()) {
    delay(2);  //delay to allow byte to arrive in input buffer
    char c = Serial.read();
    readString += c;
  }

  if (readString.length() >0) {
    Serial.println(readString);
    // fixed offsets: only correct for a 14-character number at this exact position
    finalstring = readString.substring(21, 35);
    Serial.println(finalstring);

    readString="";
    finalstring=""; 
  } 
}