String numbers separation to floats - Best way

Hello there everyone,

I have a string that looks as follows:
XYZ values - X: 35.3 Y: 446 Z:10.5
or
A: 5.321 B: 446 C: 10.553 D: 45000

I need to extract only the numbers from the string; they can be integers or floats.

Inspired by this: Split Strings based on space
This is the approach I am currently using, which works well, but I would like to move away from String arrays to either char arrays or any other method that is more optimised.

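// str:  the string that needs to be separated (passed by value, consumed here)
// strs: the array where the numeric tokens are stored; it must be large
//       enough to hold every number in the string
void StringSeparator(String str, String strs[]) {
  // No memset() here: strs decays to a pointer, so sizeof(strs) is not the
  // array size, and zeroing String objects byte-wise corrupts them.
  uint8_t StringCount = 0;

  while (str.length() > 0) {
    int index = str.indexOf(' ');   // int: an int8_t overflows on long strings
    if (index == -1) {
      strs[StringCount++] = str;    // last token, stored as-is
      break;
    } else {
      // keep the token only if it starts with a digit
      if (isDigit(str.charAt(0))) {strs[StringCount++] = str.substring(0, index);}
      str = str.substring(index + 1);
    }
  }
}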

I don't have that much experience with Strings in C/C++, never really had to use them until now, and I know they are not really a good way to do things. Please keep in mind that the string I receive is from a source that I cannot modify/change, so the string is the one shown at the top.

The solution should be "universal", as in I can have floats or ints, maximum 6 digits (355.125 format for example), but the amount of numbers that need to be extracted can vary, it can be 3, 4 etc.

Any help is highly appreciated!

use indexOf() to find "X:" and use toFloat() or toDouble() on the substring starting at that index +2 (+2 to skip the "X:" and land on the first digit or the space before it)

repeat for Y: and Z:

void setup() {
  Serial.begin(115200); Serial.println();
  String message = "XYZ values - X: 35.3 Y: 44 Z:10.5";
  double x = message.substring(message.indexOf("X:") + 2).toDouble();
  double y = message.substring(message.indexOf("Y:") + 2).toDouble();
  double z = message.substring(message.indexOf("Z:") + 2).toDouble();
  Serial.print("X = "); Serial.println(x,6);
  Serial.print("Y = "); Serial.println(y,6);
  Serial.print("Z = "); Serial.println(z,6);
}

void loop() {}

using the String class is NOT the recommended way

Wouldn't using SubString defeat the purpose of the function?
As my number can be 1.352 or 356.321 or 57543.

I understand that with the +2 you are moving to the first digit, but how does "it know" when to stop the digit extraction? Does it stop automatically as soon as it encounters another character because of the toDouble?

Indeed, that is why I am asking. I cannot change the string, it is not "made by me" the same way it is in your example code, I receive it and I have to process it.

the +2 is to skip the 2 characters "X:" so it's always +2

try the proposed code with various data, it should work as long as the format is respected
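
On the question of how the conversion "knows" where to stop: as far as I know, toFloat() and toDouble() hand the text to the standard C conversion routines, which skip leading spaces and stop at the first character that cannot belong to a number. Here is a minimal sketch of the same idea on a plain char array, using only strstr() and strtod(). The helper name valueAfterLabel and its fallback parameter are made up for illustration, they are not part of any library.

```cpp
#include <stdlib.h>   // strtod
#include <string.h>   // strstr, strlen

// Hypothetical helper: returns the number that follows `label` in `text`,
// or `fallback` if the label is missing or no number follows it.
double valueAfterLabel(const char *text, const char *label, double fallback) {
  const char *p = strstr(text, label);
  if (p == nullptr) return fallback;   // label not found
  p += strlen(label);                  // skip past the label, e.g. "X:"
  char *end = nullptr;
  // strtod() skips leading whitespace and stops at the first character
  // that cannot be part of the number; `end` points at that character.
  double v = strtod(p, &end);
  return (end == p) ? fallback : v;    // end == p means nothing was converted
}
```

With the string from the first post, valueAfterLabel(text, "Y:", 0) reads 446 and stops at the space before "Z", so 1.352, 356.321 or 57543 are all handled without knowing the length in advance.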

but it's your decision to store that in a String with a capital S versus a cString

If you receive the data in the form of a C-string (zero-terminated character array), then there is an entire collection of standard functions that allow you to search, break up and interpret any part of that string.

For example, strtok() breaks up a string into individual components, separated by commas, blanks, etc.

They are in the standard library string.h: https://www.cplusplus.com/reference/cstring/
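
To make the strtok() suggestion concrete, here is a rough sketch, assuming the text has first been copied into a writable char array (strtok() overwrites the delimiters with '\0'). It splits on spaces, colons and newlines and keeps every token that converts completely as a number. The function name extractNumbersTok and the choice of delimiters are my own assumptions, not something from the library.

```cpp
#include <ctype.h>    // isalpha
#include <stddef.h>   // size_t
#include <stdlib.h>   // strtod
#include <string.h>   // strtok

// Hypothetical helper: tokenizes `line` in place and stores up to `maxOut`
// numeric tokens in `out`. Returns how many values were stored.
size_t extractNumbersTok(char *line, float *out, size_t maxOut) {
  const char *delims = " :\n";
  size_t count = 0;
  for (char *tok = strtok(line, delims); tok != nullptr && count < maxOut;
       tok = strtok(nullptr, delims)) {
    // Tokens that start with a letter are labels such as "X" or "values".
    if (isalpha((unsigned char)tok[0])) continue;
    char *end = nullptr;
    double v = strtod(tok, &end);
    // Accept the token only if strtod() consumed all of it;
    // a lone "-" converts nothing and is skipped.
    if (end != tok && *end == '\0') out[count++] = (float)v;
  }
  return count;
}
```

Because the count is returned, the same call works whether the string carries 3, 4 or more numbers; the caller only has to size the output array for the largest case it expects.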

I have modified my reply after I realised exactly what you said.

I am not storing it in the String.
I have a library from Bosch that returns a set of values as a String.
So I use in code:

bosch_sensor.toString();      //this is how it has been implemented in the library, I cannot change it

And if I do Serial.println(bosch_sensor.toString()); it prints XYZ values - X: 35.3 Y: 446 Z:10.5

I am not actually doing: String sensor_output = bosch_sensor.toString();

EDIT: This is the library implementation of the String return:

struct DataXYZ {
  int16_t x;
  int16_t y;
  int16_t z;

  String toString() {
    return (String)("XYZ values - X: " + String(x)
                    + "   Y: " + String(y)
                    + "   Z: " + String(z) + "\n");
  }
};

Is there a more sensible, more efficient and less wasteful routine in the library, that is, one that returns actual values?

Sadly there is none. I either have to edit the library to have everything return with &x &y &z which would break at the next update and I would have to provide a modified library with my code.

Or just use the way they implemented it and split the string in my code.

That is really hard to believe. Such a stupid choice! Please post a link to the library.

strtof - C++ Reference (cplusplus.com)
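
Since the link above comes without an example, here is a sketch of how strtof() could be used to walk the whole line without modifying it. An Arduino String can hand out its internal buffer through c_str(), so no extra String copies are needed. The function name extractNumbers is hypothetical, and I am assuming your core provides strtof(); if it does not, strtod() with a cast should behave the same way.

```cpp
#include <ctype.h>    // isdigit, isalpha, isalnum
#include <stddef.h>   // size_t
#include <stdlib.h>   // strtof

// Hypothetical helper: scans `text` left to right and stores up to `maxOut`
// numbers in `out`. The input is not modified. Returns the number of values.
size_t extractNumbers(const char *text, float *out, size_t maxOut) {
  size_t count = 0;
  const char *p = text;
  while (*p != '\0' && count < maxOut) {
    unsigned char c = (unsigned char)*p;
    bool startsNumber =
        isdigit(c) || (c == '-' && isdigit((unsigned char)p[1]));
    if (startsNumber) {
      char *end = nullptr;
      out[count++] = strtof(p, &end);  // `end` lands just past the number
      p = end;
    } else if (isalpha(c) || c == '_') {
      // Skip a whole label such as "co2_eq" so that digits inside
      // a name are not mistaken for a value.
      while (isalnum((unsigned char)*p) || *p == '_') ++p;
    } else {
      ++p;                             // spaces, ':' and a lone '-'
    }
  }
  return count;
}
```

In a sketch this would be called along the lines of extractNumbers(someString.c_str(), values, 8), where someString holds the text returned by the library and values is a float array of at least 8 elements.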

@jremington The library is the BHY2 available on GitHub here: GitHub - arduino-libraries/Arduino_BHY2: Mirror of https://github.com/arduino/nicla-sense-me-fw , please post any issue there

EDIT: Check Arduino_BHY2/src/sensors /

File: DataParser.h

I can see there is no other way of returning the actual values.
Some other sensors have the method temperature.value() which returns the actual float value. "temperature" is the name I gave to the sensor.

Any return/output that has more than 1 output, so for example an IMU compared to a temp sensor, is returned as a String, in order to include all outputs as one.

@runaway_pancake How does that help? Your method only works with cStrings, and mine is different.

The actual data are stored in a struct associated with each sensor, as clearly demonstrated by this debugging code from the library. But I would certainly not waste my time working with that can of worms.

(I gave up on most Bosch sensors long ago, as they are intended for black box consumer devices, and are not keeping up with the accuracy and precision of sensors available from other manufacturers).

void BoschParser::parseData(const struct bhy2_fifo_parse_data_info *fifoData, void *arg)
{
  SensorLongDataPacket sensorData;
  sensorData.sensorId = fifoData->sensor_id;
  sensorData.size = (fifoData->data_size > sizeof(sensorData.data)) ? sizeof(sensorData.data) : fifoData->data_size;
  memcpy(&sensorData.data, fifoData->data_ptr, sensorData.size);

  if (_debug) {
    _debug->print("Sensor: ");
    _debug->print(sensorData.sensorId);
    _debug->print(" size: ");
    _debug->print(sensorData.size);
    _debug->print("  value: ");
    for (uint8_t i = 0; i < (sensorData.size - 1); i++)
    {
        _debug->print(sensorData.data[i], HEX);
        _debug->print(" ");
    }
    _debug->print("  ");
  }

do you have a link to that library?

EDIT - sorry browser did not update the whole conversation that happened in between my intent to answer and actual post

@J-M-L

See post #12.

Here

Edit: oops everything loaded after

Edit 2: I agree the library may not have chosen the best way of returning the values. But as I would prefer not to modify the library, since any change would get wiped by an update, I would like to return to the topic of the post: the best way to split the string while using the least amount of memory and no Strings.

seems there is a way to get x y and z

Arduino_BHY2/SensorXYZ.h at 938010dcd9c51d4ab3c1c56acc8fec96185ad8d3 · arduino-libraries/Arduino_BHY2 · GitHub

  int16_t x() 
  { 
    return _data.x; 
  }
  int16_t y()
  {
    return _data.y;
  }
  int16_t z()
  {
    return _data.z;
  }

This does not help.

The X,Y,Z was just an example.

I am not actually using the X,Y,Z at all, but rather the BSEC and Orientation outputs, which are also returned as Strings.

Here:

struct DataOrientation {
  float heading;
  float pitch;
  float roll;

  String toString() {
    return (String)("Orientation values - heading: " + String(heading, 3)
                    + "   pitch: " + String(pitch, 3)
                    + "   roll: " + String(roll, 3) + "\n");
  }
};

And here:

struct DataBSEC {
  uint16_t  iaq;         //iaq value for regular use case
  uint16_t  iaq_s;       //iaq value for stationary use cases
  float     b_voc_eq;    //breath VOC equivalent (ppm)
  uint32_t  co2_eq;      //CO2 equivalent (ppm) [400,]
  float     comp_t;      //compensated temperature (celcius)
  float     comp_h;      //compensated humidity
  uint32_t  comp_g;      //compensated gas resistance (Ohms)
  uint8_t   accuracy;    //accuracy level: [0-3]

  String toString() {
    return (String)("BSEC output values - iaq: " + String(iaq)
                    + "   iaq_s: " + String(iaq_s)
                    + "   b_voc_eq: " + String(b_voc_eq, 2)
                    + "   co2_eq: " + String(co2_eq)
                    + "   accuracy: " + String(accuracy)
                    + "   comp_t: " + String(comp_t, 2)
                    + "   comp_h: " + String(comp_h, 2)
                    + "   comp_g: " + String(comp_g)
                    + "\n");
  }
};

I can't believe a library would only offer a String as the interface to see the values of the sensors....

You need to look at the API to get the right values, for example the SensorOrientation class has this

Actually. I took a deeper look and also found:

  uint16_t iaq() {return _data.iaq;}
  uint16_t iaq_s() {return _data.iaq_s;}
  float b_voc_eq() {return _data.b_voc_eq;}
  uint32_t co2_eq() {return _data.co2_eq;}
  uint8_t accuracy() {return _data.accuracy;}
  float comp_t() {return _data.comp_t;}
  float comp_h() {return _data.comp_h;}
  uint32_t comp_g() {return _data.comp_g;}

These may work, although they are not part of the documentation available at: https://docs.arduino.cc/tutorials/nicla-sense-me/cheat-sheet#sensor-ids