String object with checksum

For communication purposes I would like to have a simple checksum for the data I transfer. I am using the ESP8266 HTTPClient, so the data lives in a String object. What is the simplest way to calculate a checksum of the data part of the String (everything after the filename and the question mark)?

I did something similar a long time ago, but that was with EthernetClient, which is based on Stream (not String), so I created my own class inherited from EthernetClient and overrode the EthernetClient::write() method to add the needed functionality; see my sample below. But I cannot figure out how to do a similar thing with String…

class ExtEthernetClient:public EthernetClient
{
public:
  virtual size_t write(const uint8_t *buf, size_t size)
    {
     size_t res = EthernetClient::write(buf, size); //call base class write(..)
     if (res != 0)
       {
       _bytessent += size;
       for (size_t i = 0; i < size; i++) // size_t index: a uint8_t counter never reaches a size above 255
         {
            _checksum += buf[i];
         }
       }
     return res;
    }
  unsigned long BytesSent(void) {return _bytessent;}
  unsigned short CheckSum(void) {return _checksum;}
  void ResetBytesSent(void) {_bytessent=0;}
  void ResetCheckSum(void) {_checksum=0;}
private:
  unsigned long _bytessent = 0;   // start from zero even if Reset...() is never called
  unsigned short _checksum = 0;
};

As an FYI: the ESP can run FreeRTOS (see the FreeRTOS API categories), which has several types of stream objects.

Idahowalker:
As an FYI: the ESP can run FreeRTOS (see the FreeRTOS API categories), which has several types of stream objects.

Wow, this is such a dramatic change. The whole project is already working and I do not want to change the whole platform; I would just like to protect the transferred data with a checksum. But thank you anyway, I will consider it for future projects.

You can call c_str() on your String object and pass that to your function.

LightuC:
You can call c_str() on your String object and pass that to your function.

I did not really understand your advice. Which function of mine?
I have a String which assembles a string like this: "http://host/filename/?S1=50.55&S2=124&S12=628".
"Host" and "filename" may have different lengths. I would like to calculate a checksum for the part of this string starting from "?". I understand I can use String.indexOf() to find the position of the question mark (one loop) and after that iterate from this position to the end of the string to calculate the checksum (another loop), but I am looking for a more "elegant" solution, like calculating the checksum "on the fly" while the string is being assembled.

There are tons of ways to calculate a checksum, from the simplest (summing the bytes of your String into one byte, so you get a 1-byte checksum) to more complicated algorithms working on multiple bytes.

On the ESP platform, there is an embedded MD5 library. MD5 is a hash function producing a 128-bit value and was initially designed to be used as a cryptographic hash function. Because it is pretty weak and vulnerable, we don't use it for that anymore, but it can still be used as a checksum to verify data integrity against unintentional corruption.
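To make the "simplest" end of that range concrete, here is a minimal sketch of an additive checksum in one byte and in one 16-bit word. It is plain C++ so it can be compiled on a desktop as well; the function names sum8 and sum16 are made up for this illustration and do not come from any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 8-bit additive checksum: sum of all bytes, wrapping around at 256.
uint8_t sum8(const void* data, size_t len) {
  assert(data != nullptr || len == 0);
  const uint8_t* bytes = static_cast<const uint8_t*>(data);
  uint8_t sum = 0;
  for (size_t i = 0; i < len; i++) {
    sum = static_cast<uint8_t>(sum + bytes[i]);
  }
  return sum;
}

// 16-bit variant: same loop, but the sum only wraps around at 65536.
uint16_t sum16(const void* data, size_t len) {
  assert(data != nullptr || len == 0);
  const uint8_t* bytes = static_cast<const uint8_t*>(data);
  uint16_t sum = 0;
  for (size_t i = 0; i < len; i++) {
    sum = static_cast<uint16_t>(sum + bytes[i]);
  }
  return sum;
}
```

Both variants notice a single corrupted byte, but neither notices two fields swapping places, because addition does not care about order. That is usually acceptable for guarding against accidental corruption, which is the stated goal here.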

J-M-L:
There are tons of ways to calculate a checksum, from the simplest (summing the bytes of your String into one byte, so you get a 1-byte checksum) to more complicated algorithms working on multiple bytes.

On the ESP platform, there is an embedded MD5 library. MD5 is a hash function producing a 128-bit value and was initially designed to be used as a cryptographic hash function. Because it is pretty weak and vulnerable, we don't use it for that anymore, but it can still be used as a checksum to verify data integrity against unintentional corruption.

Maybe I did not explain correctly what I need… I am not asking HOW to calculate a checksum; I know that, and a simple checksum in one byte or one word is OK for me. I am asking WHERE the best place to do it is, to avoid extra loops and extra complexity in the code.

OK indeed you asked

What is the simplest way to calculate checksum of the data part of String

so it was unclear that the question was more about where to put the code...

I'd say do it when you build the "network frame" to send the String... that's where you need it.

J-M-L:
OK indeed you asked [...] so it was unclear that the question was more about where to put the code…

I’d say do it when you build the “network frame” to send the String… that’s where you need it.

Here is my function returning that String

String combineREQString()
{
  String REQString = "http://" + String(NASServer) + NASServerPage + "?";
  //checksum accumulation should start here
  for (byte i=0; i<SensorsNum; i++)
    {
    REQString += "&SID";
    REQString += SID_arr[i];
    REQString += "=";
    REQString += stat[i].average(); REQString += ",";
    REQString += stat[i].minimum(); REQString += ",";
    REQString += stat[i].maximum(); REQString += ",";
    REQString += stat[i].pop_stdev();
    }
  REQString += "&HUBID="; REQString += SensorHubID;
  REQString += "&CNT="; REQString += SentCNT;
  REQString += "&PWD="; REQString += Password;
  //checksum accumulation should stop here
  return REQString;
}

It returns string like “http://some_host/somepage.php?&SID12=27.25,27.25,27.25,0.00&SID17=139.28,138.40,139.60,0.68&SID18=29.29,27.90,29.50,0.38&HUBID=2&CNT=2672&PWD=somepass”.

I would like to calculate the checksum from “?” up to the end. My idea is to derive my own class from String and add this functionality there, similar to what I did before for EthernetClient (see my sample above). That way I can calculate the checksum “on the fly” and avoid the extra search for the question mark and the extra loop for the checksum. But I cannot work out which method of String I should override to inject my code…

There are so many ways to build and extend a String. The overloaded '+' operator would be an obvious start, but you can do many other things to a String.

But if what you add is not just a "?", you would have to scan the cString at every '+' to see if there is a question mark somewhere... that sounds totally sub-optimal.

If you really want to subclass, you could envision adding a method that returns a new string with the checksum or something... but do that the lazy way, searching for the '?' only on demand.

J-M-L:
There are so many ways to build and extend a String. The overloaded '+' operator would be an obvious start, but you can do many other things to a String.

OK, good thought. I have never overloaded the "+" operator, but I will look into it, thank you...

J-M-L:
But if what you add is not just a "?", you would have to scan the cString at every '+' to see if there is a question mark somewhere... that sounds totally sub-optimal.

This one I did not get, can you please explain? My idea was NOT to scan the string, but to reset a private checksum variable (another simple method) right after adding "?" and to get the checksum (one more simple method) after assembling the whole string.

SergeS:
This one I did not get, can you please explain? My idea was NOT to scan the string, but to reset a private checksum variable (another simple method) right after adding "?" and to get the checksum (one more simple method) after assembling the whole string.

If you create a class, it has to be generic and cover all use cases.

At the moment you do
String REQString = "http://" + String(NASServer) + NASServerPage + "?";
So you build up your String step by step, and at one point you ask to extend the String with 1 character, which is '?'.
In that case it's pretty easy to recognize in the '+' method, as you don't have to parse the extension to find the '?'.

But what happens if you knew that NASServerPage is "/foobar" and you were doing
String REQString = "http://" + String(NASServer) + "/foobar?";
Then the extension string is "/foobar?", and thus finding the '?' requires parsing (similarly, the '?' could be anywhere, not always the first or last character).

--> it means you need to parse the data at every '+' operation to search for the '?'... that seems totally inefficient (as is the whole String class and concatenating this way)

J-M-L:
→ it means you need to parse the data at every ‘+’ operation to search for the ‘?’…

Not at all! You have missed the point. Here is a sample of how I did it before (ExtEthernetClient is my class inherited from EthernetClient, see my first post in this topic); there is no search at all. Look at the lines marked with “//<<<<”. I am trying to use a similar approach with String.

ExtEthernetClient client;

void sendREQ() 
{
  unsigned long bytessent, checksum;
  if (client.connect(NASServer, 80)) 
  {
   client.print(F("GET ")); client.print(NASServerPage); client.print(F("?")); 


  client.ResetBytesSent(); client.ResetCheckSum();   //<<<<  Look here... Reset checksum accumulation.


  for (byte i=0; i<SensorsNum; i++)
    {
    client.print(F("&SID"));
    client.print(SID_arr[i]);
    client.print(F("="));
    client.print(stat[i].average());   client.print(F(","));
    client.print(stat[i].minimum());   client.print(F(","));
    client.print(stat[i].maximum());   client.print(F(","));
    client.print(stat[i].pop_stdev()); 
    }
  client.print(F("&HUBID=")); client.print(SensorHubID); 
  client.print(F("&CNT=")); client.print(SentCNT); 
  client.print(F("&PWD=")); client.print(Password); 


  bytessent = client.BytesSent(); checksum = client.CheckSum();   //<<<< ... and here! Checksum is ready.

  
  client.println(F(" HTTP/1.0")); 
  client.println(F("Host: SomeHost"));
  client.println();
  ........

I’m confused.

here you are subclassing the network classes, which makes sense because this is where you actually know what is being sent and you are explicitly doing a client.ResetBytesSent(); client.ResetCheckSum();

Do you want to create a SubClass of the String class (CkSumString) and do something like this

  CkSumString REQString = "http://" + String(NASServer) + NASServerPage + "?";
  REQString.ResetCheckSum();

  for (byte i=0; i<SensorsNum; i++) {
    REQString += "&SID";
    REQString += SID_arr[i];
    REQString += "=";
    REQString += stat[i].average(); REQString += ",";
    REQString += stat[i].minimum(); REQString += ",";
    REQString += stat[i].maximum(); REQString += ",";
    REQString += stat[i].pop_stdev();
  }
  REQString += "&HUBID="; REQString += SensorHubID;
  REQString += "&CNT="; REQString += SentCNT;
  REQString += "&PWD="; REQString += Password;

  checksum = REQString.CheckSum();

or do you want the CkSumString to auto-detect the ‘?’

The String class is so costly (it moves things around in memory and reallocates on the heap) that finding the ‘?’ and calculating the checksum at the end would not be very costly in comparison… (basically you have to search for the ‘?’ if you don’t pin the position with a REQString.ResetCheckSum();…) The checksum cost is the same whether you do it at the end or on the go.

Also look at the class source code and all the ways to create and extend a String:

	unsigned char concat(const String &str);
	unsigned char concat(const char *cstr);
	unsigned char concat(char c);
	unsigned char concat(unsigned char c);
	unsigned char concat(int num);
	unsigned char concat(unsigned int num);
	unsigned char concat(long num);
	unsigned char concat(unsigned long num);
	unsigned char concat(float num);
	unsigned char concat(double num);
	unsigned char concat(const __FlashStringHelper * str);

	// if there's not enough memory for the concatenated value, the string
	// will be left unchanged (but this isn't signalled in any way)
	String & operator += (const String &rhs)	{concat(rhs); return (*this);}
	String & operator += (const char *cstr)		{concat(cstr); return (*this);}
	String & operator += (char c)			{concat(c); return (*this);}
	String & operator += (unsigned char num)		{concat(num); return (*this);}
	String & operator += (int num)			{concat(num); return (*this);}
	String & operator += (unsigned int num)		{concat(num); return (*this);}
	String & operator += (long num)			{concat(num); return (*this);}
	String & operator += (unsigned long num)	{concat(num); return (*this);}
	String & operator += (float num)		{concat(num); return (*this);}
	String & operator += (double num)		{concat(num); return (*this);}
	String & operator += (const __FlashStringHelper *str){concat(str); return (*this);}

	friend StringSumHelper & operator + (const StringSumHelper &lhs, const String &rhs);
	friend StringSumHelper & operator + (const StringSumHelper &lhs, const char *cstr);
	friend StringSumHelper & operator + (const StringSumHelper &lhs, char c);
	friend StringSumHelper & operator + (const StringSumHelper &lhs, unsigned char num);
	friend StringSumHelper & operator + (const StringSumHelper &lhs, int num);
	friend StringSumHelper & operator + (const StringSumHelper &lhs, unsigned int num);
	friend StringSumHelper & operator + (const StringSumHelper &lhs, long num);
	friend StringSumHelper & operator + (const StringSumHelper &lhs, unsigned long num);
	friend StringSumHelper & operator + (const StringSumHelper &lhs, float num);
	friend StringSumHelper & operator + (const StringSumHelper &lhs, double num);
	friend StringSumHelper & operator + (const StringSumHelper &lhs, const __FlashStringHelper *rhs);

it would be quite a massive amount of code for little gain.
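One way to get the "reset after the '?', read the sum at the end" behaviour without touching that long list of overloads is composition instead of inheritance: a small wrapper that owns the text and funnels every append through a single method. The sketch below is only an illustration under assumptions. ChecksumBuilder, add(), resetCheckSum() and checkSum() are hypothetical names, and std::string plus std::ostringstream stand in for Arduino's String so that it compiles on a desktop. On a board one would presumably build the piece with String(value) instead, and number formatting (for example the decimals of a float) may differ from what the String class produces.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: owns the text and keeps a running 16-bit additive
// checksum of everything appended since the last resetCheckSum().
class ChecksumBuilder {
public:
  // Single funnel for every append, so the checksum logic lives in one place.
  template <typename T>
  ChecksumBuilder& add(const T& value) {
    std::ostringstream piece;
    piece << value;                        // format the value as text
    const std::string text = piece.str();
    for (unsigned char c : text) {
      sum_ = static_cast<uint16_t>(sum_ + c);
    }
    text_ += text;
    return *this;
  }

  void resetCheckSum() { sum_ = 0; }
  uint16_t checkSum() const { return sum_; }
  const std::string& str() const { return text_; }

private:
  std::string text_;
  uint16_t sum_ = 0;
};

// Usage: pin the start of the summed region explicitly, no search for '?'.
void usageExample() {
  ChecksumBuilder req;
  req.add("http://host/page.php").add("?");
  req.resetCheckSum();                     // everything after this is summed
  req.add("x=").add(1).add("&y=").add(2);
  assert(req.str() == "http://host/page.php?x=1&y=2");
  assert(req.checkSum() == 500);           // 16-bit sum of "x=1&y=2"
}
```

The trade-off is that the wrapper is not a String, so the final text has to be handed over explicitly (str() here) to whatever API expects one.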

J-M-L:
here you are subclassing the network classes, which makes sense because this is where you actually know what is being sent and you are explicitly doing a client.ResetBytesSent(); client.ResetCheckSum();

Yep!

J-M-L:
Do you want to create a SubClass of the String class (CkSumString) and do something like this [...] or do you want the CkSumString to auto-detect the '?'

No, I do not want auto-detection, I want to do it like this! By the way, it could be more universal: this subclass may be used in other cases where a checksum is needed in conjunction with a String.

J-M-L:
The String class is so costly (it moves things around in memory and reallocates on the heap) that finding the '?' and calculating the checksum at the end would not be very costly in comparison... (basically you have to search for the '?' if you don't pin the position with a REQString.ResetCheckSum();..) The checksum cost is the same whether you do it at the end or on the go.

Maybe you are right, but it is not an elegant solution :)

SergeS:
Maybe you are right, but it is not an elegant solution :)

Well, here is a high-level demonstration of what it would take to play around and subclass the String class to add an automatic checksum calculation "every time" you modify the URL.

class CheckSumURLString : public String
{
  public:
    // constructor (redefining only a subset)
    CheckSumURLString(const char* cstr) : String(cstr), _checkSum(-1) {
      computeCheckSum();
    }


    // the assignment operator CANNOT be inherited (redefining only a subset)
    // with a String as parameter
    CheckSumURLString& operator = (const String &rhs)
    {
      CheckSumURLString& res = (CheckSumURLString&) String::operator = (rhs);
      computeCheckSum();
      return res;
    }

    // with a const char* as parameter
    CheckSumURLString& operator = (const char* cstr)
    {
      CheckSumURLString& res = (CheckSumURLString&) String ::operator = (cstr);
      computeCheckSum();
      return res;
    }

    // redefining the += operator (only for a subset)

    CheckSumURLString& operator += (const char *cstr)
    {
      String::operator +=(cstr);
      computeCheckSum();
      return (*this);
    }


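    int getCheckSum() // int rather than uint8_t, so the -1 "invalid" marker survives the return
    {
      Serial.println(c_str()); // c_str() is public; direct access to the protected buffer depends on the core
      if (_checkSum == -1) {
        Serial.println(F("No checksum, missing '?'"));
      } else {
        Serial.print(F("Checksum="));
        Serial.println(_checkSum);
      }
      Serial.println();
      return _checkSum;
    }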

  private:
    int _checkSum;

    void computeCheckSum()
    {
      const char* ptr = strchr(c_str(), '?'); // c_str() instead of the protected buffer, for portability
      uint8_t tempCheckSum = 0;

      if (ptr != nullptr) {
        ptr++; // skip the '?' itself
        while (*ptr) tempCheckSum += *ptr++; // compute on 1 uint8_t
        _checkSum = tempCheckSum; // promote to int, only LSB used
      }
      else _checkSum = -1; // -1 to denote invalid check sum
    }
};

CheckSumURLString foo = "http://mydomain.fr/"; // this calls the constructor with a const char *

void setup()
{
  Serial.begin(115200);

  foo.getCheckSum(); // will be -1 (error) as there is no '?'

  foo += "helloWorld?"; // calls the overloaded += operator with a const char * and update check sum
  foo.getCheckSum(); // will be 0 as there is nothing after the '?'

  foo += "x=1";  // calls the overloaded += operator with a const char * and update check sum
  foo.getCheckSum();

  foo += "&y=2";  // calls the overloaded += operator with a const char *
  foo.getCheckSum();
}

void loop() {}

If you run it you'll see the following in the console (at 115200 baud):

http://mydomain.fr/
No checksum, missing '?'

http://mydomain.fr/helloWorld?
Checksum=0

http://mydomain.fr/helloWorld?x=1
Checksum=230

http://mydomain.fr/helloWorld?x=1&y=2
Checksum=244

Note1: I say "every time" but that is not really the case. I would have to implement all the possible variations of the overloaded operators +, += and =, plus the constructors (direct initialization, copy and move), to ensure the checksum is recalculated whenever the String changes... As you can see, it can become tedious.

Note2: the assignment operator CANNOT be inherited so you'll have to redefine those even if you only wanted to add a method to calculate the checksum.
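A side note on Note2, as a sketch rather than a tested Arduino recipe: in standard C++ the base class assignment overloads are hidden in the derived class, but a using-declaration can bring them back into scope when they do not need extra behaviour. That covers the "only wanted to add a method" case; it would not help the class above, which has to recompute the checksum after each assignment. The Text class below is a made-up stand-in for Arduino's String so the example compiles on a desktop, and SummedText and sum8() are hypothetical names. I have not checked this against a specific Arduino core.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for Arduino's String (assumption, desktop-compilable only).
class Text {
public:
  Text() = default;
  Text(const char* s) : data_(s) {}
  Text& operator=(const char* s) { data_ = s; return *this; }
  const char* c_str() const { return data_.c_str(); }
private:
  std::string data_;
};

// Derived class that only adds a method and re-exposes what the base offers.
class SummedText : public Text {
public:
  using Text::Text;        // inherit the base constructors (C++11)
  using Text::operator=;   // make the base assignment overloads visible again

  // 8-bit additive checksum of the whole text.
  uint8_t sum8() const {
    uint8_t sum = 0;
    for (const char* p = c_str(); *p != '\0'; ++p) {
      sum = static_cast<uint8_t>(sum + static_cast<uint8_t>(*p));
    }
    return sum;
  }
};

void usageExample() {
  SummedText t;
  t = "x=1";                    // resolves to Text::operator=(const char*)
  assert(t.sum8() == 230);
  SummedText u("&y=2");         // inherited constructor
  assert(u.sum8() == 14);       // 270 wrapped to 8 bits
}
```

Note that the re-exposed operator still returns a reference to the base type, which only matters if the result of the assignment is used as the derived type.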

--> IMHO this is not worth the effort of subclassing; just create a general function that takes a String reference as parameter and calculates the checksum... Much simpler = more elegant :)

J-M-L:
→ IMHO this is not worth the effort of subclassing; just create a general function that takes a String reference as parameter and calculates the checksum… Much simpler = more elegant :)

Yes, it seems to be too complicated. You are probably right: a simple function with .indexOf("?") and a for-loop will do the trick. I have no experience with overloading operators, so thank you for the help.

SergeS:
Yes, it seems to be too complicated. You are probably right: a simple function with .indexOf("?") and a for-loop will do the trick. I have no experience with overloading operators, so thank you for the help.

for efficiency, I would not use the class methods (indexOf, charAt(), ..), I would write the function accessing the underlying storage buffer this way:

int computeCheckSum(const String& s) // const reference, so a temporary such as url + "&y=2" can be passed
{
  int _checkSum = -1; // undefined by default
  uint8_t tempCheckSum = 0;
  const char * doNotMessWithThisbuffer = s.c_str();
  const char* ptr = strchr(doNotMessWithThisbuffer, '?');

  if (ptr != nullptr) {
    ptr++; // skip the '?' itself
    while (*ptr) tempCheckSum += *ptr++; // compute on a uint8_t
    _checkSum = tempCheckSum; // promote to int, only LSB used
  }
  return (_checkSum);
}

String url = "http://mydomain.fr/helloWorld?x=1";

void setup()
{
  Serial.begin(115200);
  Serial.println(computeCheckSum(url));
  Serial.println(computeCheckSum(url+"&y=2"));
}

void loop() {}

If you run it you'll see the following in the console (at 115200 baud):

230
244

which matches the result above

and if I had to do it myself, I would just completely get rid of the String class and use cStrings...

J-M-L:

_checkSum = tempCheckSum; // promote to int, only LSB used

I did not get why you play with two different types for the checksum. What is the logic behind it?

J-M-L:
and if I had to do it myself, I would just completely get rid of the String class and use cStrings...

I am using HTTPClient and did not see any other way to pass the request to the server; recalling from memory (I cannot look right now), there is only a .get method with a String parameter. But I need to review it, maybe I am wrong. Why so prejudiced against Strings? It seems to work well on the ESP8266, although on an Arduino UNO I had a lot of problems with it.

SergeS:
I did not get why you play with two different types for the checksum. What is the logic behind it?

I want to be able to return -1 (a signed number) when the checksum can't be computed. So _checkSum is a signed int (16 bits on AVR boards, 32 bits on the ESP8266), whereas I want to compute the checksum using the rollover behaviour of an unsigned byte, which is the type of tempCheckSum.

(It's like the read() function of a Stream. It actually returns an int, not a byte. That int is -1 if read() failed; otherwise the upper bits are zero and the byte you received is in the LSB.)
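The practical consequence of that convention is that the caller has to test for -1 while the value is still an int; once it is narrowed to a byte, -1 turns into 255, which is also a perfectly valid checksum. A minimal sketch of both sides, with hypothetical helper names (narrowedTooEarly and extractCheckSum are not from the thread or from any library):

```cpp
#include <cassert>
#include <cstdint>

// Result convention described above: -1 means "no value",
// anything from 0 to 255 is a real byte.
constexpr int kNoCheckSum = -1;

// What goes wrong: narrowing before testing maps -1 onto 255.
inline uint8_t narrowedTooEarly(int result) {
  return static_cast<uint8_t>(result);
}

// Test for the sentinel first, narrow afterwards.
inline bool extractCheckSum(int result, uint8_t& out) {
  if (result < 0) {
    return false;               // nothing to extract, 'out' is left untouched
  }
  out = static_cast<uint8_t>(result);
  return true;
}

void usageExample() {
  uint8_t value = 0;
  assert(narrowedTooEarly(kNoCheckSum) == 255);   // sentinel is lost
  assert(!extractCheckSum(kNoCheckSum, value));   // sentinel is detected
  assert(extractCheckSum(244, value) && value == 244);
}
```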

SergeS:
I am using HTTPClient and did not see any other way to pass the request to the server; recalling from memory (I cannot look right now), there is only a .get method with a String parameter. But I need to review it, maybe I am wrong. Why so prejudiced against Strings? It seems to work well on the ESP8266, although on an Arduino UNO I had a lot of problems with it.

I think you are right that their API supports only the String class. It's not that String is OK on an ESP: you potentially still have the same heap-fragmentation issues; it is just that, because you have way more memory, the problem is pushed to later...
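For what the cString route could look like, here is a minimal sketch, assuming a fixed-size buffer is acceptable: the request is formatted with snprintf, and because the code itself writes the '?', it already knows where the query part starts and needs no search. buildRequest and its parameters are hypothetical, only two of the fields from the thread are included to keep it short, and the finished buffer would still have to be handed to the String-based client API at the end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: writes "<base>?&HUBID=<id>&CNT=<count>" into 'out'
// and returns the 8-bit additive checksum of everything after the '?',
// or -1 if the buffer is too small.
int buildRequest(char* out, size_t cap, const char* base,
                 int hubId, unsigned long count) {
  assert(out != nullptr && base != nullptr);

  int n = std::snprintf(out, cap, "%s?", base);
  if (n < 0 || static_cast<size_t>(n) >= cap) return -1;

  // Position right after the '?', known without any search.
  const size_t queryStart = static_cast<size_t>(n);
  const size_t room = cap - queryStart;

  n = std::snprintf(out + queryStart, room, "&HUBID=%d&CNT=%lu", hubId, count);
  if (n < 0 || static_cast<size_t>(n) >= room) return -1;

  uint8_t sum = 0;
  for (const char* p = out + queryStart; *p != '\0'; ++p) {
    sum = static_cast<uint8_t>(sum + static_cast<uint8_t>(*p));
  }
  return sum;
}
```

The buffer never grows or moves, so there is nothing to fragment; the price is choosing a capacity up front and handling the "too small" case.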