Spike elimination when parsing data

Hello. I have a problem that I am unable to solve. I believe there is something I am not seeing, so I am looking forward to your suggestions.

I am connecting an Arduino to a Mega through serial and receiving data. Occasionally (1-2 times per day) a message arrives "corrupted". For example, the outside temperature shows the value of the barometric pressure. So I decided to drop those measurements, because the next message will carry the correct value. Somehow it doesn't work.

My code:

void b4_parseData() {
	char * strtokIndx4;
	float S13t, S14t, S15t;
	int S16t;

	if (l < 10){
		strtokIndx4 = strtok(tempChars4,",");
		S13 = atof(strtokIndx4);

		strtokIndx4 = strtok(NULL, ",");
		S14 = atof(strtokIndx4);

		strtokIndx4 = strtok(NULL, ",");
		S15 = atof(strtokIndx4);
	
		strtokIndx4 = strtok(NULL, ",");
		S16 = atoi(strtokIndx4);
	}
	else {
		strtokIndx4 = strtok(tempChars4,",");
		S13t = atof(strtokIndx4);

		strtokIndx4 = strtok(NULL, ",");
		S14t = atof(strtokIndx4);

		strtokIndx4 = strtok(NULL, ",");
		S15t = atof(strtokIndx4);
	
		strtokIndx4 = strtok(NULL, ",");
		S16t = atoi(strtokIndx4);
		
		if(((S13t - S13) < 2.0) || ((S13 - S13t) < 2.0)){
			S13 = S13t;
		}
		
		if(((S14t - S14) < 2.0) || ((S14 - S14t) < 2.0)){
			S14 = S14t;
		}
		
		if(((S15t - S15) < 2.0) || ((S15 - S15t) < 2.0)){
			S15 = S15t;
		}
		
		if (S16t > 0){
			S16 = S16t;
		}
		if(S16t == 0){
			if (S16 < 20){
				S16 = S16t;
			}
		}
	}
}

In case anyone is wondering about my variables:
S13 = Temperature
S14 = Humidity
S15 = Barometric Pressure
S16 = Luminance (which is working fine)

Thanks

A median filter is the simplest, most effective way of eliminating noisy data.
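A minimal sketch of the idea, assuming a three-sample window per sensor channel (the window size and the `Median3` name are illustrative, not something from this thread). A single corrupted reading, such as a pressure value landing in the temperature slot, is never the middle of three samples, so it never reaches the output.

```cpp
// Three-sample median filter for one sensor channel (hypothetical helper).
struct Median3 {
    float s[3] = {0, 0, 0};   // the three most recent samples
    int   count = 0;          // how many samples have been seen (max 3)

    float update(float x) {
        s[0] = s[1]; s[1] = s[2]; s[2] = x;    // shift in the newest sample
        if (count < 3) count++;
        if (count < 3) return x;               // not enough history yet

        // Sort local copies with three compare-and-swap steps.
        float a = s[0], b = s[1], c = s[2];
        if (a > b) { float t = a; a = b; b = t; }
        if (b > c) { float t = b; b = c; c = t; }
        if (a > b) { float t = a; a = b; b = t; }
        return b;                              // middle value
    }
};
```

One filter object would be kept per variable, and each parsed value passed through `update()`. The trade-off is that a genuine change shows up one sample later, and two corrupted readings in a row would still get through a window of three.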

can you post the raw strings with both good and corrupted data?

One of the great things about a modern programming language is that you can choose sensible, meaningful names for your variables.

That way, when you come back to this project in six months, you won't have to remember what those completely useless identifiers mean.

Yep. I've read about it, but it is not for me right now.

The problem here is that the message is something like:

<19.1,44.3,951.3,60>

and the parsing may lose the start point. I realized this when I first started using it and captured everything back then. If you think this matters, let me capture all the messages again for 24 hours to find it. The measurements arrive every second, so you can see that I have one error in about 45k cycles. It is not important, but it ruins my graphs.

I understand you, but consider that not everyone thinks and works like you. My project has 70 variables, organized in groups of ten or more. I could use something like board1_outside_north_temperature, but I cannot use that in MySQL (especially when I call it from multiple PHP pages). So when I use a name like "S13", it is global: in the Arduinos, in MySQL, in PHP.

are you suggesting your code loses sync with the strings being received?

yes, it would be good to collect the raw data to see if it is bad or the processing has a bug.

how big is the buffer collecting these strings? what method is used? how are the "<>" handled?

what is "l"? length of the string -- "<19.1,44.3"?

Fascinating!

Yes. As I said, I realized it when I started coding this. I eliminated all delays in my project with the help of Robin2 back then. It is a Mega that collects from 4 other Megas and Nanos through serial.
The processing has no bug. It may be the distance, the weather, the parallel cables or whatever. This happens only once or twice per day, so there is no bug in my code. However, I will re-program it tomorrow to start collecting the data.

The buffer is 23 at most. The method is the one from "Serial Input Basics".

The "l" you see in my code is a loop counter that I use for many jobs. One of them is to accept the data as-is for the first 10 loops and, after the 10th loop, to drop anything that looks wrong. Of course it resets back to 10 when it reaches 32767.

please capture and post the raw strings

you might consider using readBytesUntil()

it's not at all obvious what "l" is doing

Do you realize that code can be reduced to this?

void b4_parseData() {
	char * strtokIndx4;
	float S13t, S14t, S15t;
	int S16t;

	strtokIndx4 = strtok(tempChars4,",");
	S13 = atof(strtokIndx4);

	strtokIndx4 = strtok(NULL, ",");
	S14 = atof(strtokIndx4);

	strtokIndx4 = strtok(NULL, ",");
	S15 = atof(strtokIndx4);
	
	strtokIndx4 = strtok(NULL, ",");
	S16 = atoi(strtokIndx4);

	if (l >= 10){
		if(((S13t - S13) < 2.0) || ((S13 - S13t) < 2.0)){
			S13 = S13t;
		}
		
		if(((S14t - S14) < 2.0) || ((S14 - S14t) < 2.0)){
			S14 = S14t;
		}
		
		if(((S15t - S15) < 2.0) || ((S15 - S15t) < 2.0)){
			S15 = S15t;
		}
		
		if (S16t > 0){
			S16 = S16t;
		}
		if(S16t == 0){
			if (S16 < 20){
				S16 = S16t;
			}
		}
	}
}

It could be reduced a lot further using SerialTransfer.h, not to mention that the library will detect and ignore corrupt packets, which would solve OP's issue.
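For readers who want to see what "detect and ignore corrupt packets" can look like without a library, here is a hand-rolled sketch. It is not how SerialTransfer.h works internally; it simply appends an XOR checksum as `*HH` to the comma-separated payload, in the style of NMEA sentences. The function names and the `*HH` suffix are assumptions made for illustration.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// XOR of all payload bytes.
unsigned char xorChecksum(const char *s, size_t n) {
    unsigned char sum = 0;
    for (size_t i = 0; i < n; ++i) sum ^= (unsigned char)s[i];
    return sum;
}

// Sender side: append "*HH" (two hex digits) to the payload in msg.
void appendChecksum(char *msg, size_t size) {
    size_t n = strlen(msg);
    if (n + 4 > size) return;                  // no room: leave unchanged
    snprintf(msg + n, size - n, "*%02X", xorChecksum(msg, n));
}

// Receiver side: true if msg ends in "*HH" and HH matches the payload.
// On success the "*HH" suffix is cut off so the payload can be parsed.
bool verifyAndStrip(char *msg) {
    char *star = strrchr(msg, '*');
    if (star == NULL || strlen(star) != 3) return false;
    char *end;
    unsigned long got = strtoul(star + 1, &end, 16);
    if (*end != '\0') return false;            // not two hex digits
    if (got != xorChecksum(msg, (size_t)(star - msg))) return false;
    *star = '\0';
    return true;
}
```

The receiver would call `verifyAndStrip()` on each completed message and skip parsing when it returns false. A one-byte XOR is a weak check (some multi-character errors cancel out), so a CRC would be the sturdier choice if errors are frequent.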

I'll try.

You don't give values to SXXt. I am assigning to the "t" variables so that I can compare them with the existing ones.

Yep. Tested, and it did not work. Only errors. It may work great in a small setup, but unfortunately not with long distances, many variables and complex programming.
I tested on my desk with 4 floats. It worked great. I reprogrammed with 8 floats, send only (no receive, full duplex), about 22 meters of cable, and I got only errors.

How to eliminate the errors is a very big conversation, and it is not what I am looking for. I did my research long ago and decided that it is not possible to eliminate them all in real conditions. It would be very difficult to determine whether the errors come from a noisy environment or from something else. My next upgrade is to switch the serial connections to RS485 and test with 4 wires (full duplex), which uses higher voltages, to see whether the errors go to zero. As you can see in my graph some posts back, there are 2 errors per day with one cycle per second. That is 2 in 86400. It is small, and I can definitely drop these 2 errors. This is what I am looking for.
As I said, I am taking measurements from another 3 Arduinos; there I have an error "threshold" (depending on the sensor) and drop any value that arrives outside of it. This works fine for my project, because in the next second I will have the right value.
Anyway, thank you all for your answers.

What I posted EXACTLY replicates the functionality of your code in the first post of this thread.

if (fabs(S14t - S14) < 2.0) S14 = S14t;
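To spell out why the `fabs()` form behaves differently from the test in the first post: for any two finite values, one of `(new - old)` and `(old - new)` is zero or negative, so it is always less than 2.0 and the `||` makes the original condition true every time. A small side-by-side sketch (the function names are made up for illustration):

```cpp
#include <math.h>

// Test as written in the first post: true for any pair of finite values,
// because one of the two differences is always <= 0 and therefore < 2.0.
bool originalTest(float candidate, float current) {
    return ((candidate - current) < 2.0) || ((current - candidate) < 2.0);
}

// Intended test: accept only if the absolute change is below the limit.
bool withinLimit(float candidate, float current, float limit) {
    return fabs(candidate - current) < limit;
}
```

With a temperature of 19.1 and a stray 951.3 arriving, `originalTest` accepts the value while `withinLimit(..., 2.0)` rejects it. One thing to keep in mind with a hard limit: if the stored value itself ever ends up wrong, or the real reading jumps by more than the limit, every later reading is rejected until something resets it.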

Great. Thank you!

If you insist...

gcjr, I logged the entire message to MySQL and did some research today. First of all, my code...

Inside loop:

	board4_receive();
	if (newData4 == true) {
		data_log (); //this is temporary to capture the message
		strcpy(tempChars, receivedChars);
		b4_parseData();
		newData4 = false;
		start4 = millis();
	}
	else {
		// some functions to alert me that it is not working
	}

Now the function to receive:

void board4_receive() {
	static boolean recvInProgress = false;
	static byte ndx = 0;
	char startMarker = '<';
	char endMarker = '>';
	char rc;

	while (Serial2.available() > 0 && newData == false) {
		rc = Serial2.read();
		if (recvInProgress == true) {
			if (rc != endMarker) {
				receivedChars4[ndx] = rc;
				ndx++;
				if (ndx >= numChars) {
					ndx = numChars - 1;
				}
			}
			else {
				receivedChars[ndx] = '\0'; // terminate the string
				recvInProgress = false;
				ndx = 0;
				newData = true;
			}
		}
		
        else if (rc == startMarker) {
			recvInProgress = true;
		}
	}
}

The parsing function is the one already posted.

What I saw in MySQL was:

3.5,80.4,961.3,0 	2022-02-24 06:40:01
9.5,961.4,0 	2022-02-24 06:40:03
3.5,79.0,961.3,0 	2022-02-24 06:40:06

What I see here is that 5 chars are missing from the beginning.
Let me explain what these values are.

  1. The first is outside temperature, which can have values from -10.0 to 38.0 (3 to 5 characters)
  2. The second is outside humidity, which has values from xx.x to 100.0 (up to 5 characters)
  3. The third is barometric pressure, which is fairly stable around 960.0 (5 characters)
  4. The last one is luminance, which has values from 0 to 10000 (1 to 5 characters)

So we have a minimum of 14 and a maximum of 20 (so I am using const byte numChars4 = 22;)
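A quick check of that arithmetic, counting the three commas as well. The two example strings are built from the ranges listed above and are illustrative only; they assume humidity always has at least two integer digits, as "xx.x" suggests.

```cpp
#include <string.h>

// Shortest and longest payloads for the listed ranges, without the
// '<' and '>' markers (the receive function strips those).
const char shortestMsg[] = "3.5,10.0,960.0,0";
const char longestMsg[]  = "-10.0,100.0,960.0,10000";

// Buffer size needed for the longest payload plus the terminating '\0'.
const size_t requiredChars = sizeof(longestMsg);
```

`strlen()` gives 16 for the shortest and 23 for the longest payload, so the buffer would need 24 bytes rather than 22 to hold the worst case. The receive function clamps the index, so an overlong message would overwrite its last character instead of running past the buffer. This does not by itself explain the 5 characters missing at the start of the logged message.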

Any suggestions?

Thank you.

need to have a reliable routine to receive strings. it's not a parsing issue if the data is not received correctly

why is there a receivedChars4 and receivedChars?

why bother, something must be wrong?

always leery of receive code using flags (e.g. newData, recvInProgress).

consider

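// returns the length of a completed message, or 0 if none is complete yet
int
rx (
    char *buf,
    int   size,
    char  startMarker,
    char  endMarker )
{
    static int  idx = 0;

    while (Serial2.available ())  {
        char rc = Serial2.read ();

        if (rc == startMarker)
            idx = 0;                // (re)start: drop any partial message

        else if (rc == endMarker)  {
            int len = idx;
            buf [idx] = '\0';       // terminate so strtok()/atof() can be used
            idx = 0;
            return len;
        }

        else if (idx < size - 1)    // leave room for the terminator
            buf [idx++] = rc;
    }
    return 0;
}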

for following data

char buf [] =
      "<10.1,11.2,12.3,13.3>"
      "<20.1,21.2,22.3,23.3>"
      "<30.1,31.2,32<.3,33.3>"
      "<40.1,41.2,42.3,43.3>";

rx recognizes

 main: 10.1,11.2,12.3,13.3
 main: 20.1,21.2,22.3,23.3
 main: .3,33.3
 main: 40.1,41.2,42.3,43.3
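The damaged frame in that output can be caught before parsing by counting fields. Below is a sketch that can be run on a PC: `rxChar` is a variant of the receiver above, fed one character at a time instead of reading Serial2, and `countFields` is a hypothetical helper. Both names and the one-character-at-a-time interface are assumptions for illustration.

```cpp
#include <string.h>

// Same start/end-marker logic as the receiver above, one character per call.
// Returns the message length when an end marker completes a message, else 0.
int rxChar(char rc, char *buf, int size) {
    static int idx = 0;
    if (rc == '<') { idx = 0; return 0; }      // (re)start a message
    if (rc == '>') {                           // message complete
        int len = idx;
        buf[idx] = '\0';
        idx = 0;
        return len;
    }
    if (idx < size - 1) buf[idx++] = rc;       // keep room for '\0'
    return 0;
}

// Number of comma-separated fields in a message.
int countFields(const char *msg) {
    if (*msg == '\0') return 0;
    int fields = 1;
    for (; *msg; ++msg)
        if (*msg == ',') ++fields;
    return fields;
}
```

A message would only be handed to the parser when `countFields(buf) == 4`. The fragment `.3,33.3` has two fields and the truncated message logged earlier in the thread, `9.5,961.4,0`, has three, so both would be dropped. A message that loses characters without losing a comma would still pass, so this check complements a range or threshold test and does not replace it.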
