Compressing data from sensors

I have a project that has up to 40 DS18B20 sensors on a chain; the data needs to be collected and sent via a satellite uplink. The sensors are working perfectly and I am formatting the data as a JSON string (I have tried both packed and unpacked JSON).

The sat service charges 1 credit per 50 bytes of data, so I need to squish the data down as small as possible. The sensors need to be set to output in 12-bit resolution.

The JSON string is something like this...

{"NODE1":[-12.99,12.123,99.99,133.999,999.009,99.99,12.123,123.99,133.999,99.99,99.99,12.123,123.99,133.999,999.009,99.99,12.123,123.99,133.999,99.99,99.99,12.123,123.99,133.999,999.009,99.99,12.123,123.99,133.999,99.99,99.99,12.123,123.99,133.999,55.009,99.99,12.123,123.99,26.9997]
}

At present, even when packing the data, it's about 170bits.

Because the resolution is 12-bit, all the sensors are likely to show some variation from the last reading. The system also goes into a deep sleep between each reading to conserve power, so holding the last reading in memory and just sending the variation may not work that well unless I write the last reading to EEPROM, and that has its own issues and will impact the device's longevity in the field.

Has anyone got any creative ideas how I could tackle this problem?

Regards
Andrew

What are the min and max temps you would ever expect to see? Do you really need 1/16 degree C resolution?

Have you considered sending the data as binary?

40 float values would be 160 bytes. Add some additional bytes for an identifier, a length indication and a CRC, and I'm quite sure it will be smaller than what you send now. Creating JSON from that needs to happen at the receiving end.

Even better, send 40 int16_t values as received from the sensors and do all conversions at the receiving end.

In case you decide to stick with ascii, do you need N decimal digits, or will e.g. xx.y be sufficient?
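
To make the binary suggestion concrete, here is a rough sketch of such a frame in Python (the same layout is easy to reproduce on the Arduino and to decode on the server). The frame layout, the names `build_frame` and `parse_frame`, and the choice of CRC-CCITT are assumptions for illustration only; the sample raw values are made up. The only sensor fact relied on is that a DS18B20 raw reading is a signed 16-bit value in 1/16 degree C steps.

```python
import binascii
import struct

def build_frame(node_id, raw_readings):
    """Pack a node id, a count and raw int16 readings, then append a CRC-16."""
    # '<' = little-endian, B = uint8 node id, B = uint8 count, h = int16 each
    body = struct.pack(f"<BB{len(raw_readings)}h",
                       node_id, len(raw_readings), *raw_readings)
    crc = binascii.crc_hqx(body, 0xFFFF)  # CRC-CCITT from the standard library
    return body + struct.pack("<H", crc)

def parse_frame(frame):
    """Check the CRC and return (node_id, temperatures in degrees C)."""
    body = frame[:-2]
    (crc,) = struct.unpack("<H", frame[-2:])
    if binascii.crc_hqx(body, 0xFFFF) != crc:
        raise ValueError("CRC mismatch")
    node_id, count = struct.unpack_from("<BB", body, 0)
    raw = struct.unpack_from(f"<{count}h", body, 2)
    return node_id, [r / 16.0 for r in raw]  # DS18B20 LSB is 1/16 degree C

raw = [-208, 194, 1600] + [400] * 37  # made-up raw values for 40 sensors
frame = build_frame(1, raw)
print(len(frame))  # 84
```

That is 2 header bytes, 80 data bytes and 2 CRC bytes, so roughly half of the 170 bytes of JSON before any further packing.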

sterretje:
Have you considered sending the data as binary?

40 float values would be 160 bytes. Add some additional bytes for an identifier, a length indication and a CRC, and I'm quite sure it will be smaller than what you send now. Creating JSON from that needs to happen at the receiving end.

In case you decide to stick with ascii, do you need N decimal digits, or will e.g. xx.y be sufficient?

Consider Packed Decimal. Each character (at least in Arduino) is 8-bits, but the numbers 0-9 can be represented in 4-bits, so pack each two digits into one char. This should cut your data size almost in half.

123.99 = 6 chars * 8 bits = 48 bits
000100100011111110011001 = 24 bits.

You could send the whole thing as a single stream of bytes. In my example I used 0xf (1111) for the decimal point; the comma delimiter would need a different unused code, such as 0xe (1110).

On the receive end, split the packed decimal back to two chars and then reassemble the original data.
0001 0010 = 12
0011 1111 = 3.
1001 1001 = 99

Does this help?
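
A minimal sketch of the packed-decimal idea, in Python for readability. The nibble codes for ',' and '-' and the padding nibble are assumptions picked for illustration; 0xF for '.' is chosen because it matches the bit pattern in the 123.99 example above. `pack_decimal` and `unpack_decimal` are hypothetical helper names.

```python
# Assumed nibble codes: digits map to themselves, the rest are arbitrary picks.
CODES = {str(d): d for d in range(10)}
CODES.update({",": 0xA, "-": 0xB, ".": 0xF})
CHARS = {v: k for k, v in CODES.items()}
PAD = 0xE  # assumed filler nibble when the character count is odd

def pack_decimal(text):
    """Pack digits and separators two to a byte, high nibble first."""
    nibbles = [CODES[c] for c in text]
    if len(nibbles) % 2:
        nibbles.append(PAD)
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

def unpack_decimal(data):
    """Split each byte back into two nibbles and rebuild the text."""
    nibbles = []
    for b in data:
        nibbles += [b >> 4, b & 0x0F]
    return "".join(CHARS[n] for n in nibbles if n != PAD)

packed = pack_decimal("123.99")
print(packed.hex())  # 123f99
```

This halves the character data, but the separators still cost a nibble each, so it will not get below what a fixed-width binary layout gives.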

At present, even when packing the data, it's about 170bits.

Did you mean 170 bytes? If it is truly 170 bits, that's going to be very hard to beat. The raw data is 12 bits X 40 = 480 bits or 60 bytes. If you meant 170 bytes, that's still only 4 bytes per sensor, which is pretty good I think.

unless I write the last reading to EEPROM, and that has its own issues and will impact the device's longevity

How often is the data sent? EEPROM has a life of at least 100,000 write cycles. If sending once an hour, that's over 10 years.
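
The arithmetic behind that estimate, as a quick sketch. The 100,000-cycle figure is the one quoted above; the half-hourly rate is just an assumed alternative for comparison, and `eeprom_life_years` is a made-up helper name. It assumes the same cell is rewritten on every cycle, with no wear levelling.

```python
def eeprom_life_years(write_cycles, writes_per_day):
    """Years until a single EEPROM cell reaches its rated write-cycle count."""
    return write_cycles / writes_per_day / 365.0

print(round(eeprom_life_years(100_000, 24), 1))  # hourly writes: 11.4
print(round(eeprom_life_years(100_000, 48), 1))  # half-hourly writes: 5.7
```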

The data sheet says:

±0.5°C Accuracy from -10°C to +85°C

Given that, is it worth sending more than one decimal place? And if it isn't, maybe you could send variance data instead of absolute.

Are any of these sensors close together? Can you rely on them having a similar temperature enough that you always send variance data for one of them?

Also, can you make any guarantees about the range of temperatures you'll see? I assume that you don't actually get readings across the entire range the sensor can manage. Might be possible to reduce the data that way.

I like the 60 byte solution - let the far end process your twelve bits. I do think though that the final (or indeed two final) bits are just noise. Dropping them gives you a neat 50 bytes for a single credit.
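
Here is a rough sketch of what the 60-byte and 50-byte layouts could look like, in Python so it can double as the decoder on the receiving end. The function names and the made-up sample values are assumptions; the readings are treated as DS18B20 raw values (signed, 1/16 degree C per count), and dropping two bits is done with a plain right shift.

```python
import math

def pack_bits(values, bits):
    """Concatenate signed integers as fixed-width two's-complement fields."""
    acc = 0
    for v in values:
        if not -(1 << (bits - 1)) <= v < (1 << (bits - 1)):
            raise ValueError(f"{v} does not fit in {bits} bits")
        acc = (acc << bits) | (v & ((1 << bits) - 1))
    total = len(values) * bits
    nbytes = (total + 7) // 8
    acc <<= nbytes * 8 - total  # left-align, zero-pad the last byte
    return acc.to_bytes(nbytes, "big")

def unpack_bits(data, bits, count):
    """Reverse of pack_bits: recover count signed fields of the given width."""
    acc = int.from_bytes(data, "big") >> (len(data) * 8 - count * bits)
    out = []
    for i in range(count):
        field = (acc >> ((count - 1 - i) * bits)) & ((1 << bits) - 1)
        if field >= 1 << (bits - 1):
            field -= 1 << bits  # sign-extend negative values
        out.append(field)
    return out

def credits(nbytes, bytes_per_credit=50):
    """Credits charged for a message, at 1 credit per 50 bytes."""
    return math.ceil(nbytes / bytes_per_credit)

raw = [-208, 194, 1600] + [400] * 37        # made-up raw values, 40 sensors
full = pack_bits(raw, 12)                    # all 12 bits kept
trimmed = pack_bits([r >> 2 for r in raw], 10)  # two noisy bits dropped
print(len(full), credits(len(full)))        # 60 2
print(len(trimmed), credits(len(trimmed)))  # 50 1
```

Note that the 50 bytes are all sensor data here: there is no node identifier or checksum, and adding either would tip the message into a second credit.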

If a 1/4 degree C resolution was enough, 8 bits would give a 64 degree temperature span. For example, B0000 0000 could be -14 C and B1111 1111 could be 49.75.
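
As a sketch of that mapping (the -14 C base and the 1/4 degree step come from the example above; the function names and the clamping of out-of-window readings are assumptions):

```python
def encode_quarter(temp_c, base_c=-14.0):
    """Map a temperature to one byte in 1/4 degree steps above base_c."""
    code = round((temp_c - base_c) * 4)
    return max(0, min(255, code))  # clamp readings outside the window

def decode_quarter(code, base_c=-14.0):
    """Recover the temperature represented by a one-byte code."""
    return base_c + code / 4.0

print(encode_quarter(-14.0), encode_quarter(49.75))  # 0 255
print(decode_quarter(255))  # 49.75
```

Forty sensors at one byte each is 40 bytes, which leaves 10 bytes of the first credit for a header. Clamping silently hides anything outside the window, so it may be worth reserving a code to flag "out of range" instead.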

Why not only send data when a value changes? That would save a lot of data; the same idea is used in sending highly compressed video.

40 sensors is a lot ....

JCA34F:
What are the min and max temps you would ever expect to see? Do you really need 1/16 degree C resolution?

Yes, the sensors are being installed into a borehole on the side of an active volcano. Temperatures are expected to be -30 on the surface, and the estimated temperature inside the hole is about 90 to 100 degrees.

sterretje:
Have you considered sending the data as binary?

40 float values would be 160 bytes. Add some additional bytes for an identifier, a length indication and a CRC, and I'm quite sure it will be smaller than what you send now. Creating JSON from that needs to happen at the receiving end.

Even better, send 40 int16_t values as received from the sensors and do all conversions at the receiving end.

In case you decide to stick with ascii, do you need N decimal digits, or will e.g. xx.y be sufficient?

I am looking at using MessagePack for that. I did a test and it's reducing the data a little, but not quite as much as I would have thought.

SteveMann:
Consider Packed Decimal. Each character (at least in Arduino) is 8-bits, but the numbers 0-9 can be represented in 4-bits, so pack each two digits into one char. This should cut your data size almost in half.

123.99 = 6 chars * 8 bits = 48 bits
000100100011111110011001 = 24 bits.

You could send the whole thing as a single stream of bytes. In my example I used 0xf (1111) for the decimal point; the comma delimiter would need a different unused code, such as 0xe (1110).

On the receive end, split the packed decimal back to two chars and then reassemble the original data.
0001 0010 = 12
0011 1111 = 3.
1001 1001 = 99

Does this help?

I have been looking into DPD or using Radix-50 to pack the data down. However, in the time I have had, I have not found a library for either. I have also been looking at Huffman coding, as that looked like it may squish it down a bit. I am kind of surprised there are no libraries for this.
I also need to be sure that whatever I compress it with can be uncompressed on the web side; however, most things seem to work in JavaScript these days.

PaulRB:
Did you mean 170 bytes? If it is truly 170 bits, that's going to be very hard to beat. The raw data is 12 bits X 40 = 480 bits or 60 bytes. If you meant 170 bytes, that's still only 4 bytes per sensor, which is pretty good I think.
How often is the data sent? EEPROM has a life of at least 100,000 write cycles. If sending once an hour, that's over 10 years.

Oops, yes, sorry: 170 bytes.
The sensors record the data every 5 minutes and store it to a CF device. Every half hour to an hour or so, I was hoping to also send one data set to the remote database using the sat link. The unit is not in an easy place to get to and the next team will not be there for a year.

If the CF fills up a warning will be sent; at that point they have the option of downloading the data or just clearing the previous data block.

wildbill:
Also, can you make any guarantees about the range of temperatures you'll see? I assume that you don't actually get readings across the entire range the sensor can manage. Might be possible to reduce the data that way.

I like the 60 byte solution - let the far end process your twelve bits. I do think though that the final (or indeed two final) bits are just noise. Dropping them gives you a neat 50 bytes for a single credit.

At this stage no sensors have ever been put in the area, so no one has any idea what the variation is going to be. Given that the air temperature drops to as low as -55 in the area and the ground temperature is averaging +60 degrees, anything could happen... (it's an active volcano). We may even find that the sensors fail due to the extreme heat; however, budget-wise they were the only option for this project.

Each sensor is 200 mm apart.

The initial goal was to use LoRa; however, the base is 31 km from the site and there is no clear line of sight. Also, the wind at the location is INSANE, with speeds of 200 km/h being recorded. Apparently they did have a repeater up there that was bolted into the ground and it just vanished.

hammy:
Why not only send data when a value changes? That would save a lot of data; the same idea is used in sending highly compressed video.

40 sensors is a lot ....

Funny you should say that: last night I thought about just sending the variation (+-0.xxxx) and seeing how much that reduced the data. So if the latest temperature recorded was 0.44 degrees above the previous one, you are only sending 0.44 rather than 12.44, etc. Using a simple multiplication factor could then work perfectly.

Don't forget that +-0.5 degree accuracy. You've got a lot more precision than that, but it is probably counterproductive to use it. I suggest that you test your sensors and see how well they agree with each other at various different temps you're expecting to measure. I suspect you'll find that they don't match well at all.

Since I assume that you'll be comparing sensor data across your array, use of high precision measurements may show you things that aren't really there.

Be aware too that the sensors heat themselves when in use - I had variation of two degrees above alleged ambient when I was using these sensors continuously to monitor my house temperature.

Consider a lower-power CPU. The ESP32 has a lower-power CPU, and the STM32 as well.
With those, the memory is still powered during sleep.
Not sure how much smaller the data points would be if they were just the difference.
Just tossing another thing on the table.

Another thing to consider is the effect of a limited range. If the temperature in the bore holes really does turn out to fall between 90 and 100, you could decide that measuring in the range 88 to 103 is sufficient. Eight bits would get you the ability to express temperatures across that range in a 16th of a degree.

But what if I get a freak reading? Have an 'A' record that holds 8 bits for every sensor for the usual case but also have a 'B' record that holds 12. Maybe something in between the two as well.

If you accept that you only have 0.5 degree accuracy, 5 bits would do.

I suspect though that you will want to see temperature change at each borehole in greater precision than that, even if there's no guarantee that that number is directly comparable to the sensor in the next hole. You might be able to calibrate your way out of that issue at least to some extent.
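
A rough sketch of the 'A'/'B' record idea, under these assumptions: readings are DS18B20 raw values in 1/16 degree C counts, the 8-bit window starts at 88 C, a one-byte tag tells the receiver which layout follows, and the sensor count is even so the 12-bit values pair up into 3 bytes. All the names are made up for illustration.

```python
BASE_RAW = 88 * 16  # assumed window start: 88 degrees C in 1/16 degree units

def pack12(values):
    """Pack pairs of signed 12-bit values into 3 bytes (even count assumed)."""
    out = bytearray()
    for a, b in zip(values[0::2], values[1::2]):
        a &= 0xFFF
        b &= 0xFFF
        out += bytes([a >> 4, ((a & 0xF) << 4) | (b >> 8), b & 0xFF])
    return bytes(out)

def build_record(raw):
    """Use the short 'A' layout when every reading fits the 8-bit window."""
    if all(0 <= r - BASE_RAW <= 255 for r in raw):
        return b"A" + bytes(r - BASE_RAW for r in raw)
    return b"B" + pack12(raw)

def parse_record(rec):
    """Return the raw readings from either record type."""
    kind, body = rec[:1], rec[1:]
    if kind == b"A":
        return [BASE_RAW + b for b in body]
    out = []
    for i in range(0, len(body), 3):
        a = (body[i] << 4) | (body[i + 1] >> 4)
        b = ((body[i + 1] & 0xF) << 8) | body[i + 2]
        out += [v - 4096 if v >= 2048 else v for v in (a, b)]
    return out

usual = [1500] * 40              # about 93.75 C everywhere
freak = [-208] + [1500] * 39     # one sensor reading -13 C
print(len(build_record(usual)), len(build_record(freak)))  # 41 61
```

With these assumptions the usual case fits in one credit and a freak reading costs two, without losing the out-of-window value.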

wildbill:
Don't forget that +-0.5 degree accuracy. You've got a lot more precision than that, but it is probably counterproductive to use it.

No, on the contrary the DS18B20 sensors are much more accurate than that at/near room temperature,
I've measured a standard deviation between devices around 0.1C, and each sensor's self-consistency
may be considerably better than that. The 0.5 figure is across the full temperature range. (*)

It's always best to send full raw data from sensors; any decision about the accuracy and precision used to present
data is made when presenting, not when collecting, sending, or processing.

Lossless compression of a sequence of related temperature datapoints is usually highly efficient, as you
only need to consider the differences. If there are occasional lost packets you'll probably finesse that by
sending absolute values every so often so recovery from that point is possible.

(*) Note that temperature sensor accuracy is only valid if the sensors are properly thermally bonded to the
object they are measuring. Bare epoxy-encapsulated DS18B20s will pick up radiant heat from the
environment, very noticeable on the bench if you place a hand near. Packaging the sensors in metal tube
is often done to greatly reduce this sensitivity.
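
To illustrate the difference-based scheme with periodic absolute values, here is a minimal lossless sketch. Everything specific in it is an assumption: the 'K'/'D' tags, one absolute frame in every 8, one signed byte per difference (so up to about +-8 C of change in 1/16 degree counts), and a fall back to an absolute frame whenever a difference does not fit.

```python
import struct

KEY_EVERY = 8  # assumed: one absolute frame out of every 8

def encode_frame(seq, raw, prev):
    """Return a 'K' (absolute int16) or 'D' (int8 differences) frame."""
    if prev is not None and seq % KEY_EVERY != 0:
        deltas = [r - p for r, p in zip(raw, prev)]
        if all(-128 <= d <= 127 for d in deltas):
            return b"D" + struct.pack(f"<{len(deltas)}b", *deltas)
    return b"K" + struct.pack(f"<{len(raw)}h", *raw)

def decode_frame(frame, prev):
    """Rebuild raw readings; prev is the previously decoded frame, if any."""
    kind, body = frame[:1], frame[1:]
    if kind == b"K":
        return list(struct.unpack(f"<{len(body) // 2}h", body))
    if prev is None:
        raise ValueError("difference frame received before any absolute frame")
    deltas = struct.unpack(f"<{len(body)}b", body)
    return [p + d for p, d in zip(prev, deltas)]

first = [400] * 40                  # made-up raw readings, 25.0 C
second = [r + 3 for r in first]     # every sensor rises by 3/16 C
k = encode_frame(0, first, None)
d = encode_frame(1, second, first)
print(len(k), len(d))  # 81 41
```

The sender has to keep the previous readings across deep sleep, and a real link would also want a sequence number so the receiver can tell when a difference frame was lost and wait for the next absolute one.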