Go Down

Topic: Optimisation Flash and RAM usage (Read 742 times) previous topic - next topic

aarg

#15
Jul 30, 2020, 02:33 am Last Edit: Jul 30, 2020, 02:39 am by aarg
Is it essential that your telemetry be human readable? If not, you can format your transmissions using a much simpler protocol like CSV, you will release some memory and also your transmissions will be faster. Then, at the receive end, you can run the data through a conversion program that formats everything nicely for reading.

Some of your code can be further optimized, this is from TelemFunctions.h:
Code: [Select]
if (dbm_telemetry == 0) dbm_telemetry = 0;
  else if (dbm_telemetry == 1) dbm_telemetry = 3;
  else if (dbm_telemetry == 2) dbm_telemetry = 7;
  else if (dbm_telemetry == 3) dbm_telemetry = 10;
  else if (dbm_telemetry == 4) dbm_telemetry = 13;
  else if (dbm_telemetry == 5) dbm_telemetry = 17;
  else if (dbm_telemetry == 6) dbm_telemetry = 20;
  else if (dbm_telemetry == 7) dbm_telemetry = 23;
  else if (dbm_telemetry == 8) dbm_telemetry = 27;
  else if (dbm_telemetry == 9) dbm_telemetry = 30;
  else if (dbm_telemetry == 10) dbm_telemetry = 33;
  else if (dbm_telemetry == 11) dbm_telemetry = 37;
  else if (dbm_telemetry == 12) dbm_telemetry = 40;
  else if (dbm_telemetry == 13) dbm_telemetry = 43;
  else if (dbm_telemetry == 14) dbm_telemetry = 47;
  else if (dbm_telemetry == 15) dbm_telemetry = 50;
  else if (dbm_telemetry == 16) dbm_telemetry = 53;
  else if (dbm_telemetry == 17) dbm_telemetry = 57;
  else if (dbm_telemetry == 18) dbm_telemetry = 60;


That just screams, "please put me in a look up table!". If it's derived from a formula, you might just calculate the value. It looks very close to
Code: [Select]
dbm_telemetry *= 3;
  ... with a transistor and a large sum of money to spend ...
Please don't PM me with technical questions. Post them in the forum.

westfw


KevWal

#17
Jul 30, 2020, 10:51 am Last Edit: Jul 30, 2020, 10:53 am by KevWal
Hi Aarg
Is it essential that your telemetry be human readable?

Some of your code can be further optimized, this is from TelemFunctions.h:
Code: [Select]
if (dbm_telemetry == 0) dbm_telemetry = 0;
  else if (dbm_telemetry == 1) dbm_telemetry = 3;
  else if (dbm_telemetry == 2) dbm_telemetry = 7;


That just screams, "please put me in a look up table!".
The format for the WSPR messages is very fixed unfortunately, and is already a fine balance of meeting the protocol specifications whilst encoding as much information as possible.

The dbm_telemetry for example can only use the exact list on the right hand side of the code above, but now implemented as a lookup saves me 100 bytes - thank you for the idea - it all counts :)

Code: [Select]
  static const PROGMEM uint8_t dbm_lookup[19] ={3,7,10,13,17,20,23,27,30,33,37,40,43,47,50,53,57,60};
  dbm_telemetry = pgm_read_byte(dbm_lookup + dbm_telemetry);

Thanks very much
Kevin

KevWal

Well, I have replaced the SI5351 library (thanks westfw for the links) and that saved me 10k of code and 100 bytes flash.  I would still like to find a bit more flash saving, but looking a lot more healthy now.

Thanks all.

Cheers
Kev

westfw

Post your new code (and a pointer to the library) and we'll continue to analyze, if you want.  It's an interesting problem!
(A .zip file of the sketch directory would work better than individual files...)

There are some constant tables in JTEncode that could go into PROGMEM...

westfw

It looks like tinyGPS++ stores most of the values it gathers as integers, and then converts them to float values depending on which method you call (so "native" form for speed is apparently knots, and when you call gps.speed.kmps() it takes the integer value, converts from knots to kmps using float arithmetic, and returns the float, which you then immediately convert back to an integer for itoa() to put it in your data.
I think you can access the native data using "gps.speed.value()", and then do any conversions you need with integer math...
(This applies even to latitude/longitude, which is stored in degrees, minutes, and billionths of minutes...)


Saving code space by eliminating floating point calculations pretty much requires that you eliminate ALL of the floating point code.  In addition to the (unnecessary?) conversions on some of the gps data, you have:
Code: [Select]
  long f = float(e / 17576L);
(which is particularly pointless), and:
Code: [Select]
  temp = (ADCW - 322.2 ) / 1.43; // ADCW 16 bit reg to degrees Celsius  //TODO Calibrate these values
in TelemFunctions.h

aarg

#21
Jul 31, 2020, 04:43 am Last Edit: Jul 31, 2020, 04:56 am by aarg
Float with an FPU, great..
Float with at least 64 bit precision, good.
Float with 32 bits, at least you can do it.

So, what you're actually getting in a float is automatic scaling. That happens to prevent overflow, which is a good idea. But every time it happens, some truncation happens. With float, it's hidden and you have to deduce it. If you do it with scaled integers, you are doing almost the same thing except you can see and control the order of truncation explicitly. Because the float code can't possibly know what you are going to do, it has to be generic and that adds overhead. The main reason a lot of people don't sweat it, is because the machine they're on has 64 bits and an FPU.

Thought... I guess the transcendental functions like sin(), log() etc. etc. don't come easy in integer form. Although doubtless it is possible. Maybe there are some libraries somewhere? Or it could be written as needed. Some engineering formulas get pretty thick with those, though. But I found out on some archaic 8 bit machines that generally using integer math vs. float was a 10 to 1 improvement in speed. That was in BASIC! As mentioned above, it also uses less memory.

A lot of Arduinos can do 64 bit float passably well in spite of having no FPU, having 32 bits CPU and clock speeds around 50-200 MHz.
  ... with a transistor and a large sum of money to spend ...
Please don't PM me with technical questions. Post them in the forum.

KevWal

Hi Westfw

How are you telling that?  I don't even see atoi() in the binary.  (maybe you mean atol()?)
I think it can be improved by having your own special case code that isn't as careful...
I found the reference again here:  https://playground.arduino.cc/Code/PrintingNumbers/

"The stdlib itoa() library routine adds around 600 bytes,"

Thanks
Kev

westfw

Quote
I found the reference again here:  https://playground.arduino.cc/Code/PrintingNumbers/
"The stdlib itoa() library routine adds around 600 bytes,"
Well, that appears to be wildly inaccurate!  I get about 120 bytes (on either AVR or SAMD21, which is a neat trick!  (apparently it's only 120bytes on SAMD because the division function is already present in an empty sketch...))

Code: [Select]
char buffer[20];
volatile int x;
void setup() {
  buffer[0] = x;
  itoa(x, buffer, 10);
}
void loop() {}

KevWal

Saving code space by eliminating floating point calculations pretty much requires that you eliminate ALL of the floating point code.  In addition to the (unnecessary?) conversions on some of the gps data, you have:

Code: [Select]
 temp = (ADCW - 322.2 ) / 1.43;
Could you help me understand this please?  ADCW is uint16, temp is int8, are 322.2 and 1.43 stored as floats and hence floating point arithmetic?  

322.2 can be 322, as I only need 2 significant figures, but 1.43 I am a bit stuck as to how to do differently?

The total relevant code:
Code: [Select]

  temp = (ADCW - 322.2 ) / 1.43; // ADCW 16 bit reg to degrees Celsius

  if (temp < -50) temp = -50;  // protect the encoding code
  if (temp > 39) temp = 39; // Going 'via' Celsius seems sensible for debugging and calibration

  temp = temp + 50; // scale -50 to 39, to 0 to 89, representing -50C to 39C

  // Now encode into loc_telemetry and dbm_telemetry
  long x = (temp * 40L ) + solarVolt;
etc


I don't want to touch the encoding code as I dont understand that...

Thanks very much
Kevin

david_2018

The math can be done using only integers.

Code: [Select]

  temp = (ADCW * 100l - 32220l) / 143l;


I would be careful declaring temp as int8_t, that assumes ADCW will be within a certain range of values, something that you are not checking.  A failed temperature sensor may give a reading outside the expected range.

KevWal

There are some constant tables in JTEncode that could go into PROGMEM...
Hi Westfw

I have put those into Progmem and changed to statics, saved me 80 bytes of Sketch and 160 bytes of flash, Thank you :)

A lot smaller win, but is it easy to anything with these, I couldnt figure out how:

Code: [Select]
void JTEncode::ft8_merge_sync_vector(uint8_t* symbols, uint8_t* output)
{
 const uint8_t costas7x7[7] = {3, 1, 4, 0, 6, 5, 2};
 const uint8_t graymap[8] = {0, 1, 3, 2, 5, 6, 4, 7};
 uint8_t i, j, k, idx;

 // Insert Costas sync arrays
 memcpy(output, costas7x7, 7);
 memcpy(output + 36, costas7x7, 7);
 memcpy(output + FT8_SYMBOL_COUNT - 7, costas7x7, 7);


Thanks
Kev

KevWal

The math can be done using only integers.
Code: [Select]
 temp = (ADCW * 100l - 32220l) / 143l;
I would be careful declaring temp as int8_t, that assumes ADCW will be within a certain range of values, something that you are not checking.  A failed temperature sensor may give a reading outside the expected range.
Thanks David, good to understand, but in my case that actually increases code size by 25 bytes.  Presumably if I could get 100% of floats out that would be a decrease, but as they are buried in the libraries that isn't going to happen.

I am still not clear on what the 'l's do and how this is treated internally though.  ADCW is unsigned 16 bits I think, 0 to 65,024, lets us say it is set by the ADC at 1,000 * 100 = 100,000 - overflowing a 16bit int?

Thanks very much
Kev

david_2018

The upper or lower case L tells the compiler to do the math using a long int, 32 bits on the atmega328, so that it will not overflow.

david_2018

You will need to test this to verify it works properly, although I see no change in memory usage so your code may never use the function.

Code: [Select]

void JTEncode::ft8_merge_sync_vector(uint8_t* symbols, uint8_t* output)
{
 static const uint8_t PROGMEM costas7x7[7] = {3, 1, 4, 0, 6, 5, 2};
 static const uint8_t PROGMEM graymap[8] = {0, 1, 3, 2, 5, 6, 4, 7};
 uint8_t i, j, k, idx;

 // Insert Costas sync arrays
 memcpy_P(output, costas7x7, 7);
 memcpy_P(output + 36, costas7x7, 7);
 memcpy_P(output + FT8_SYMBOL_COUNT - 7, costas7x7, 7);

 k = 6;
 for(j = 0; j < 58; ++j) // 58 data symbols
 {
 i = 3 * j;
 ++k;
 if(j == 29)
 {
 k += 7;
 }
 idx = symbols[i] * 4 + symbols[i + 1] * 2 + symbols[i + 2];
 output[k] = pgm_read_byte(&graymap[idx]);
 }
}



Go Up