Reducing Prog Flash size

Just sharing my own experiences while reducing the flash size of our Beat707 software.

Here's a fun chart I created while playing around with some changes. ;-)

  Start:                  Flash = 27936 - Memory = 334
  Removed OK (Print)      Flash = 27930 - Memory = 330
  Removed OK (Write)      Flash = 27932 - Memory = 334
  Removed dmNote XT       Flash = 27516 - Memory = 333
  Removed more LCD Print  Flash = 27486 - Memory = 343
  Removed more LCD Print  Flash = 26926 - Memory = 361
  Fixed Bad LCD code      Flash = 26906 - Memory = 361
  Changed INIT Values     Flash = 26934 - Memory = 362
  Changed Note Names      Flash = 26818 - Memory = 362
  Changed lcd.print       Flash = 26168 - Memory = 386

The biggest change was removing every Serial.print and replacing it with Serial.write. Keep in mind that to send a zero byte with write() you need a cast:

Serial.write((byte)0);
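A likely reason for the cast (my reading, not something stated in the post): a Print-style class offers write() overloads for a single byte and for a const char*, and a literal 0 converts equally well to either, so the compiler rejects the call as ambiguous. The sketch below models this on a PC. FakePort and sendZeroByte are hypothetical names for illustration, not the real Arduino Serial object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a Print-like class; not the real Arduino Serial.
struct FakePort {
  std::size_t bytesSent = 0;

  // Sends one raw byte.
  std::size_t write(std::uint8_t) { bytesSent += 1; return 1; }

  // Sends a NUL-terminated string.
  std::size_t write(const char* s) {
    assert(s != nullptr);
    std::size_t n = std::strlen(s);
    bytesSent += n;
    return n;
  }
};

// port.write(0) would not compile here: the literal 0 converts equally well
// to uint8_t and to a null const char*, so the call is ambiguous.
inline std::size_t sendZeroByte(FakePort& port) {
  return port.write(static_cast<std::uint8_t>(0));  // the cast selects the byte overload
}
```

Calling sendZeroByte on a FakePort sends exactly one byte, while write("ab") goes to the string overload and sends two.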

Another big change was the way we store strings in the Prog Flash and read them back, along with getting rid of lcd.print in favor of lcd.write wrapped in our own functions. I will post the functions below.

void lcdPrint(uint8_t pos)
{
  uint8_t c;
  char* p = (char*)pgm_read_word(&(stringlist[pos]));
  while (c = pgm_read_byte(p)) { lcd.write(c); p++; }
}

void lcdPrintString(char* string)
{
  uint8_t p = 0;
  while (string[p] != 0) { lcd.write(string[p]); p++; }
}

void lcdPrintNumber(uint8_t number)
{
  lcd.write('0'+(number/10));
  lcd.write('0'+(number-((number/10)*10)));
}

void lcdPrintNumber3Dgts(uint8_t number)
{
  if (number >= 200) { lcd.write('2'); number -= 200; }
    else if (number >= 100) { lcd.write('1'); number -= 100; }
    else lcd.write('0');
  lcdPrintNumber(number);
}
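These helpers avoid general-purpose number formatting, at the price of fixed ranges: the two-digit version only makes sense for 0 to 99, and the three-digit one covers the whole uint8_t range. A minimal way to check that on a PC might look like the following. FakeLcd, printNumber, printNumber3Digits and format3 are hypothetical names that mirror the functions above and stand in for the real display.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the LCD: collects written characters in a string.
struct FakeLcd {
  std::string text;
  void write(char c) { text.push_back(c); }
};

// Two digits, same arithmetic as lcdPrintNumber above. Only valid for 0..99.
void printNumber(FakeLcd& lcd, std::uint8_t number) {
  assert(number < 100);
  lcd.write(static_cast<char>('0' + number / 10));
  lcd.write(static_cast<char>('0' + number % 10));
}

// Three digits, same branching as lcdPrintNumber3Dgts above. Covers 0..255.
void printNumber3Digits(FakeLcd& lcd, std::uint8_t number) {
  if (number >= 200) { lcd.write('2'); number -= 200; }
  else if (number >= 100) { lcd.write('1'); number -= 100; }
  else lcd.write('0');
  printNumber(lcd, number);
}

// Convenience wrapper for checking results on a PC.
std::string format3(std::uint8_t number) {
  FakeLcd lcd;
  printNumber3Digits(lcd, number);
  return lcd.text;
}
```

For example, format3(7) yields "007" and format3(255) yields "255". Passing 100 or more straight to the two-digit helper would produce a non-digit character, which is why the sketch asserts on it.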

I will also post the code for the String List which the above functions use...

Hope this helps someone else too. 8)

Best Regards, WilliamK

// GM Drum Set //
  prog_char string_1[] PROGMEM   = "AcBass";  // 35
  prog_char string_2[] PROGMEM   = "Bass";    // 36
  prog_char string_3[] PROGMEM   = "Stick";   // 37
  prog_char string_4[] PROGMEM   = "Snare";   // 38
 ......
prog_char myStrings_001[] PROGMEM  = "ReceivingSysEx";
#define RECEIVING_SYSEX 40
prog_char myStrings_002[] PROGMEM  = "Shift to Confirm"; 
#define SHIFT_TO_CONFIRM 41

prog_char myStrings_017[] PROGMEM  = "Track Selection";
#define TRACK_SELECTION 56
prog_char myStrings_018[] PROGMEM  = "Mute Tracks";
#define MUTE_TRACKS 57
prog_char myStrings_019[] PROGMEM  = "Solo Tracks";
#define SOLO_TRACKS 58

PROGMEM const char *stringlist[] = { empty_Str, 
  string_1, string_1, string_1, string_1, string_2, string_3, string_4, string_5, string_6, string_7, string_8, string_9, string_10, string_11, string_12, string_13, string_14, string_15, 
  string_16, string_17, string_18, string_19, string_20, string_21, string_22, string_23, string_24, string_25, string_26, string_27, string_28, string_29, string_30, string_31, string_32, string_33, string_34, string_35, string_36, 
  myStrings_001, myStrings_002, myStrings_003, myStrings_004, myStrings_005, myStrings_006, myStrings_007, myStrings_008, myStrings_009, myStrings_010, myStrings_011,
  myStrings_012, myStrings_013, myStrings_014, myStrings_015, myStrings_016, myStrings_017, myStrings_018, myStrings_019, myStrings_020, myStrings_021, myStrings_022,
..........};

And I use the #define numbers as the index in most cases. Here's an example:

    if (shiftMode == 0) lcdPrint(TRACK_SELECTION);
      else if (shiftMode == 1) lcdPrint(MUTE_TRACKS);
      else if (shiftMode == 2) lcdPrint(SOLO_TRACKS);

I removed portions of the code so it's easier to understand.

Wk

Now, of course, I'm open to ideas on how to reduce the flash usage even more. ;-)

Wk

You need to check whether the strings need to be as long as they are.

e.g. "Track Selection" => "select track"; although it's only 3 bytes, it adds up.

It's a balance between short but cryptic and readable but long.

Sorry, I do not follow, care to elaborate? :blush:

Wk

Reduce the size of your text strings, every char removed is a byte spared.

PROGMEM const char *stringlist[] = { empty_Str, string_1, string_1, string_1, string_1, string_2, string_3, string_4, string_5, string_6, string_7, string_8, string_9, string_10, string_11, string_12, string_13, string_14, string_15,

Well, it sounds like the arduino core is pretty small compared to the code you've actually written, so it's hard to say without seeing what you've actually done overall, but if you have fewer than 256 instruments, you might be able to save significant space by adding an extra level of indirection:

PROGMEM const char *inst_names[] = {str1_name, str2_name, str3_name, ...};
#define string_1 ((byte)0)
#define string_2 ((byte)1)
   :
#define empty_Str ((byte)255)

PROGMEM const char stringlist[] = { empty_Str, 
  string_1, string_1, string_1, string_1, string_2, string_3, string_4, string_5, string_6, string_7, string_8, string_9, string_10, string_11, string_12, string_13, string_14, string_15,
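Whether the extra level of indirection pays off depends on how many table slots repeat the same string. The sketch below does that arithmetic under the assumption that a flash pointer takes 2 bytes and an index 1 byte on an 8-bit AVR. The function names and the slot counts in the usage note are hypothetical, not taken from the Beat707 source, and the code needed for the second lookup is not counted.

```cpp
#include <cassert>
#include <cstddef>

// Assumed sizes for an 8-bit AVR: a flash pointer is 2 bytes, an index 1 byte.
constexpr std::size_t kPointerBytes = 2;
constexpr std::size_t kIndexBytes = 1;

// Flash used by one table of pointers with `entries` slots.
constexpr std::size_t directTableBytes(std::size_t entries) {
  return entries * kPointerBytes;
}

// Flash used by a byte-index table with `entries` slots plus a pointer
// table holding each of the `uniqueStrings` distinct strings once.
constexpr std::size_t indirectTableBytes(std::size_t entries,
                                         std::size_t uniqueStrings) {
  return entries * kIndexBytes + uniqueStrings * kPointerBytes;
}

// True when the extra level of indirection makes the tables smaller.
bool indirectionSaves(std::size_t entries, std::size_t uniqueStrings) {
  assert(uniqueStrings <= entries);
  assert(uniqueStrings <= 256);  // indexes must fit in one byte
  return indirectTableBytes(entries, uniqueStrings) < directTableBytes(entries);
}
```

With 128 slots and 40 distinct strings the tables shrink from 256 to 208 bytes. With 80 slots and 76 distinct strings they grow from 160 to 232 bytes, so the trick only helps when more than half of the slots are repeats.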

@senso: 100% right.

“Track Selection” ==> “Track Sel.”;
“Mute Tracks” => “Mute”
“Solo Tracks” => “Solo”

Similarly, you can re-engineer other data structures (midi beats?) and the algorithms, but sometimes more compact data means more complex algorithms.

E.g. for a ‘piano’ one can create a frequency table of 88 floats to hold all frequencies ==> 88 x 4 bytes = 352 bytes.
An approximation is 88 ints, meaning 88 x 2 = 176 bytes. That saves 176 bytes!

Another option is to store only the 12 notes of the highest octave (8), 12 x 4 = 48 bytes, and calculate the notes of every other octave from this master.
As an approximation, 12 ints => 24 bytes. (You save 352 - 24 = 328 bytes in the data!)

The code to calculate the frequency becomes:
freq = freqMaster[ tone ] / (1 << (8-octave));

This code is a bit more complex than just freq = freqTable[tone]; but it will add fewer bytes than the 328 bytes saved.
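A minimal sketch of the master-octave idea, runnable on a PC. The table values are the equal-temperament frequencies of C8 to B8 rounded to whole Hz (assuming A4 = 440 Hz), and noteFrequency is a hypothetical helper name. The numbering tone 0 = C to 11 = B is an assumption as well.

```cpp
#include <cassert>
#include <cstdint>

// Frequencies of C8..B8 in Hz, rounded to integers (equal temperament, A4 = 440 Hz).
const std::uint16_t freqMaster[12] = {
  4186, 4435, 4699, 4978, 5274, 5588,
  5920, 6272, 6645, 7040, 7459, 7902
};

// tone: 0 = C ... 11 = B; octave: 0..8. Each octave down halves the frequency.
std::uint16_t noteFrequency(std::uint8_t tone, std::uint8_t octave) {
  assert(tone < 12 && octave <= 8);
  return freqMaster[tone] >> (8 - octave);  // same as dividing by (1 << (8 - octave))
}
```

noteFrequency(9, 4) gives 440 for A4. The shift truncates, so the relative error grows in the lowest octaves: A0 comes out as 27 rather than 27.5.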

Have you published the code somewhere?

Heh. You've succeeded in proving that "C++ produces bloated code", since the changes you've implemented essentially replace the Print class with simpler C functions. You can probably save a few more bytes by implementing C-like functions that replace Serial and LCD entirely, though I'm not sure that that would be worth it.

You might consider putting a "#define PROGMEM " at the beginning of your code, causing all the flash data you've defined to end up in ram instead (as well!) Of course there won't be enough ram, but a quick look at avr-size will give you an indication of how much space is being used by data structures vs code...

Beat707 is "just" a midi sequencer/user interface, right? So actual frequencies and floating point math are all absent?

Midi has 128 notes (0 to 127), right? It'd be nice to figure out how to use that extra bit. (But you probably don't have any note tables in flash, anyway?)

Good ideas, thanks. And yes, its all midi based, 0 to 127, so mostly char and uint8_t. ;-)

I did get rid of more LCD stuff and ended up doing a clone of the LiquidCrystal and removing a bunch of stuff this project wouldn't use. (since the hardware is set in stone anyway) Right now Flash takes 24K, a pretty good drop. 8)

Flash = 24762 - Memory = 404

I also use the 16-bit timer, which takes less code than the previous 8-bit timer code I was using, taken from the Tone.cpp file. :open_mouth:

There's just one float number and math in the whole thing.

void timerStart()
{
  TCCR1A = TCCR1B = 0;
  bitWrite(TCCR1B, CS11, 1);
  bitWrite(TCCR1B, WGM12, 1);
  timerSetFrequency();
  bitWrite(TIMSK1, OCIE1A, 1);
}

void timerSetFrequency()
{
  // Calculates the Frequency for the Timer, used by the PPQ clock (Pulses Per Quarter Note) //
  // This uses the 16-bit Timer1, unused by the Arduino, unless you use the analogWrite or Tone functions //
  float frequency = ((float(midiClockBPM)*float(PPQ))/60.0f);
  OCR1A = F_CPU / frequency / 8 - 1;
}

void timerStop(void)
{
  bitWrite(TIMSK1, OCIE1A, 0);
  TCCR1A = TCCR1B = OCR1A = 0;
}

Wk

There’s just one float number and math in the whole thing.

Well, if you can get rid of that one float calculation, it looks like it would save about 1000 bytes…
(that assumes that you get rid of the floating point internal libraries, and that you already have enough fixed point math that you don’t end up adding more back in to replace it…)
You need a 16 bit integer eventually; it might be possible to do all 16bit math.
BPM is about 100-200 and PPQ is 96? (Is this a constant of the form 2^n * 3^m, right? Ah, the music theory returns…) BPM is the only variable where it’s desirable to have it vary “continuously.”
Hmm.
OCR1A = (( 16e6 * 60 / 8 ) / PPQ / BPM ) - 1;
No, I guess it won’t fit in 16bit math; the range of BPM is too broad. 32 bits would be plenty, though.
A quick check says the above is still about 1000 bytes smaller than the floating point code in a trivial sketch.
(I didn’t check whether it comes up with identical results…)

int midiClockBPM, PPQ;

void setup()
{
  midiClockBPM = Serial.read();
  PPQ = Serial.read();
}
#define USEINT
#ifdef USEINT
int timerSetFrequency()
{
  // Calculates the Frequency for the Timer, used by the PPQ clock (Pulses Per Quarter Note) //
  // This uses the 16-bit Timer1, unused by the Arduino, unless you use the analogWrite or Tone functions //
 return (F_CPU * 60 / 8) / PPQ / midiClockBPM - 1;
}
#else
int timerSetFrequency()
{
  // Calculates the Frequency for the Timer, used by the PPQ clock (Pulses Per Quarter Note) //
  // This uses the 16-bit Timer1, unused by the Arduino, unless you use the analogWrite or Tone functions //
  float frequency = ((float(midiClockBPM)*float(PPQ))/60.0f);
  return F_CPU / frequency / 8 - 1;
}
#endif

void loop()
{
  Serial.print(timerSetFrequency());
  while (1) ;
}

Thanks, good idea, removing all float references did reduce the whole thing by 1.2K :fearful: Just testing now if the math is still good. ;-)

Wk

Seems to be working, I just hope I didn't do anything stupid with this. :blush:

void timerSetFrequency()
{
  // Calculates the Frequency for the Timer, used by the PPQ clock (Pulses Per Quarter Note) //
  // This uses the 16-bit Timer1, unused by the Arduino, unless you use the analogWrite or Tone functions //
  #define frequency (((midiClockBPM)*(PPQ))/60)
  OCR1A = F_CPU / frequency / 8 - 1;
}

Seems to be working,

Write a small sketch that compares the float math with the int math; that's what computers are good at. Then you know whether it is OK (or not).

for (all values ) { float diff = timerSetFrequencyINT() - timerSetFrequencyFLOAT(); if (abs(diff) > THRESHOLD) print (params); }
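A PC-side sketch of that comparison, assuming a 16 MHz clock and PPQ = 96 (both guesses based on this thread). The function names are hypothetical. It puts the float formula next to the 32-bit integer formula and next to the macro-style version that divides BPM * PPQ by 60 first. Plain 32-bit integers are used throughout, which sidesteps any 16-bit overflow on the AVR.

```cpp
#include <cassert>
#include <cstdint>

const std::int32_t kCpuHz = 16000000;  // assumed F_CPU of a 16 MHz board

// Original float version.
std::int32_t ocrFloat(std::int32_t bpm, std::int32_t ppq) {
  assert(bpm > 0 && ppq > 0);
  float frequency = (float(bpm) * float(ppq)) / 60.0f;
  return static_cast<std::int32_t>(kCpuHz / frequency / 8 - 1);
}

// 32-bit integer version: divides only after multiplying by 60.
std::int32_t ocrInt32(std::int32_t bpm, std::int32_t ppq) {
  assert(bpm > 0 && ppq > 0);
  return (kCpuHz * 60 / 8) / ppq / bpm - 1;
}

// Macro-style version: bpm * ppq / 60 is truncated before the big division.
std::int32_t ocrTruncated(std::int32_t bpm, std::int32_t ppq) {
  assert(bpm * ppq >= 60);  // otherwise the truncated frequency is 0
  std::int32_t frequency = (bpm * ppq) / 60;
  return kCpuHz / frequency / 8 - 1;
}

// Largest absolute difference from the float result over a BPM range.
std::int32_t worstError(std::int32_t (*candidate)(std::int32_t, std::int32_t),
                        std::int32_t ppq, std::int32_t bpmMin, std::int32_t bpmMax) {
  std::int32_t worst = 0;
  for (std::int32_t bpm = bpmMin; bpm <= bpmMax; ++bpm) {
    std::int32_t diff = candidate(bpm, ppq) - ocrFloat(bpm, ppq);
    if (diff < 0) diff = -diff;
    if (diff > worst) worst = diff;
  }
  return worst;
}
```

At 120 BPM all three versions give 10415. At 121 BPM the 32-bit version still matches the float result (10329), but the macro-style version gives 10361, because 121 * 96 / 60 is truncated from 193.6 to 193 before the division. Whenever BPM * PPQ is not a multiple of 60, that early truncation makes the timer period too long.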

How many different frequencies are there anyway? Have you considered a lookup table?

A minor one: OCR1A = F_CPU / frequency / 8 - 1; ==> OCR1A = F_CPU / 8 / frequency - 1; so that F_CPU / 8 can be folded by the compiler, as both are constants.