Help with making code size smaller for ATtiny13A

Hello, I am making an awesome Christmas present: a custom PCB badge.
Everything works great when hooked up to an Uno, but its heart is actually an ATtiny13A, where I get ONLY around
1024 bytes of flash (my code takes 1944)
64 bytes for variables (I somehow managed to have a few things copied at run time, so I barely fit at 59 bytes)

The thing is, how can I make this even smaller?

void setup() {
  pinMode(3, INPUT_PULLUP);
}

void charlie(int a[])
{
  
  if(a[0] < 0){
    
    pinMode(5, INPUT);
  }else
  {
    pinMode(5, OUTPUT);
    digitalWrite(5, a[0]);    
  }   
  if(a[1] < 0){
    
    pinMode(6, INPUT);
  }else
  {
    pinMode(6, OUTPUT);
    digitalWrite(6, a[1]);    
  }
  if(a[2] < 0){
    
    pinMode(2, INPUT);
  }else
  {
    pinMode(2, OUTPUT);
    digitalWrite(2, a[2]);    
  }
  if(a[3] < 0){
    
    pinMode(4, INPUT);
  }else
  {
    pinMode(4, OUTPUT);
    digitalWrite(4, a[3]);    
  }
}

void copy(int* src, int* dst, int len) {
    memcpy(dst, src, sizeof(src[0])*len);
}
void resetP()
{
  pinMode(5, INPUT);
  pinMode(6, INPUT);
  pinMode(2, INPUT);
  pinMode(4, INPUT);  
}
void loop() {

  int koz1[4] = {-2,LOW,HIGH,-2};
  //int koz2[4] = {-2,HIGH,LOW,-2};
  
  int koz2[4];
  copy(koz1, koz2, 4);
  koz2[1] = HIGH;
  koz2[2] = LOW;
  
  int koz3[4] = {LOW,HIGH,-2,-2};
  //int koz4[4] = {HIGH,LOW,-2,-2};
  int koz4[4];
  copy(koz3, koz4, 4);
  koz4[0] = HIGH;
  koz4[1] = LOW;

  
  int koz5[4] = {LOW, -2,-2,HIGH};
  //int koz6[4] = {HIGH, -2,-2,LOW};
  int koz6[4];
  copy(koz5, koz6, 4);
  koz6[0] = HIGH;
  koz6[3] = LOW;
  
  int cap[4] = {-2, HIGH,-2,LOW};
  
  
  int oko1[4] = {LOW, -2,HIGH,-2};
  //int oko2[4] = {HIGH, -2,LOW,-2};
  
  int oko2[4];
  copy(oko1, oko2, 4);
  oko2[0] = HIGH;
  oko2[2] = LOW;
  
  
  // put your main code here, to run repeatedly:
   
  static long oldTime = 0;  
  long Ctime = millis();  
  long diffT = Ctime - oldTime;
  oldTime = Ctime;


  static bool holdButt = false;
  static long holdTime = 0;
  static long shortPressATime = 0;
  //static long longPressATime = 0;
  static bool kozichOn = true;

  //charlie(oko1);
  //charlie(oko2);
  
  
  if(digitalRead(3) == LOW)
  {
    holdButt = true;
    holdTime = holdTime + diffT;
  }
  else if (holdButt && digitalRead(3) == HIGH){
    holdButt = false;
    if(holdTime > 1000){
      //longPressATime = 5000;
      kozichOn = !kozichOn;
    }
    else
    {      
      shortPressATime = 5000;
    }
    holdTime = 0;
  }

  if(shortPressATime > 0)
  {
    shortPressATime = shortPressATime - diffT;
    /*charlie(koz1);
    charlie(koz2);
    charlie(koz3);
    charlie(koz4);
    charlie(koz5);
    charlie(koz6);
    charlie(cap);
    */
    if(((round(shortPressATime/100) / 2) & 1) == 0){
      charlie(oko1);
      charlie(oko2);
      resetP();
    }
    else{
      
    }
    
  }
  
  /*
  else if (longPressATime > 0)
  { 
    longPressATime = longPressATime - diffT;   
    charlie(oko1);
    delay(5000/longPressATime);
    
    charlie(oko2);
    delay(5000/longPressATime);
    
    longPressATime = longPressATime - 5000/longPressATime;
    longPressATime = longPressATime - 5000/longPressATime;
  }*/
  else{
    if(kozichOn){
      charlie(koz1);
      charlie(koz2);
      charlie(koz3);
      charlie(koz4);
      charlie(koz5);
      charlie(koz6);
      charlie(cap);
    }
    charlie(oko1);
    charlie(oko2);
  }
  resetP();
}

EDIT: Attached the file itself. In the meantime I saved about 10 bytes by doing the reset directly instead of through a function.

firecat.ino (3.23 KB)

That does not look like the entire sketch. Attach all of it.
Cutting the code size down to almost half is probably not possible.

The code in charlie() looks like a for() loop candidate. Do the variables need to be 2-byte ints or would 1-byte int8_t work as well?
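A rough sketch of what the loop version could look like, assuming int8_t patterns where any negative value means "leave the pin floating". The NC constant, the CHARLIE_PINS table and the host-side stubs are made up for illustration; the pin order is copied from the original charlie().

```cpp
#include <stdint.h>

#ifndef ARDUINO
// Host-side stand-ins so the logic can be compiled off-target.
// On a real board these names come from the Arduino core.
enum { INPUT = 0, OUTPUT = 1, LOW = 0, HIGH = 1 };
static uint8_t pinModeLog[8];   // last mode set per pin
static uint8_t pinLevelLog[8];  // last level written per pin
static void pinMode(uint8_t pin, uint8_t mode) { pinModeLog[pin] = mode; }
static void digitalWrite(uint8_t pin, uint8_t level) { pinLevelLog[pin] = level; }
#endif

// Hypothetical marker for "not connected"; one byte instead of a 2-byte int.
const int8_t NC = -1;

// Same pin order as the original: a[0]->5, a[1]->6, a[2]->2, a[3]->4.
const uint8_t CHARLIE_PINS[4] = {5, 6, 2, 4};

void charlie(const int8_t a[4]) {
  for (uint8_t i = 0; i < 4; i++) {
    if (a[i] < 0) {
      pinMode(CHARLIE_PINS[i], INPUT);       // float this line
    } else {
      pinMode(CHARLIE_PINS[i], OUTPUT);      // drive this line
      digitalWrite(CHARLIE_PINS[i], a[i]);
    }
  }
}
```

A pattern would then be declared as, for example, const int8_t koz1[4] = {NC, LOW, HIGH, NC}; whether the loop actually compiles smaller than the unrolled version is something to check on the real toolchain.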

round() is a floating-point function. You have loaded a big chunk of library code for that one operation.

round(a/100) can be replaced with (a+50)/100 if a is an integer.

Scour your code of all floating-point operations. Even one will inflate your code.
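To make the integer replacement concrete, here is a minimal sketch with two hypothetical helper names. It assumes a is non-negative. It also shows that round(a/100) with an integer a never really rounded anything: the integer division truncates before round() sees the value.

```cpp
#include <stdint.h>

// Rounds a / 100 to the nearest integer using integer math only.
// Assumes a >= 0; for negative values the +50 would bias the wrong way.
int32_t roundedDiv100(int32_t a) {
  return (a + 50) / 100;
}

// What round(a / 100) produces when a is an integer: the division
// truncates first, so there is no fraction left to round.
int32_t truncatedDiv100(int32_t a) {
  return a / 100;
}
```

So either form gets rid of the floating-point code; which one to pick depends on whether real rounding is wanted.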

Avoid using any code that requires large libraries. e.g.

Largest offender is likely the following line. This requires floating point math.

if(((round(shortPressATime/100) / 2) & 1) == 0){

Turn this into fixed point math. You can use shift operations for power of 2 divides.

Next one, I would guess is memcpy. Try to copy the data yourself.

memcpy(dst, src, sizeof(src[0])*len);

Using long data types could be the next issue, but maybe the C compiler is smart enough for simple add and subtract.

dougp:
The code in charlie() looks like a for() loop candidate. Do the variables need to be 2-byte ints or would 1-byte int8_t work as well?

I just somehow need to differentiate 3 different states: LOW, HIGH and NOTCONNECTED (-2 in my case).

MorganS:
round() is a floating-point function. You have loaded a big chunk of library code for that one operation.

round(a/100) can be replaced with (a+50)/100 if a is an integer.

Scour your code of all floating-point operations. Even one will inflate your code.

Huge thanks, I fixed it like this and it saved about 600 bytes,
so I am at around 1324 bytes, still around 300 to go.

if(((int(shortPressATime/100) / 2) & 1) == 0){
      charlie(oko1);
      charlie(oko2);
      
    }

Now can you do the same thing to remove all 4-byte (long) integer division operations? Note that the modulo operator (%) hides a division, so you don't save space with that. Instead of dividing by 100, divide by 128, which will be converted to a right-shift by 7 bits. Or don't divide at all: just do the binary-and with 128 or 256 or whatever.

If you never divide a long and only divide short or byte then the long division code isn't loaded.

There are also some compiler options to prioritize code size over speed. I never had to mess with them so I can't tell you how to change them.
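A small sketch of the "divide by a power of two, or just mask a bit" idea, assuming the countdown is kept in a uint16_t (5000 fits easily). The function names are hypothetical. Note that the timing changes slightly: /100 then /2 flips the result every 200 ms, while the bit-8 test flips it every 256 ms.

```cpp
#include <stdint.h>

// Blink phase taken straight from bit 8 of the countdown; no division at all.
bool blinkOn(uint16_t remainingMs) {
  return (remainingMs & 0x0100) == 0;
}

// The same test written as divisions by powers of two, for comparison.
bool blinkOnDiv(uint16_t remainingMs) {
  return (((remainingMs / 128) / 2) & 1) == 0;
}
```

Both functions return the same value for every possible input, so the shorter mask form can replace the division form directly.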

Klaus_K:
Avoid using any code that requires large libraries. e.g.

Largest offender is likely the following line. This requires floating point math.

if(((round(shortPressATime/100) / 2) & 1) == 0){

Turn this into fixed point math. You can use shift operations for power of 2 divides.

Next one, I would guess is memcpy. Try to copy the data yourself.

memcpy(dst, src, sizeof(src[0])*len);

Using long data types could be the next issue, but maybe the C compiler is smart enough for simple add and subtract.

It looks like memcpy was already optimized; it is even smaller than the manual version:

for (int i = 0; i < len; i++) {
        *dst++ = *src++;
    }

By eliminating the copy() function and calling memcpy directly in all 4 places I got down to 1318 bytes. Yay.

void setup() {
  pinMode(3, INPUT_PULLUP);
}

void charlie(byte a[])
{
  

  
  if(a[0] == 0x0F){
    
    pinMode(5, INPUT);
  }else
  {
    pinMode(5, OUTPUT);
    digitalWrite(5, a[0]);    
  }   
  if(a[1] == 0x0F){
    
    pinMode(6, INPUT);
  }else
  {
    pinMode(6, OUTPUT);
    digitalWrite(6, a[1]);    
  }
  if(a[2] == 0x0F){
    
    pinMode(2, INPUT);
  }else
  {
    pinMode(2, OUTPUT);
    digitalWrite(2, a[2]);    
  }
  if(a[3] == 0x0F){
    
    pinMode(4, INPUT);
  }else
  {
    pinMode(4, OUTPUT);
    digitalWrite(4, a[3]);    
  }
}
/*
void copy(int* src, int* dst, int len) {
    memcpy(dst, src, sizeof(src[0])*len);
    
    
}
*/
/*
void resetP()
{
  pinMode(5, INPUT);
  pinMode(6, INPUT);
  pinMode(2, INPUT);
  pinMode(4, INPUT);  
}
*/

void loop() {

  byte koz1[4] = {0x0F,LOW,HIGH,0x0F};
  //int koz2[4] = {null,HIGH,LOW,null};
  
  byte koz2[4];
  //copy(koz1, koz2, 4);
  memcpy(koz2, koz1, sizeof(koz1[0])*4);
  koz2[1] = HIGH;
  koz2[2] = LOW;
  
  byte koz3[4] = {LOW,HIGH,0x0F,0x0F};
  //int koz4[4] = {HIGH,LOW,2,2};
  byte koz4[4];
  //copy(koz3, koz4, 4);
  memcpy(koz4, koz3, sizeof(koz3[0])*4);
  koz4[0] = HIGH;
  koz4[1] = LOW;

  
  byte koz5[4] = {LOW, 0x0F,0x0F,HIGH};
  //int koz6[4] = {HIGH, 2,2,LOW};
  byte koz6[4];
  //copy(koz5, koz6, 4);
  memcpy(koz6, koz5, sizeof(koz5[0])*4);
  koz6[0] = HIGH;
  koz6[3] = LOW;
  
  byte cap[4] = {0x0F, HIGH,0x0F,LOW};
  
  
  byte oko1[4] = {LOW, 0x0F,HIGH,0x0F};
  //int oko2[4] = {HIGH, 2,LOW,2};
  
  byte oko2[4];
  //copy(oko1, oko2, 4);
  memcpy(oko2, oko1, sizeof(oko1[0])*4);
  oko2[0] = HIGH;
  oko2[2] = LOW;
  
  
  // put your main code here, to run repeatedly:
   
  static long oldTime = 0;  
  long Ctime = millis();  
  long diffT = Ctime - oldTime;
  oldTime = Ctime;


  static bool holdButt = false;
  static long holdTime = 0;
  static long shortPressATime = 0;
  //static long longPressATime = 0;
  static bool kozichOn = true;

  //charlie(oko1);
  //charlie(oko2);
  
  
  if(digitalRead(3) == LOW)
  {
    holdButt = true;
    holdTime = holdTime + diffT;
  }
  else if (holdButt && digitalRead(3) == HIGH){
    holdButt = false;
    if(holdTime > 1000){
      //longPressATime = 5000;
      kozichOn = !kozichOn;
    }
    else
    {      
      shortPressATime = 5000;
    }
    holdTime = 0;
  }



  
  if(shortPressATime > 0)
  {
    shortPressATime = shortPressATime - diffT;
    /*charlie(koz1);
    charlie(koz2);
    charlie(koz3);
    charlie(koz4);
    charlie(koz5);
    charlie(koz6);
    charlie(cap);
    */
    if(((int(shortPressATime/100) / 2) & 1) == 0){
      charlie(oko1);
      charlie(oko2);
      
    }
    else{
      
    }
    
    
    
  }
  
  /*
  else if (longPressATime > 0)
  { 
    longPressATime = longPressATime - diffT;   
    charlie(oko1);
    delay(5000/longPressATime);
    
    charlie(oko2);
    delay(5000/longPressATime);
    
    longPressATime = longPressATime - 5000/longPressATime;
    longPressATime = longPressATime - 5000/longPressATime;
  }*/
  else{
    if(kozichOn){
      charlie(koz1);
      charlie(koz2);
      charlie(koz3);
      charlie(koz4);
      charlie(koz5);
      charlie(koz6);
      charlie(cap);
    }
    charlie(oko1);
    charlie(oko2);
  }
  //resetP();
  pinMode(5, INPUT);
  pinMode(6, INPUT);
  pinMode(2, INPUT);
  pinMode(4, INPUT);
}

I still need to try to reduce the weird thing with modulo etc. and try to simplify some types.
So far 1246 bytes, with the same functionality.

Guys, you are awesome. Together with those on Arduino Stack Exchange, it looks like I finally achieved it. I will try to optimize a little bit more, as it is at 1010 bytes (98%). But I thank you all. You taught me more practical things in 1 hour than I could read in a week.

void setup() {
  pinMode(3, INPUT_PULLUP);
}

void charlie(byte a[])
{
  

  
  if(a[0] == 0x0F){
    
    pinMode(5, INPUT);
  }else
  {
    pinMode(5, OUTPUT);
    digitalWrite(5, a[0]);    
  }   
  if(a[1] == 0x0F){
    
    pinMode(6, INPUT);
  }else
  {
    pinMode(6, OUTPUT);
    digitalWrite(6, a[1]);    
  }
  if(a[2] == 0x0F){
    
    pinMode(2, INPUT);
  }else
  {
    pinMode(2, OUTPUT);
    digitalWrite(2, a[2]);    
  }
  if(a[3] == 0x0F){
    
    pinMode(4, INPUT);
  }else
  {
    pinMode(4, OUTPUT);
    digitalWrite(4, a[3]);    
  }
}
/*
void copy(int* src, int* dst, int len) {
    memcpy(dst, src, sizeof(src[0])*len);
    
    
}
*/
/*
void resetP()
{
  pinMode(5, INPUT);
  pinMode(6, INPUT);
  pinMode(2, INPUT);
  pinMode(4, INPUT);  
}
*/

void loop() {

  byte koz1[4] = {0x0F,LOW,HIGH,0x0F};
  //int koz2[4] = {null,HIGH,LOW,null};
  
  byte koz2[4];
  //copy(koz1, koz2, 4);
  memcpy(koz2, koz1, sizeof(koz1[0])*4);
  koz2[1] = HIGH;
  koz2[2] = LOW;
  
  byte koz3[4] = {LOW,HIGH,0x0F,0x0F};
  //int koz4[4] = {HIGH,LOW,2,2};
  byte koz4[4];
  //copy(koz3, koz4, 4);
  memcpy(koz4, koz3, sizeof(koz3[0])*4);
  koz4[0] = HIGH;
  koz4[1] = LOW;

  
  byte koz5[4] = {LOW, 0x0F,0x0F,HIGH};
  //int koz6[4] = {HIGH, 2,2,LOW};
  byte koz6[4];
  //copy(koz5, koz6, 4);
  memcpy(koz6, koz5, sizeof(koz5[0])*4);
  koz6[0] = HIGH;
  koz6[3] = LOW;
  
  byte cap[4] = {0x0F, HIGH,0x0F,LOW};
  
  
  byte oko1[4] = {LOW, 0x0F,HIGH,0x0F};
  //int oko2[4] = {HIGH, 2,LOW,2};
  
  byte oko2[4];
  //copy(oko1, oko2, 4);
  memcpy(oko2, oko1, sizeof(oko1[0])*4);
  oko2[0] = HIGH;
  oko2[2] = LOW;
  
  
  // put your main code here, to run repeatedly:
   
  static uint16_t oldTime = 0;  
  uint16_t Ctime = millis();  
  uint16_t diffT = Ctime - oldTime;
  oldTime = Ctime;


  static bool holdButt = false;
  static uint16_t holdTime = 0;
  static uint16_t shortPressATime = 0;
  //static long longPressATime = 0;
  static bool kozichOn = true;

  //charlie(oko1);
  //charlie(oko2);
  
  
  if(digitalRead(3) == LOW)
  {
    holdButt = true;
    holdTime = holdTime + diffT;
  }
  else if (holdButt && digitalRead(3) == HIGH){
    holdButt = false;
    if(holdTime > 1000){
      //longPressATime = 5000;
      kozichOn = !kozichOn;
    }
    else
    {      
      shortPressATime = 5000;
    }
    holdTime = 0;
  }



  
  if(shortPressATime > 0)
  {
    shortPressATime = shortPressATime - diffT;
    if(((uint16_t(shortPressATime/128) / 2) & 1) == 0){
      charlie(oko1);
      charlie(oko2);
      
    }
    else{
      
    }
    
    
    
  }
  else{
    if(kozichOn){
      charlie(koz1);
      charlie(koz2);
      charlie(koz3);
      charlie(koz4);
      charlie(koz5);
      charlie(koz6);
      charlie(cap);
    }
    charlie(oko1);
    charlie(oko2);
  }
  pinMode(5, INPUT);
  pinMode(6, INPUT);
  pinMode(2, INPUT);
  pinMode(4, INPUT);
}

If you replace all the pinMode(), digitalWrite() and digitalRead() calls by their respective PIN, PORT and DDR register calls you can save another 100-200 bytes or so.
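As a rough illustration of the register approach, assuming a hypothetical wiring with the four charlieplex lines on PB0..PB3 and the button on PB4 (the real badge may be wired differently). The function names and masks are made up; the host-side variables only exist so the logic can be tried off-target.

```cpp
#include <stdint.h>

#if defined(__AVR__)
#include <avr/io.h>   // real DDRB / PORTB / PINB on the target
#else
// Host-side stand-ins so the logic can be exercised off-target.
static uint8_t DDRB, PORTB, PINB;
#endif

// Hypothetical wiring: charlieplex lines on PB0..PB3, button on PB4.
const uint8_t CHARLIE_MASK = 0x0F;
const uint8_t BUTTON_BIT   = 4;

// dirMask:   1 = drive the line, 0 = leave it floating.
// levelMask: level for the driven lines (ignored where dirMask is 0).
void charlieRaw(uint8_t dirMask, uint8_t levelMask) {
  // Float all four lines first, then apply the new pattern.
  DDRB  &= (uint8_t)~CHARLIE_MASK;
  PORTB &= (uint8_t)~CHARLIE_MASK;
  PORTB |= (uint8_t)(levelMask & dirMask & CHARLIE_MASK);
  DDRB  |= (uint8_t)(dirMask & CHARLIE_MASK);
}

void buttonSetup() {
  DDRB  &= (uint8_t)~(1 << BUTTON_BIT);  // input
  PORTB |= (uint8_t)(1 << BUTTON_BIT);   // enable the internal pull-up
}

bool buttonPressed() {
  return (PINB & (1 << BUTTON_BIT)) == 0;  // active low
}
```

With this layout a whole pattern becomes two bytes, e.g. charlieRaw(0x06, 0x04) drives PB1 low and PB2 high while PB0 and PB3 float, and charlieRaw(0, 0) replaces the four pinMode(..., INPUT) calls at the end of loop().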

This complex thing really bugs me:

 if(((uint16_t(shortPressATime/128) / 2) & 1) == 0){

(uint16_t(shortPressATime/128) / 2) is the same as shortPressATime/128/2, which is the same as shortPressATime >> 8.
Then you do a bitwise AND looking at bit 0 and compare the result with 0. So the whole thing above is exactly the same as:

if ((shortPressATime & 0x100) == 0) {

or using a macro, which should result in the exact same code:

if (!bitRead(shortPressATime, 8)) {

... which in turn doesn't make much sense to me. What is the importance of that specific bit being set?
Or let me ask that differently: what exactly are you trying to do here, really?

MorganS:
divide by 128, which will be converted to a right-shift by 7 bits.

Not so sure about that. I would think it works like that, but I once tried it with a var/2 vs. var >> 1, and the second compiled smaller.
Now I've seen crazier things thanks to optimisation: sometimes when you add a line or two of functional code, the compiled size goes down!

Replace all the calls to pinMode and all the digitalWrite/digitalRead's with direct port writes/reads. When you get rid of the last call to each of those functions, you'll see a decrease....

Try to avoid division, particularly with large datatypes - there's no hardware divide instruction on avr's, so if you can get rid of all the division by replacing with bitshifts and the like, you can save flash there too.

wvmarle:
If you replace all the pinMode(), digitalWrite() and digitalRead() calls by their respective PIN, PORT and DDR register calls you can save another 100-200 bytes or so.

This complex thing really bugs me:

 if(((uint16_t(shortPressATime/128) / 2) & 1) == 0){

(uint16_t(shortPressATime/128) / 2) is the same as shortPressATime/128/2, which is the same as shortPressATime >> 8.
Then you do a bitwise AND looking at bit 0 and compare the result with 0. So the whole thing above is exactly the same as:

if ((shortPressATime & 0x100) == 0) {

or using a macro, which should result in the exact same code:

if (!bitRead(shortPressATime, 8)) {

... which in turn doesn't make much sense to me. What is the importance of that specific bit being set?
Or let me ask that differently: what exactly are you trying to do here, really?

Not so sure about that. I would think it works like that, but I once tried it with a var/2 vs. var >> 1, and the second compiled smaller.
Now I've seen crazier things thanks to optimisation: sometimes when you add a line or two of functional code, the compiled size goes down!

The way it is evaluated on each loop, it should call charlie() on every even 128th and not on the odd ones (or the other way around, I am just glad it somehow worked), and this way it basically blinks the charlieplexed LEDs.

honzapat:
The way it is evaluated on each loop, it should call charlie() on every even 128th

Every 128th is easy: every time bits 0-6 are all 0, you have a multiple of 128 (2^7). To test for this, clear bit 7 and everything above it (bits 7-15 in your 16-bit uint16_t), then check whether the result is zero:

if ((shortPressATime & 0x007F) == 0) {

honzapat:
and on odd not

I don't understand what you mean, but to test odd/even you check bit 0.

By checking bit 8 and ignoring the lower bits you will get 256 times a true, then 256 times false on that test, as it's increased one at a time.
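The difference between the two tests can be sketched like this, assuming a uint16_t counter; the helper names are only for illustration.

```cpp
#include <stdint.h>

// True only when t is an exact multiple of 128 (bits 0-6 all zero).
bool isMultipleOf128(uint16_t t) {
  return (t & 0x007F) == 0;
}

// True for 256 consecutive values, then false for the next 256, and so on.
bool bit8Set(uint16_t t) {
  return (t & 0x0100) != 0;
}
```

The first test fires once per 128 counts, which is easy to miss if the counter moves by more than one per loop; the second gives a steady on/off window, which is what a blink needs.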