Program size difference with different enum types

Hi,

I'm working on code to run threads. While removing unnecessary parts of the code, I noticed something interesting.

The program size differs when compiling the code with these three enum configurations:

typedef enum: uint8_t {NOT_FINISHED, FINISHED} STATE; // flash used = 4840
typedef enum {NOT_FINISHED, FINISHED} STATE;          // flash used = 4868
typedef enum: bool {NOT_FINISHED, FINISHED} STATE;    // flash used = 4906

I thought the size would be smaller with bool!

I know it doesn't matter in the end, since none of the enum's values are greater than 1.

Another question:

Does this:

#ifdef __cplusplus
    extern "C" {
#endif

.
.
.

#ifdef __cplusplus
    }
#endif

conflict with a line that uses a C++ feature, such as:

typedef enum: uint8_t {NOT_FINISHED, FINISHED}STATE;

As I understand it, specifying the underlying type of an enum is a C++ feature, if what I learned is correct.

The language standard calls for just two bool values: false and true. I assume the extra code ensures anything enum: bool is converted to false / true before the value is used.
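As a rough illustration of that normalization (only a sketch, not necessarily what the compiler emits for this project; normalize_to_bool is a made-up name): converting any integer to bool collapses it to exactly 0 or 1, and doing that collapse may take an extra instruction or two compared with copying a byte as-is.

```cpp
#include <stdint.h>

// Hypothetical helper: pass a raw byte through bool and back.
// Any non-zero input becomes 1; zero stays 0.
constexpr uint8_t normalize_to_bool(uint8_t raw) {
    return static_cast<uint8_t>(static_cast<bool>(raw));
}
```

A plain uint8_t needs no such step, which may be part of why the uint8_t build comes out smaller.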

Also, without an explicit type an enum typically ends up as an int rather than a single byte, which is why the uint8_t version is shorter (comparisons are also done on one byte, which is faster on an 8-bit microcontroller).
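A small sketch to check the sizes on whatever compiler you use. The type and enumerator names are changed from the thread's STATE only so that all three variants can live in one file. The exact size of the untyped enum is up to the compiler, so only an upper bound is asserted for it.

```cpp
#include <stdint.h>

typedef enum : uint8_t { U8_NOT_FINISHED, U8_FINISHED } STATE_U8;
typedef enum { DEF_NOT_FINISHED, DEF_FINISHED } STATE_DEFAULT;
typedef enum : bool { B_NOT_FINISHED, B_FINISHED } STATE_BOOL;

// A fixed underlying type pins the size of the enum.
static_assert(sizeof(STATE_U8) == 1, "one byte");
static_assert(sizeof(STATE_BOOL) == sizeof(bool), "same size as bool");

// Without a fixed type the compiler chooses; for these values it may
// not pick anything larger than int.
static_assert(sizeof(STATE_DEFAULT) <= sizeof(int), "at most int-sized");
```

On an AVR board int is 2 bytes, so the untyped version would normally occupy two bytes per variable unless the compiler is told to use short enums.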

extern "C" is about function linkage; read the answer in "What is the effect of extern "C" in C++?" on Stack Overflow and, more generally about compilation:
https://www.agner.org/optimize/calling_conventions.pdf
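To make that concrete, here is a minimal sketch (toggle_state is a made-up function, not something from the attached project). Inside an extern "C" block the code is still compiled as C++, so the typed enum is accepted; only the linkage of the declared function changes. A pre-C23 C compiler would reject the typed enum itself, with or without the guards.

```cpp
#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

// Still parsed as C++ when a C++ compiler sees it.
typedef enum : uint8_t { NOT_FINISHED, FINISHED } STATE;

// Gets C linkage (no name mangling), nothing more.
STATE toggle_state(STATE s);

#ifdef __cplusplus
}
#endif

// The definition keeps the C linkage given by the declaration above.
STATE toggle_state(STATE s) {
    return (s == FINISHED) ? NOT_FINISHED : FINISHED;
}
```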

Memory is byte-addressable, so there's no such thing as a 1-bit-wide boolean data type. You can, however, pack 8 enums with a value of 0 or 1 into a single 8-bit-wide integer.

enum SomeFlag {
	Off = 0,
	On = 1
};

struct MyStruct {
	enum SomeFlag firstFlag:1;
	enum SomeFlag secondFlag:1;
	enum SomeFlag thirdFlag:1;
	enum SomeFlag fourthFlag:1;
	enum SomeFlag fifthFlag:1;
	enum SomeFlag sixthFlag:1;
	enum SomeFlag seventhFlag:1;
	enum SomeFlag eighthFlag:1;
};

A typedef does not use program memory; the rest of the code does. There might be conversions in the code, or the compiler optimizations might work better with certain code.
If you give a full sketch and tell for which board, then we can really test it.

@nicolajna, the compiler needs extra code to read and write a specific bit. The code might become larger.
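Roughly what that extra work looks like when written by hand (a sketch with made-up helper names; the compiler generates something comparable for bit-fields):

```cpp
#include <stdint.h>

// Hypothetical helpers: eight on/off flags stored in one byte.
// Every access needs a shift and a mask instead of a plain byte
// load or store.
constexpr uint8_t set_flag(uint8_t flags, uint8_t bit) {
    return static_cast<uint8_t>(flags | (1u << bit));
}

constexpr uint8_t clear_flag(uint8_t flags, uint8_t bit) {
    return static_cast<uint8_t>(flags & ~(1u << bit));
}

constexpr bool get_flag(uint8_t flags, uint8_t bit) {
    return ((flags >> bit) & 1u) != 0;
}
```

So packing saves RAM when there are many flags, but it can cost some flash for the masking code.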

Thanks for the replies.

Here's my current project, which is based on lcd128x64 and task_manager. It runs on a Nano board.

The code in question is in task_manager.h, line 17.

I would also really appreciate it if you could check my project and give me some pointers.
But note that I'm programming in C.

And this is the .ino code:

#include "task_manager.h"
#include "glcd_spi.h"
#include "arrays.h"

extern STATE lcd_st_flag;

void setup() {
  Serial.begin(9600);
  ///////////////////////////////////// LCD THREAD /////////////////////////////////////
  // allocate & initialize lcd thread
  THREAD *lcd = (THREAD*)malloc(1*sizeof(THREAD));
  TASK *lcd_tsk = (TASK*)malloc(4*sizeof(TASK)); // method #1

  // lcd thread initialization
  *lcd = (THREAD){4,0,NOT_FINISHED,&lcd_st_flag};
  
  // lcd task initialization
  *(lcd_tsk+0) = (TASK){0,(uint8_t*)0,(void(*)())glcd_init};
  *(lcd_tsk+1) = (TASK){1,(uint8_t*)1,(void(*)())glcd_graphics_mode};
  *(lcd_tsk+2) = (TASK){0,(uint8_t*)0,(void(*)())glcd_clr};
  *(lcd_tsk+3) = (TASK){1,(uint8_t*)&heart,(void(*)())glcd_img};//PIC3
  
  while(lcd->thrd_st != FINISHED){
    run_thread(lcd, lcd_tsk);
  }

  ///////////////////////////////////// SERIAL THREAD /////////////////////////////////////  

  Serial.println("thread finished");
  
}

void loop() {

}

glcd_spi.cpp (7.5 KB) glcd_spi.h (2.5 KB) task_manager.cpp (2.9 KB) task_manager.h (1.0 KB)

I think that @Coding_Badly has a good point. There are so many casts with those values of that enum to uint8_t, the compiler might do a cascade of two or three unneeded casts every time such a value is read or written.

Sometimes when I try to force the compiler to use a certain variable type, it can still do the calculation with a larger type. I did not rewrite the code to use only 'bool' without the casts, but doing so should reduce the code size.

Sorry, I have no pointers. The task manager is too hard for me to understand at first glance.

@Koepel That's true. For some reason I convinced myself that we were talking about RAM usage.

Thanks for the replies, really appreciate it.

I even tried this one, thinking it might do something:

typedef enum: bool __attribute__((__packed__)){NOT_FINISHED, FINISHED} STATE;

and got a flash size of 4902.

With this original line:

typedef enum: uint8_t {NOT_FINISHED, FINISHED} STATE;

It's 4836

@Koepel, the casts are there because I used void pointers, so I have to cast everything. What I wanted in return was a single source for anything I want, which gave me somewhat more abstract code.

@nicolajna yes, the difference is in both flash and RAM size.

I don't know why, but I removed the casts from the struct initializations in the .ino code and didn't get any problems!

  // lcd task initialization
  *(lcd_tsk+0) = (TASK){0, 0, glcd_init};
  *(lcd_tsk+1) = (TASK){1, 1, glcd_graphics_mode};
  *(lcd_tsk+2) = (TASK){0, 0, glcd_clr};
  *(lcd_tsk+3) = (TASK){1, &PIC3, glcd_img};

Which was:

  // lcd task initialization
  *(lcd_tsk+0) = (TASK){0, (uint8_t*)0, (void(*)())glcd_init};
  *(lcd_tsk+1) = (TASK){1, (uint8_t*)1, (void(*)())glcd_graphics_mode};
  *(lcd_tsk+2) = (TASK){0, (uint8_t*)0, (void(*)())glcd_clr};
  *(lcd_tsk+3) = (TASK){1, (uint8_t*)&heart, (void(*)())glcd_img};

So thanks for pointing this out :)

Another thing about casting those structs is that two worked without a cast and the others required one:

  // lcd task initialization
  *(lcd_tsk+0) =       {0, 0, glcd_init};
  *(lcd_tsk+1) = (TASK){1, 1, glcd_graphics_mode};
  *(lcd_tsk+2) =       {0, 0, glcd_clr};
  *(lcd_tsk+3) = (TASK){1, &p1, glcd_img};

Could you tell me the reason, and which is better: casting all 4 structs, or casting only the ones that require it?

Personally, after removing a lot of unnecessary casts from the original version, I'm fine with casting all the structs rather than only the ones that require it.
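One likely factor, though nothing in the thread confirms it for this project: the literal 0 is a null pointer constant and converts to any pointer type on its own, while 1 is just an int. The sketch below uses a made-up Task struct standing in for TASK (the real fields may differ) and a made-up do_nothing function.

```cpp
#include <stdint.h>

// Hypothetical stand-in for the project's TASK struct.
struct Task {
    uint8_t id;
    uint8_t *arg;
    void (*fn)();
};

static void do_nothing() {}

// Literal 0 is a null pointer constant, so no cast is needed.
static Task a = {0, 0, do_nothing};

// 1 is an int, not a pointer, so standard C++ wants an explicit cast.
// (Permissive build settings may only warn when it is left out.)
static Task b = {1, reinterpret_cast<uint8_t *>(1), do_nothing};
```

Function names such as do_nothing already have the right type here, so the (void(*)()) casts are only needed when the function's signature differs from the pointer type.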

I found a bug here: this line casts the value of thrd_st to a pointer and dereferences it, which wasn't correct and didn't even work.

			*((STATE*)thrd->thrd_st) = FINISHED;

Replaced it with this one and worked just fine:

			thrd->thrd_st = FINISHED;

@Koepel and I are discussing casts generated out of your view by the compiler. Specifically integral casts.

Is only relevant to struct and union.

or union

I see them as the same. A way to lay out fields in memory.

But, that view doesn't help someone new to the language. Thank you.

They are very different in how the memory is allocated, though...

struct {
  bool a;
  int b;
}

will have space for both a and b whereas

union {
  bool a;
  int b;
}

will only have space for one of the two (basically just enough bytes for b)....

But I guess that was not your point, and I'm sure you know that. Probably lost in translation.
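For anyone who wants to see the difference, a quick compile-time check (Both and Either are names made up for this sketch; exact sizes depend on the platform's padding rules, so only relations are asserted):

```cpp
struct Both { bool a; int b; };   // a and b each get their own storage
union Either { bool a; int b; };  // a and b overlap in the same bytes

static_assert(sizeof(Both) >= sizeof(bool) + sizeof(int),
              "struct holds both members");
static_assert(sizeof(Either) >= sizeof(int),
              "union is at least as big as its largest member");
static_assert(sizeof(Either) < sizeof(Both),
              "union is smaller than the struct");
```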

That TaskManager project seems overly complicated for what it does. If all you want to do is execute functions at a specified interval in a cooperative manner, that can be achieved much more easily.

I've been using forms of the same scheduler for years. It's fast, easy to extend, dead simple and enough for most projects.

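#include <stdint.h>

// Assumed to be provided elsewhere by the platform: returns the current tick count.
uint32_t TICK_GetTicks(void);

struct Task {
    uint32_t interval;   // ticks between invocations
    uint32_t lastInvoke; // tick count at which the task was last due
    void (*handle)(void);
};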

static void task1(void) {
    // called every 1000 ticks
}

static void task2(void) {
    // called every 375 ticks
}

struct Task tasks[] = {
    {
        1000, 0, task1
    },
    {
        375, 0, task2
    }
};

#define NUMBER_OF_TASKS (sizeof(tasks) / sizeof(struct Task))

int main(void)
{

    // initialize stuff here

    while (1) {
        for (uint8_t i = 0; i < NUMBER_OF_TASKS; i++) {
            struct Task *t = &tasks[i];
            
            if (TICK_GetTicks() - t->lastInvoke >= t->interval) {
                t->handle();
                t->lastInvoke += t->interval;
            }
        }
    }
    
    return 0; // never reached
}

It obviously needs to be "arduinofied" somewhat. Everything is also statically allocated, which I personally prefer on resource-constrained systems.
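One detail of the loop above that is easy to miss is the subtraction in the due check. A standalone sketch of just that comparison (is_due is a made-up helper; on Arduino the tick value would typically come from millis()):

```cpp
#include <stdint.h>

// True when at least `interval` ticks have elapsed since `last`.
// Unsigned subtraction wraps modulo 2^32, so the result stays correct
// even after the tick counter itself has overflowed.
constexpr bool is_due(uint32_t now, uint32_t last, uint32_t interval) {
    return static_cast<uint32_t>(now - last) >= interval;
}
```

Comparing now >= last + interval instead would misfire around the overflow point, which is why the subtraction form is usually preferred.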

Yes, I know this method. It was my first approach to multitasking, and I used it in college projects a couple of years ago.

My new task manager has a new feature: running different threads, where each thread has one or more tasks to run.

I want to implement two main features now:

  1. run tasks in sequential mode, as there might be some tasks that need to run one after the other.
  2. run tasks in round robin

Feature #1 is working now, but I need to improve task_manager.cpp. It is not very complicated, but it has a lot of casting that I want to minimize; for now I need the casts because I'm working with pointers.

Feature #2 shouldn't be so difficult to do; I just have to develop a main time_scheduler function that manages everything else.
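For reference, the core of a round-robin dispatcher can be quite small. This is only a sketch with made-up names (RoundRobin, rr_step), not the design of task_manager.cpp: each call runs one task and moves on to the next, wrapping around at the end.

```cpp
#include <stdint.h>

typedef void (*TaskFn)(void);

// Hypothetical round-robin state: a task table plus the index of the
// task that should run on the next step.
struct RoundRobin {
    TaskFn *tasks;
    uint8_t count;
    uint8_t next;
};

// Run exactly one task, then advance to the following one, wrapping
// back to the first after the last.
inline void rr_step(RoundRobin *rr) {
    if (rr->count == 0) {
        return; // nothing to run
    }
    rr->tasks[rr->next]();
    rr->next = static_cast<uint8_t>((rr->next + 1) % rr->count);
}
```

Calling rr_step() once per pass of loop() gives each task a turn in order; timing could be layered on top with an interval check per task.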

The next improvements would, of course, be cooperative or preemptive scheduling.

Hi,

With the new update of the website, what is the right way to end the thread?

Should I edit the thread title to add a [SOLVED] prefix, do something else, or leave it as is?

Let time pass (4 months; see the message below).

Ask a moderator to lock the thread (though we'd be inclined not to do that).

That's always a good choice.

Yes I know editing the title with [SOLVED] prefix is ok.

Thanks for the tip.
