GIGA R1 RAM size discrepancy

Hello there,

Here is an incredibly dumb question by a total noob.

static const byte newByteArray[800000] = {
  0,0,0,0,0,0,0,0,0,0,  
  etc...
  etc...
  0,0,0,0,0,0,0,0,0,0  
};

void setup() {}

void loop() {}

I have a very large array I need to statically define (hardcode about 800kbytes).
No, the array cannot be dynamically generated by code (it's a mapping table with unique values).

The table must be loaded in user SRAM.

This array will occupy almost the entire RAM of the GIGA R1 (1 Mbyte of RAM: 192 Kbytes of TCM RAM (64 Kbytes of ITCM RAM + 128 Kbytes of DTCM RAM for time-critical routines), 864 Kbytes of user SRAM, and 4 Kbytes of SRAM in the backup domain).

When I build the above masterpiece example, this is what I get:

Sketch uses 110680 bytes (5%) of program storage space. Maximum is 1966080 bytes.
Global variables use 47520 bytes (9%) of dynamic memory, leaving 476104 bytes for local variables. Maximum is 523624 bytes.

I would expect the sketch to use about 800k of flash as well as about 800k of dynamic memory.

And then, also, what truly throws me off, is this:

Maximum is 523624 bytes

I'm surely making two mistakes. First, why is the storage/dynamic memory usage so low? Second, why is the Arduino IDE reporting only 523624 bytes max, when it should be at least about 850k?

Thank you for reading my wonders. Any insight is highly appreciated.

Cheers!
Valentine

Because you don't access newByteArray anywhere in your sketch, the compiler deems it unnecessary and deletes it from the compiled code.
Do something with it, like printing one of the array members.

Print a random element of the array. If you print a specific element, the compiler is usually capable of optimizing the code down to that single fixed value.
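
A minimal sketch of that idea, written as plain C++ so it can also be compiled off-target. Reading the table through a `volatile` index means the compiler cannot know at build time which element is needed, so it cannot fold the read into a constant. The names `kTable`, `gIndex` and `readTable` are made up for illustration, and the 16-entry table is only a stand-in for the real 800000-byte one.

```cpp
#include <cstddef>
#include <cstdint>

// Small stand-in for the real mapping table (hypothetical values).
static const std::uint8_t kTable[16] = {
  3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3
};

// volatile: the compiler must reload this at run time and cannot assume its value.
volatile std::size_t gIndex = 0;

// Reads one element using an index that is only known at run time.
std::uint8_t readTable() {
  const std::size_t count = sizeof(kTable) / sizeof(kTable[0]);
  const std::size_t i = gIndex % count;  // keep the index inside the table
  return kTable[i];
}
```

In a sketch, the equivalent would be to set `gIndex` from something like `random()` or `millis()` and then call `Serial.println(readTable());` in `setup()`. This only stops the table from being discarded; a `const` table will most likely still be placed in flash.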

I'm not familiar with the Giga, why is it necessary to store a const array in SRAM?

Embedded real-time random-access database.

The real problem is a lot more complex.

Cheers,
Valentine

Any idea why Arduino doesn't see/report the full 1MB?

Cheers!
Valentine

You probably need to remove "const", or the compiler will put the array in flash rather than RAM.

Thank you. The compiler doesn't even get to that point; the footprint never exceeds 10% of the total space.

Any idea why we see only 500k of the H7 RAM in Arduino?

Cheers,
Valentine

I modified my sketch, same effect. Added this:

void setup() {
  Serial.begin(9600);
  Serial.println(newByteArray[0]);
  Serial.println(newByteArray[9999]);
  Serial.println(newByteArray[99999]);
}

Also tried with different array types, random numbers, etc. Nothing.

Any ideas please?

Cheers,
Valentine

I'm talking to myself here, but hopefully someone in the future will see this and find it helpful.

Apparently, when the GIGA board is selected as the target, Arduino does certain behind-the-curtain optimizations which place the array in flash. I haven't run speed tests, but it probably retrieves the values from flash instead of RAM. That's a no-go for serious work, as the user has no control over certain aspects of edge cases, which sadly is why anyone would use a GIGA board.

When selecting a bare H747 target, you have options to control optimizations. With no optimization selected, the array is correctly placed in RAM. There is some hope there; however, I'm very uncomfortable. This will require more testing before I select Arduino as a dev environment for my project.

As for my second question, about not having access to the full 850k: apparently Arduino pre-selects how much RAM is visible to each core and allocates that to the target core. I have no granular control over this, which is very disappointing and negates the biggest advantage of a dual-core design. Perhaps there is a way to control this and I'm not aware of it?

When I targeted the M4 core, and then M7 core, the total visible max memory was really close to the 850k as I expected. Which makes a lot of sense since the rest is probably occupied by bootloader, etc.

Moral of the story: the GIGA board is awesome, but Arduino needs to give users a lot more control over the end-to-end process, or else many of the real advantages are lost. Again, perhaps the controls are there and I'm not aware of them. I'm pretty new to Arduino.

I will continue experimenting with Arduino; however, I will gravitate towards CubeIDE and my custom-designed H7, where I have full control over the entire chain.

My guess is Portenta H7 probably has the same issues, I need to get one and give it a spin, but my expectations at that point are pretty low already.

Cheers!
Valentine

If you want the compiler to put the array in ram, do not declare it “const”. On processors with a single address space, the compiler tends to place constant data into the flash memory.
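
As a rough illustration, assuming a GCC-style toolchain and a conventional linker script (the section names and placement below are typical, not guaranteed for the GIGA core): a `const` object with static storage usually lands in a read-only section mapped to flash, while a non-const object with a non-zero initializer lands in `.data`, which takes up RAM at run time plus flash for its initial values. One catch with the all-zero test array from the first post is that a non-const array whose initializer is all zeros is typically placed in `.bss`, so it costs RAM but almost no flash. All names here are illustrative.

```cpp
#include <cstdint>

// const + static storage: typically placed in .rodata (flash on most MCUs).
const std::uint8_t kFlashTable[4] = {10, 20, 30, 40};

// Non-const, non-zero initializer: typically .data
// (lives in RAM; initial values are copied from flash at startup).
std::uint8_t gRamTable[4] = {10, 20, 30, 40};

// Non-const, all zeros: typically .bss (RAM only, zeroed at startup).
std::uint8_t gZeroTable[4] = {0, 0, 0, 0};

// Only the non-const tables may be modified at run time.
void patchRamTable(unsigned index, std::uint8_t value) {
  if (index < sizeof(gRamTable)) {  // ignore out-of-range writes
    gRamTable[index] = value;
  }
}
```

The map file produced by the linker is the reliable way to confirm where each object actually ended up for a given board and optimization level.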

Thank you, I already corrected this. It didn't help when I selected GIGA board as a target, only worked when I targeted a bare H7 with "no optimization" option.

I see a rather fundamental paradigm conflict: Arduino is supposed to make things easy, while the dual-core H7 targets a very different subset of use cases which are outside the Arduino envelope. There is some overlap; however, the Venn diagram is more of a pair of crossed eyes than an egg white with a yolk.

Cheers,
Valentine

I keep talking to myself.

I found where the RAM allocations are, and yes, they can be changed. However, this required a series of modifications across multiple files in the Arduino GIGA deployment, and I'm not comfortable with that, since this is not part of the project but rather part of the UI, the board target definitions, and all dependent compiler and linker scripts. Any time I need to make a change, I have a really high chance of making a mistake deeply buried in some discombobulated text files all over the place. This also means I cannot have two projects with different memory allocations, only one at a time. It also didn't solve the bigger problem of placing the array in RAM. I could place the array in flash, malloc in RAM, then memcpy, but that's like picking my nose with my toe. This is a hard no-go decision.

Well it's a good educational exercise, that's all. Back to Cube IDE and my own board I guess.

Cheers,
Valentine

PS: please don't do that, it will create problems. This was only an academic exercise; you cannot just reallocate the memory regions freely.

But those are still constants, and the compiler can optimize them away.
The compiler is very smart!

Try something like:

#define ASIZE 10000
int myArray[ASIZE];

void setup() {
  Serial.begin(9600);
  for (int i = 0; i < ASIZE; i++) {
    myArray[i] = random(12345678);
  }
  Serial.print(myArray[random(ASIZE)]);
}

Thank you.

I already did that earlier but didn't post the code; it still somehow didn't work. Wonders.

No worries, I'll do it in Cube, where you have very fine, granular control, but I really wanted to do this in Arduino. Bummer. The GIGA has built-in QSPI, SDRAM and other things I need, so that board is really good for my project.

Even if I manage to shoehorn it into ram I'm concerned by the lack of control over build time optimization, as this apparently changes on the fly which leads to a very non-reproducible chain.

Cheers,
Valentine

PS This is an illustration of what I wanted in Arduino... Alas.

https://blog.embeddedexpert.io/?p=2167

I have no idea what you're all talking about ;) I'm not a Giga user and your knowledge is far deeper than mine.

You actually can have two projects with different memory allocations if you use two (or more) portable installs of IDE 1.x; each portable install can have its own configuration.

You would still need to do the configuration you're talking about, and hacking core (board package) files carries the risk that an update of the board package reverts all those changes.

Yeah, you won't be able to use custom linker scripts in Arduino. At least, not easily.

I'm not sure how that's related to what's been discussed here.

I'm also not a Giga user.

A look at the reference manual for the STM32H747XI seems to show that the "1M RAM" is not contiguous: there is a 512k chunk ("AXI RAM") on the M7 bus, a 288k chunk on the M4 bus (SRAM1-3), a second SRAM chunk, and the DTCM and ITCM chunks. The buses are fancy and interconnected, so the M4 and M7 can both access all of the memory, but I think the chances that you can have a single data structure that exceeds 512k are almost nil (Cube won't help), and Arduino is probably taking the conservative route of only allowing that memory to be used for variables (maybe it's different if you have explicit M4 code?).
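
To put numbers on that, here is a small host-side C++ check. The region sizes come from the figures quoted in this thread (512 K AXI RAM, 288 K SRAM1-3, 128 K DTCM, 64 K ITCM, 4 K backup SRAM); the 64 K for the second SRAM chunk is inferred as 864 - 512 - 288 and should be checked against the reference manual. The names `Region`, `kRegions`, `totalRam`, `largestRegion` and `fitsInOneRegion` are illustrative only.

```cpp
#include <cstddef>

// One physically separate RAM block on the chip.
struct Region {
  const char* name;
  std::size_t bytes;
};

constexpr std::size_t KiB = 1024;

// Sizes as quoted in the thread; the second SRAM chunk is an inferred value.
constexpr Region kRegions[] = {
  {"ITCM",                   64 * KiB},
  {"DTCM",                  128 * KiB},
  {"AXI RAM",               512 * KiB},
  {"SRAM1-3",               288 * KiB},
  {"second SRAM (inferred)", 64 * KiB},
  {"backup SRAM",             4 * KiB},
};

// Sum of all regions, contiguous or not.
std::size_t totalRam() {
  std::size_t sum = 0;
  for (const Region& r : kRegions) sum += r.bytes;
  return sum;
}

// Size of the biggest single region, i.e. the upper bound for one array.
std::size_t largestRegion() {
  std::size_t best = 0;
  for (const Region& r : kRegions) {
    if (r.bytes > best) best = r.bytes;
  }
  return best;
}

// True if an object of the given size could sit inside one region.
bool fitsInOneRegion(std::size_t bytes) {
  return bytes <= largestRegion();
}
```

With these figures the total is 1060 KiB, but the largest single region is 524288 bytes, so an 800000-byte array cannot fit in any one region. The IDE's reported maximum of 523624 bytes is 664 bytes short of 524288, which would be consistent with the core placing global variables only in the 512 K AXI RAM, though the board's linker script is the authority on that.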

(It looks like Teensy4 has 1M of contiguous RAM, if you're willing to consider alternatives. I don't understand (and you haven't explained) why your table needs to be in RAM.)

Note: it is also hard to get 1 MB of contiguous RAM on a Teensy 4.x.

As has been mentioned, I don't believe you can have that large an array in RAM. But you can always try to verify this by doing something like:

static const byte newByteArray[800000] = {
  0,0,0,0,0,0,0,0,0,0,  
  etc...
  etc...
  0,0,0,0,0,0,0,0,0,0  
};

byte newByteArrayRAM[800000];

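void setup() {
  Serial.begin(9600);  // needed because loop() prints
  // Copy the const table into the RAM array.
  memcpy(newByteArrayRAM, newByteArray, sizeof(newByteArrayRAM));
}

int loop_count = 0;
void loop() {
  // Do something to access memory so the arrays are not optimized away.
  loop_count++;
  if (loop_count == 800000) loop_count = 0;
  if (newByteArrayRAM[loop_count] != loop_count) Serial.print(loop_count);
}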

And see if this would actually build or not.

LOL, yeah, that's possible. Actually, I did it with two VM containers out of spite, because I got really mad yesterday. Also, the Arduino UI and Windows in general hide a lot of extra stuff in hidden user directories, which is not exactly very safe. However, this is not a serious approach, just a "here I did it" ego trip.

Cheers,
Valentine