PROGMEM string_table[ ] limited to 65536 bytes

I am using the following program, testing how much I will be able to save in PROGMEM on an Arduino Mega 2560:



// initially I set up 824 PROGMEM strings and then a string table linking to them  -  I have excluded most of them for easier reading.



const char string_1[] PROGMEM = "This is example of long string number 1ARROOCHICHA ARROOCHICHA ARROOCHICHACHA";
const char string_2[] PROGMEM = "This is example of long string number 2ARROOCHICHA ARROOCHICHA ARROOCHICHACHA";
//............................ All numbers in between included in here...............................
const char string_824[] PROGMEM = "This is example of long string number 824ARROOCHICHA ARROOCHICHA ARROOCHICHACHA";



PGM_P const string_table[] PROGMEM = 
{
    string_1,
    string_2,
//............................ All numbers in between included in here...............................
    string_824
};


void setup() {
  Serial.begin(9600); // open the serial port at 9600 bps:
}

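void loop() {
  char buffer[80]; // longest string (number 824) is 79 characters plus the terminating null
  for (unsigned long i = 0; i < 824; i++) { // the table holds 824 entries
    strcpy_P(buffer, (PGM_P)pgm_read_word(&(string_table[i])));
    Serial.print(buffer);
    Serial.print('\n');
  }
}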

The problem is that it no longer compiles at this point, because the number of bytes stored in PROGMEM would exceed 65536.

It reports:

"C:....\Temp\cckwAwy9.s: Assembler messages:
C:....\Temp\cckwAwy9.s:4699: Error: value of 65734 too large for field of 2 bytes at 65812
C:....\Temp\cckwAwy9.s:4700: Error: value of 65656 too large for field of 2 bytes at 65814
C:....\Temp\cckwAwy9.s:4701: Error: value of 65578 too large for field of 2 bytes at 65816"

Can anyone tell me why this is not compiling correctly at this point, and if I will be able to extend it to greater PROGMEM, or if PROGMEM is limited to 65536 bytes?

The full program for anyone interested is here:

You need to be using the "far" address space.
https://gist.github.com/Koepel/2a4ae36ae6397cd260f2c7ae2a50610e

PROGMEM that exceeds 65536 bytes is a problem because the address pointer is 16 bits, so the compiler cannot handle references outside that range.
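
As a rough illustration of that limit (a host-side sketch rather than AVR-specific code; the helper names `fitsNearPointer` and `truncateToNear` are made up for this example), a 16-bit field can only hold values up to 65535, which is why the assembler rejects a value such as 65734:

```cpp
#include <stdint.h>

// Hypothetical helper: true if a flash byte address fits in a 16-bit (near) pointer.
bool fitsNearPointer(uint32_t address) {
  return address <= UINT16_MAX;  // 0xFFFF = 65535 is the largest near address
}

// What a 16-bit field would hold if the upper bits were simply dropped.
uint16_t truncateToNear(uint32_t address) {
  return (uint16_t)(address & 0xFFFFu);
}
```

With the first value from the error output, `fitsNearPointer(65734)` is false, and dropping the upper bits would leave 198, which would point at completely different data.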

You have to be particularly careful, because as you approach the limit, other data that is stored in PROGMEM and is coded for near PROGMEM addresses will go out of range (data such as the mapping between Arduino pin numbers and the hardware pins of the processor, text stored with the F() macro, etc).

Storing data in upper PROGMEM (beyond 65536 bytes) is also not as straightforward as using PROGMEM: you have to explicitly tell the compiler to store the data in the upper section of flash memory. Creating a table of pointers to char arrays is also problematic, because the actual address is not known by the compiler, only by the linker. It has to be calculated at run-time using pgm_get_far_address(), which prevents storing the table itself in PROGMEM. There are ways to get around this, but it can get a bit convoluted.

Thanks for the information in your answer :-).

I assume, then, that I have to work in 64 KB banks? Do I have to keep track of the bank numbers, or can I rely on useFarData() and pgm_get_far_address() to do that for me?

Can you explain what you are wanting to store in PROGMEM that is so large?
In your example, all the text strings are nearly the same length, so it is less complicated to use a two-dimensional array than a separate table of char*. That would be very wasteful if the strings varied in length, though. There is also a limit of 32K bytes for a single array, which makes it necessary to split the data up into multiple arrays.

I am hoping to have a 13-digit number and a 40-character string of text for each entry of a 2048-entry database. I could potentially store each entry as a single 53-character string.

All of these long strings can potentially be the same length. If some are shorter, I can handle the difference with end markers in the string interpreter.

So I suppose I only really want (13+40)*2048 bytes, which is 108544 bytes.

To start with, the compiler has to be told to store the data in the upper section of flash memory. I usually use a define for this, so I don't have to type in the attribute every time it is needed:

#define PROGMEM_FAR __attribute__((section(".fini7")))

Each string of text will need 54 bytes, to allow space for the terminating null. This would allow a maximum of 606 elements to an array, because the compiler limits the size of arrays on an AVR processor to 32767 bytes. Since you have 2048 entries, four arrays of 512 entries will be easiest to code. Each array could be defined as follows:

const char stringTable0[512][54] PROGMEM_FAR = {
  "1234567890123_ABCDEFGHIJKLMNOPRSTUVWXYZ_ABCDEFGH:0000",
  "1234567890123_ABCDEFGHIJKLMNOPRSTUVWXYZ_ABCDEFGH:0001",
  // ... remaining entries omitted here; see the attached .ino for the full table ...
};
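
The sizing arithmetic behind those figures can be double-checked at compile time. This is only a sketch: the constant and function names are invented for the example, and the numbers are the ones quoted in this thread.

```cpp
#include <stdint.h>

// Figures from this thread; the names are illustrative only.
constexpr uint32_t kMaxArrayBytes = 32767;  // largest single array on AVR
constexpr uint32_t kRecordBytes   = 54;     // 53 characters + terminating null
constexpr uint32_t kTotalRecords  = 2048;

// Largest number of records that fits in one array (integer division rounds down).
constexpr uint32_t maxRecordsPerArray(uint32_t recordBytes) {
  return kMaxArrayBytes / recordBytes;
}

// Number of arrays needed when each holds recordsPerArray records (rounds up).
constexpr uint32_t arraysNeeded(uint32_t totalRecords, uint32_t recordsPerArray) {
  return (totalRecords + recordsPerArray - 1) / recordsPerArray;
}

static_assert(maxRecordsPerArray(kRecordBytes) == 606, "606 records of 54 bytes fit");
static_assert(arraysNeeded(kTotalRecords, 512) == 4, "four arrays of 512 entries");
static_assert(512 * kRecordBytes <= kMaxArrayBytes, "each array stays under the limit");
```

Choosing 512 entries per array instead of the maximum of 606 keeps the index arithmetic simple, and it still needs only four arrays.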

Actually accessing the data in the arrays is slightly different from accessing a normal array. The compiler is not able to properly index into an array in far PROGMEM, so you have to calculate the offset from the beginning of the array manually. This involves getting the address where the array starts, then adding to that the element number multiplied by the size (in bytes) of an element.
Once you have that, the string can be copied into a buffer and used however you like. I've used memcpy_PF() because it works for any type of data, although strcpy_PF() can also be used. Note the _PF instead of _P, to denote that the copy is from far PROGMEM.

uint_farptr_t stringTablePtr[4]; //addresses of arrays in far PROGMEM
size_t stringSize = sizeof(stringTable0[0]); //get size of individual text string

void setup() {
  Serial.begin(9600); // open the serial port at 9600 bps:

  //get actual address of arrays in PROGMEM
  stringTablePtr[0] = pgm_get_far_address(stringTable0);
  stringTablePtr[1] = pgm_get_far_address(stringTable1);
  stringTablePtr[2] = pgm_get_far_address(stringTable2);
  stringTablePtr[3] = pgm_get_far_address(stringTable3);
}

void loop() {
  char buffer[stringSize];
  for (size_t i = 0; i < 2048; i++) {
    //offset of element within array has to be calculated then added to base address of array
    memcpy_PF(buffer, stringTablePtr[i/512] + stringSize * (i % 512), stringSize);
    Serial.print(buffer);
    Serial.print(' ');
    Serial.print(i);
    Serial.print('\n');
  }
}

It is possible to print the string directly from far PROGMEM, without the use of a buffer, by reading and printing one character at a time, if you really want to conserve RAM and have long strings.
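
A minimal sketch of that character-at-a-time approach, assuming avr-libc's pgm_read_byte_far(). The function name `forEachFarChar` and the `emit` callback are hypothetical, and the `#else` branch is only a host-side stand-in so the logic can be tried off the board:

```cpp
#include <stdint.h>
#include <stddef.h>

#if defined(__AVR__)
#include <avr/pgmspace.h>  // provides uint_farptr_t and pgm_read_byte_far()
#else
// Host-side stand-in (an assumption for off-target testing): "flash" is a plain array.
typedef uint32_t uint_farptr_t;
static const char fakeFlash[] = "1234567890123_EXAMPLE";
static inline uint8_t pgm_read_byte_far(uint_farptr_t addr) {
  return (uint8_t)fakeFlash[addr];
}
#endif

// Reads a null-terminated string starting at a far address, one byte at a time,
// and hands each character to emit(). Stops at the null or after maxLen characters.
// Returns the number of characters emitted.
size_t forEachFarChar(uint_farptr_t addr, size_t maxLen, void (*emit)(char)) {
  size_t n = 0;
  while (n < maxLen) {
    char c = (char)pgm_read_byte_far(addr + n);
    if (c == '\0') break;
    emit(c);
    n++;
  }
  return n;
}
```

On the Mega, `emit` could be a small function that calls Serial.print(c), and `addr` would be the same base-plus-offset value that is passed to memcpy_PF() in the loop above.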

I've attached a full example as an .ino file; it's too long to post directly.
progmemtest.ino (119.1 KB)

See also Issues · arduino/ArduinoCore-avr · GitHub

Do you have to use a Mega? For one application, because of exactly that problem of using PROGMEM for large data structures, I moved from a Mega to an ATmega4809-based Nano Every instead. There, constants are automatically assigned to flash by the compiler without the user having to use PROGMEM explicitly. In the meantime, that application has moved on to an ESP32.

Thank you for a very thorough and helpful reply. That works great.

Well, they apparently need more than 64k of data. A 4809 won't work, and AFAIK no AVR (including the 128k AVR-Dx chips) does that magic mapping with any more than 48k.

Oh yes. I had to look back to see my struggles with the ATmega2560. It was a simulation of a primitive 8-bit machine of the "Ben Eater" type, which had four 8K ROMs to hold the microcode, so it was exactly 32K. I'd forgotten that the ATmega4809 unified address space was limited to 48K, but anyway I then moved the whole thing to an ESP32.

Would you know off hand, what I need to change in this program to do the same thing with bytes 0 to 255, not strings, storing bytes? I had a go here:

#define PROGMEM_FAR __attribute__((section(".fini7")))

const int byteTable0[2][54] PROGMEM_FAR = {{
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53},
{2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,
31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54}
};

uint_farptr_t byteTablePtr[1]; //addresses of arrays in far PROGMEM


void setup() {
  Serial.begin(9600); // open the serial port at 9600 bps:

  //get actual address of arrays in PROGMEM
  byteTablePtr[0] = pgm_get_far_address(byteTable0);

}

void loop() {
  char buffer[1];
  for (size_t i = 1; i < 100; i++) {
    //offset of element within array has to be calculated then added to base address of array
    memcpy_PF(buffer, byteTablePtr[i/512] + 54 * (i % 512), 1);
int number= buffer-'0';
    Serial.print(number);
    Serial.print(' ');
    Serial.print(i);
    Serial.print('\n');
  }
}

I’ll reply tonight, too much to type into a phone.

For a table that size, I would just use regular PROGMEM methods; it will fit in the lower 64k of flash memory.

Thanks, yeah, I'll still be using around 230,400 bytes, and I will have them in packets of 60 bytes (a 60-byte array) at a time. They are in 53-byte packets in the example I posted today.

I just posted a smaller program that would use the same logic as a bigger program, that I thought might have been easier to reply about.

To start with, if you are storing bytes (values limited to 0 - 255), then use a byte array, not an int array.

This is how I would do it with only a single array, which limits the array to a total of 32767 bytes:

#define PROGMEM_FAR __attribute__((section(".fini7")))

const byte byteTable0[2][54] PROGMEM_FAR = {{
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
    31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53
  },
  { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
    31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54
  }
};

uint_farptr_t byteTablePtr; //addresses of arrays in far PROGMEM

void setup() {
  Serial.begin(9600); // open the serial port at 9600 bps:

  //get actual address of arrays in PROGMEM
  byteTablePtr = pgm_get_far_address(byteTable0);

}

void loop() {
  const size_t recordSize = sizeof(byteTable0[0]);
  const size_t numberOfRecords = sizeof(byteTable0)/sizeof(byteTable0[0]);
  byte buffer[recordSize];
  for (size_t i = 0; i < numberOfRecords; i++) {
    //offset of element within array has to be calculated then added to base address of array
    memcpy_PF(buffer, byteTablePtr + i * recordSize, recordSize);
    for (size_t b = 0; b < recordSize; b++){
      if (buffer[b] < 10)
        Serial.print(' ');
      Serial.print(buffer[b]);
      Serial.print(',');
    }
    Serial.print("\n\n");
  }
  delay(10000);
}

You could read the individual bytes from PROGMEM instead of copying to a buffer. Which is easier depends on the intended use of the data.

That much data would require multiple arrays, similar to the first example. 230,400 total bytes in 60-byte segments would divide nicely into eight equal-sized arrays of 28800 bytes each, so it could be split up the same way as my original example.
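
A rough sketch of the index arithmetic for that layout, with invented names (`PacketLocation`, `locatePacket`) and the sizes taken from the figures above:

```cpp
#include <stdint.h>

// Layout from the figures above; the names are illustrative only.
constexpr uint32_t kPacketBytes     = 60;
constexpr uint32_t kPacketsPerArray = 480;  // 480 * 60 = 28800 bytes per array
constexpr uint32_t kPacketArrays    = 8;    // 8 * 28800 = 230400 bytes in total

static_assert(kPacketBytes * kPacketsPerArray == 28800, "bytes per array");
static_assert(kPacketBytes * kPacketsPerArray * kPacketArrays == 230400, "total bytes");

struct PacketLocation {
  uint8_t  arrayIndex;  // which of the eight arrays holds the packet
  uint16_t byteOffset;  // offset of the packet from the start of that array
};

// Maps a packet number (0 to 3839) to its array and byte offset.
PacketLocation locatePacket(uint16_t packetNumber) {
  PacketLocation loc;
  loc.arrayIndex = (uint8_t)(packetNumber / kPacketsPerArray);
  loc.byteOffset = (uint16_t)((packetNumber % kPacketsPerArray) * kPacketBytes);
  return loc;
}
```

On the Mega, the address to pass to memcpy_PF() would then be the far base address of array `loc.arrayIndex` (obtained with pgm_get_far_address() in setup(), as in the first example) plus `loc.byteOffset`.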
