Just a newbie asking the 64k question again - Arduino Mega2560

Hi all and thanks in advance,

For the past week I have been scouring the web and this site with great interest in trying to answer the perennial question - can you store an array in PROGMEM that is larger than 64k bytes on the Mega2560?

My application is that I want to store a dictionary of 10,000 words in PROGMEM statically through compilation in the usual way prescribed here and in other places:

#include <avr/pgmspace.h>
// Each word is declared as an array so the characters themselves live in flash;
// "const char* x PROGMEM" would only place the pointer there.
const char a_dict[] PROGMEM = "a";
const char aa_dict[] PROGMEM = "aa";
const char aaa_dict[] PROGMEM = "aaa";
const char aaron_dict[] PROGMEM = "aaron";
const char ab_dict[] PROGMEM = "ab";
const char abandoned_dict[] PROGMEM = "abandoned";
const char abc_dict[] PROGMEM = "abc";
//.
//.
//.
// etc: 10,000 lines of words. The total size of all the characters plus null terminators in all the words is 75.5k. ...then...

const char* const dictionary[] PROGMEM = {a_dict, aa_dict, aaa_dict, aaron_dict, ab_dict, abandoned_dict, abc_dict // 10000 char* vars worth
};
// then some way to access the dictionary, yet to be decided
void setup(){}

void loop(){
char x[32]; // a placeholder buffer for a recovered string
int index;  // some sort of index
// something like strcpy_PF(x, (char*)pgm_read_word_far(&(dictionary[index])));
// yadda yadda, do something with x
}

The problem I run into is that the thing will not compile. I get an assembler error saying "Error: value of 68731 is too large to fit into 2 bytes at xyz", which of course is true for any number bigger than 2^16 - 1.

So I'm thinking there must be some secret incantation to get it to be understood that I need addressing to somehow be far/long/etc...somehow at least 1 bit more than 16 bits of addressing. It looks like by reading the various .h files and .cpp files in the avr libs that there is a register called RAMPZ that is used in the extension of the addressing to allow addressing more than 64K of flash. But how to get that to happen?

In all my reading I find a lot of info on how to access data placed above the 64k boundary, but nobody says how to get the compiler to put it up there in the first place. Is there a parameter? Is there a typedef I'm missing? Or does the compiler simply do it for you when you set things up correctly?

Should I divide up my dictionary of words into 2 pieces, each less than 64k? Then perhaps I could do something like __attribute__((section(".someothersection"))); to get it to go in some other place?

Any guidance would be muchly appreciated.

Quite humbly,
Joe

Do what the phone companies used to do. Divide your dictionary into two parts, A-L and M-Z.

I don't think there is any "secret incantation" that changes pointer addressing from 2 to 4 bytes for the Arduino.

I don't believe this 256K FLASH memory can be used to store and retrieve data and is meant to be used for executable code only.

EDIT:

A near-Arduino-compatible board such as the chipKIT MAX 32 may be more useful for your application.

Microchip® PIC32MX795F512 processor
80 MHz 32-bit MIPS
512K Flash, 128K RAM

http://www.digilentinc.com/Products/Detail.cfm?NavPath=2,892,894&Prod=CHIPKIT-MAX32

I haven't used it yet, but plan on it as my app is quite large.
I found people mentioning a certain library, hunt around for:

Carlos Lamas' morepgmspace.h

If you have some luck, report back.

I don't believe this 256K FLASH memory can be used to store and retrieve data and is meant to be used for executable code only.

People can and do use the higher end of flash for PGM data. Keeping the program code in the lower region of flash is far more efficient than having PROGMEM data push it into the higher region, which requires an extra level of indirection.

I learned a long time ago that saying something can't be done brings those with useful information out of the woodwork ...

Since you are only slightly over the limit you might want to try storing the dictionary as a Trie.
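
For anyone unfamiliar with the idea, a trie stores each shared prefix only once, which is why it can shrink a word list. The following is a minimal host-side sketch in C++ (it uses std::vector, so it is not meant to run on the Mega as is). The class name, the 26-slot node layout, and the lowercase-only restriction are my own assumptions for illustration; a flash-resident version would need a much tighter node encoding before it actually saved space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One trie node: 26 child slots (lowercase a-z only) plus an end-of-word flag.
// A child index of 0 means "no child"; that is safe because node 0 is the
// root and can never be anyone's child.
struct TrieNode {
    uint16_t child[26] = {};
    bool is_word = false;
};

class Trie {
public:
    Trie() : nodes_(1) {}  // start with just the root

    void insert(const std::string& word) {
        uint16_t cur = 0;
        for (char ch : word) {
            assert(ch >= 'a' && ch <= 'z');
            const int slot = ch - 'a';
            if (nodes_[cur].child[slot] == 0) {
                assert(nodes_.size() < 0xFFFFu);  // indices must fit in 16 bits
                nodes_[cur].child[slot] = static_cast<uint16_t>(nodes_.size());
                nodes_.emplace_back();
            }
            cur = nodes_[cur].child[slot];
        }
        nodes_[cur].is_word = true;
    }

    bool contains(const std::string& word) const {
        uint16_t cur = 0;
        for (char ch : word) {
            if (ch < 'a' || ch > 'z') return false;
            cur = nodes_[cur].child[ch - 'a'];
            if (cur == 0) return false;
        }
        return nodes_[cur].is_word;
    }

    // Number of nodes, root included.
    std::size_t node_count() const { return nodes_.size(); }

private:
    std::vector<TrieNode> nodes_;
};
```

Inserting the seven sample words from the first post (a, aa, aaa, aaron, ab, abandoned, abc) gives 15 letter nodes plus the root, against 25 letters in the flat list. The saving grows as more words share prefixes.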

Based on my experiments with using large arrays in FLASH, I found I could not get them to work correctly past the 64KB mark, even on a mega1280 chip. I was told there is a bug in the older version of the gcc compiler that the Arduino distribution uses, and that newer versions have fixed it, but I don't know that for a fact; it's just what's been posted around here.

Anyway, here is a simple testing sketch you are free to play with. By changing one constant, arraysize, you can create a sketch of any size you wish. I was able to fill up a 328P and a 644P chip to near capacity, no problem. But I cannot get the sketch to run properly if the compiled sketch crosses the 64KB mark on the 1280 chip, and I don't have a 2560 board to play with. The symptom on the 'too large' sketches is that the sketch compiles without error and appears to upload completely, but the blink portion never executes.

#include <avr/pgmspace.h>   //To store arrays into flash rather than SRAM
// Simple sketch to create large sketch sizes for testing purposes
/*
  Blink
  Turns on an LED on for one second, then off for one second, repeatedly.
 
  This example code is in the public domain.
 */
 
// Pin 13 has an LED connected on most Arduino boards.
// give it a name:
int led = 13;

/* 
 Make arraysize = to 1500 for 328P chip, 4000 for 1280P chip?,
 3600 for 644P chip, xxxx for 1284P,  etc.
*/
const int arraysize= 1500;  // value to mostly fill available flash capacity

const long myInts0[arraysize] PROGMEM = {};  //Store initialized array into flash memory
const long myInts1[arraysize] PROGMEM = {};
const long myInts2[arraysize] PROGMEM = {};
const long myInts3[arraysize] PROGMEM = {};

// the setup routine runs once when you press reset:
void setup() {                
  // initialize the digital pin as an output.
  pinMode(led, OUTPUT); 
  Serial.begin(9600);               // start the serial port before printing
  int i = random(0,arraysize);      // Work around any optimization for constant values
  // Read via pgm_read_dword: a plain myInts0[i] would read SRAM at that address, not flash.
  Serial.print(pgm_read_dword(&myInts0[i]));  //  Access some random element so the array can't be optimized away.
  Serial.print(pgm_read_dword(&myInts1[i]));  //  Access some random element so the array can't be optimized away.
  Serial.print(pgm_read_dword(&myInts2[i]));  //  Access some random element so the array can't be optimized away.
  Serial.print(pgm_read_dword(&myInts3[i]));  //  Access some random element so the array can't be optimized away.
}

// the loop routine runs over and over again forever:
void loop() {
  
  digitalWrite(led, HIGH);   // turn the LED on (HIGH is the voltage level)
  delay(1000);               // wait for a second
  digitalWrite(led, LOW);    // turn the LED off by making the voltage LOW
  delay(1000);               // wait for a second
}

Hi and thank you all for the suggestions.

The suggestion to look into "morepgmspace.h" led to the AVR C runtime lib website and Carlos's code. It looks very instructive, though it is mostly runtime stuff (well, sure, of course). Now, I don't know enough about this platform to understand whether that code suggests you can load the FLASH dynamically; I thought that wasn't possible. Or perhaps the way I should read it is that I need to somehow write my declaration code so it looks like it's being dynamically written, when actually it's being compiled, linked and uploaded like any normal sketch.

There is actually a bug fix update/change to pgmspace that was uploaded as recently as a few days ago. In that version, as opposed to the one in the 1.0.3 distribution, they extern a whole bunch of long address functions. They also deprecate a whole load of types.

Hemmerling's page says that for chips that have over 64k of flash, the normal 16-bit pointer scheme (plus RAMPZ, etc.) allows addressing to 128k in two 64k chunks.

I did indeed try to split my

const char* array[] {};

into two separate chunks, though it still thinks I'm trying to address flash beyond 64k with only the 16-bit pointer, and the assembler balks.

I'm wondering if there isn't a define somewhere that I'm not taking advantage of. For instance - could it be that selecting "Mega2560 or Mega ADK" somehow isn't getting the right defines set somewhere so it isn't taking advantage of RAMPZ, or perhaps I'm just declaring stuff in a way that's braindead.

Most likely, I'm doing something braindead, and I expect there's a simple newbie mistake in my process that once I figure out will have me slapping my forehead into the wall... Discussion of how to address and read flash over 64k is everywhere. But how to write it at compile/link/load time is not.

That makes me think it should just happen easily, and I'm missing something obvious (as usual).

Thanks for the code snippets. I will try all avenues and report back if I have a breakthrough.

Best
Joe

This looks like a winner.

pgm_read_byte is functionally the same as pgm_read_byte_near.

What we need is 'pgm_read_byte_far', which accepts a 32-bit pointer, allowing us to access the second half of flash (64K words = 128K bytes).

The macro below translates a PGM (16-bit) address to a far PGM (24-bit, stored as 32-bit) address.

So basically, use GET_FAR_ADDRESS to 'get' the address, then use pgm_read_byte_far to read the data.

All the normal *_P functions only support near addresses, so you'll have to provide your own custom overloads if needed.

People have had success with this, so hope it helps.

#include <inttypes.h>
#include <avr/io.h>
#include <avr/pgmspace.h>
...
const char MyString[] PROGMEM = "Hello World";  // prog_char is deprecated in newer avr-libc
...

#define GET_FAR_ADDRESS(var)                          \
({                                                    \
    uint_farptr_t tmp;                                \
                                                      \
    __asm__ __volatile__(                             \
                                                      \
            "ldi    %A0, lo8(%1)"           "\n\t"    \
            "ldi    %B0, hi8(%1)"           "\n\t"    \
            "ldi    %C0, hh8(%1)"           "\n\t"    \
            "clr    %D0"                    "\n\t"    \
        :                                             \
            "=d" (tmp)                                \
        :                                             \
            "p"  (&(var))                             \
    );                                                \
    tmp;                                              \
})

void UART0_puts_P(uint_farptr_t str)
//---------------------------------------------------
// send a ProgMem string to UART0 Transmit-buffer
// (UART0_putc is assumed to be defined elsewhere)
//---------------------------------------------------
{
  uint8_t c = pgm_read_byte_far(str);
  while (c)
  {
    UART0_putc(c);
    c = pgm_read_byte_far(++str);
  }
}
...
int main(void)
{
  ...
  UART0_puts_P(GET_FAR_ADDRESS(MyString));
  while(1);
}
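
To make the near/far distinction concrete, here is a small host-side sketch of the address arithmetic only. It shows how a 24-bit flash byte address splits into the page byte that goes in RAMPZ and the 16-bit offset that goes in the Z pointer, which is what the far read relies on. The type and function names are hypothetical, chosen just for this illustration, and nothing here touches real hardware registers.

```cpp
#include <assert.h>
#include <stdint.h>

// A 24-bit flash byte address carried in 32 bits, like uint_farptr_t on AVR.
typedef uint32_t FarAddr;

struct ElpmAddress {
    uint8_t  rampz;  // bits 16..23: which 64 KB page of flash
    uint16_t z;      // bits 0..15: byte offset inside that page
};

// Split a far address into the RAMPZ page and the Z offset.
ElpmAddress split_far(FarAddr addr) {
    assert(addr <= 0xFFFFFFul);  // only 24 bits are meaningful
    ElpmAddress out;
    out.rampz = static_cast<uint8_t>(addr >> 16);
    out.z     = static_cast<uint16_t>(addr & 0xFFFFul);
    return out;
}

// Recombine page and offset into one far address.
FarAddr join_far(ElpmAddress a) {
    return (static_cast<FarAddr>(a.rampz) << 16) | a.z;
}

// True when a plain 16-bit (near) pointer can hold the address.
bool fits_near_pointer(FarAddr addr) {
    return addr <= 0xFFFFul;
}
```

Using the value from the assembler error in the first post, 68731 lands in page 1 at offset 3195, so it cannot be expressed as a near pointer.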

So how can this 'solution' be implemented in context of writing and uploading sketches in the Arduino IDE?

Lefty

Thanks for the code, Pyro.
That seems to definitely work to access the data in Flash.

The issue I seem to be facing from the outset happens where the declaration is:

prog_char abc[] = "hello world";

In my code, I am trying to create 10,000 prog_char var[]s. The size of all the strings in 10,000 declarations combined is 75k. There would have to be at least 17 bits of addressing to accommodate that. Fortunately, it appears from literally all of the documentation that it is possible to do this on the ATmega2560 through the use of the RAMPZ register.

However, when I try to compile/link the code that contains the 10,000 lines of declarations, each of which is unique and looks like - prog_char foo[] = "bar"; -
the compiler/linker stage balks in the Arduino IDE. The error I get is that those declarations are too large to fit in 2 bytes of addressing.

Well, sure. It's just math and we all know you can't address 75K of data with only 16 bits. But there should be flags set in the appropriate places (like --relax in the linker) and defines for the compiler that make sure the addressing is set up correctly.

The issue I'm having is getting the data into the high memory in the first place. I'm absolutely certain that once it is there, we can use various functionality suggested kindly here to read it.

Thanks so much for putting up with my blather.

Joe

iceowl:
In all my reading I find a lot of info on how to access data placed above the 64k boundary - but nobody says how to get the compiler to put it up there in the first place. Is there a parameter? Is there a typedef I'm missing? Or - does the compiler just simply do it for you when you set things up correctly.

I've run into this problem and I'm not SURE, but it seems like it happens whenever a PROGMEM access has to CROSS a 64K boundary. I wasn't able to find out for sure because to test it I would need "offending" code, and "offending" code won't compile... :)

What I've done was to, by trial and error, find where my stuff was crossing a 64K boundary (I think), then use a dummy variable to use up the rest of the first 64K so that real variables were in the next 64K, and so on...

Sadly, I'm not even sure if I'm on the right track... all I know is I got it to work this way.
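
The trial and error can be reduced a little with some arithmetic. This is a host-side sketch, under the assumption that you already know where an object starts in flash (for example from the linker map file); the function names and the addresses in the usage note are hypothetical. It computes how big the dummy variable has to be to reach the next 64K boundary, and whether an object of a given size would straddle one.

```cpp
#include <assert.h>
#include <stdint.h>

const uint32_t kPageBytes = 0x10000ul;  // one 64 KB page of flash

// Bytes of padding needed so the next object starts on a 64 KB boundary.
// Returns 0 when addr is already aligned.
uint32_t pad_to_next_page(uint32_t addr) {
    return (kPageBytes - (addr % kPageBytes)) % kPageBytes;
}

// True when an object of `size` bytes starting at `start` spans two pages.
bool crosses_page(uint32_t start, uint32_t size) {
    assert(size > 0);
    return (start / kPageBytes) != ((start + size - 1) / kPageBytes);
}
```

For example, an object that would start at 0xFF00 needs 0x100 bytes of padding in front of it to begin at 0x10000 instead.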

Though slightly inconvenient, that sounds vastly rational. In any case, the dictionary I am trying to store is a static resource, so getting it set up correctly a priori is no problem. Once it is set up, I won't change it. Hopefully in the future this will only get easier.
I will try it.
Joe

OK, an experiment with Lefty's code. I generated some exhaustive initialization just to see what would happen.

#include <avr/pgmspace.h>   //To store arrays into flash rather than SRAM
// Simple sketch to create large sketch sizes for testing purposes
/*
  Blink
  Turns on an LED on for one second, then off for one second, repeatedly.
 
  This example code is in the public domain.
 */
 
// Pin 13 has an LED connected on most Arduino boards.
// give it a name:
int led = 13;

/* 
 Make arraysize = to 1500 for 328P chip, 4000 for 1280P chip?,
 3600 for 644P chip, xxxx for 1284P,  etc.
*/
const int arraysize= 3000;  // value to mostly fill available flash capacity

long myInts0[arraysize] PROGMEM = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,... // up to 2999
long myInts1[arraysize] PROGMEM =  {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,
30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,... // up to 2999
//...
//...up to
long myInts9[arraysize]PROGMEM={//etc

void setup() {                
  // initialize the digital pin as an output.
  pinMode(led, OUTPUT); 
  Serial.begin(9600);               // start the serial port before printing
  int i = random(0,arraysize);      // Work around any optimization for constant values
  // Read via pgm_read_dword: a plain myInts0[i] would read SRAM at that address, not flash.
  Serial.print(pgm_read_dword(&myInts0[i]));  //  Access some random element so the array can't be optimized away.
  Serial.print(pgm_read_dword(&myInts1[i]));  //  Access some random element so the array can't be optimized away.
  Serial.print(pgm_read_dword(&myInts2[i]));  //  Access some random element so the array can't be optimized away.
  Serial.print(pgm_read_dword(&myInts3[i]));  //  Access some random element so the array can't be optimized away.
}

// the loop routine runs over and over again forever:
void loop() {
  
  digitalWrite(led, HIGH);   // turn the LED on (HIGH is the voltage level)
  delay(1000);               // wait for a second
  digitalWrite(led, LOW);    // turn the LED off by making the voltage LOW
  delay(1000);               // wait for a second
  
}

What I've noticed is that if the sketch size goes over 128K I start getting errors of the form:

warning: internal error: out of range error...

That is, if I add an additional array of [arraysize] longs where arraysize = 3000, bringing the number up to 11 arrays, each of 3000 long elements (4*11*3000 = 132K bytes, plus the rest of the sketch), I start getting that out of range error reported in various libs. For instance, the first place it shows up is as an out of range error on the do_random function in random.o. If I comment out the call to random, it shows up in HardwareSerial.

If I stick to 10 initialized arrays, the sketch size is 124,996 (= 4*10*3000 + rest of sketch). No problems compiling/linking.

Now, interestingly, suppose instead of long int I initialize each of the myInts arrays to 3000 four-byte null-terminated strings, thus:

char* myInts9[arraysize] PROGMEM ={"abc\0","abc\0","abc\0",//..etc for 3000 initializers

For 10 initialized arrays of 3000 four-byte-strings the IDE reports a sketch size of 64,646 bytes out of a 258,048 byte maximum. The same sketch initialized to longs is reported as 124,996.
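
A quick way to sanity-check those two figures is to work out the array payload by element size. On avr-gcc a long is 4 bytes and a data pointer is 2 bytes, so an array of char* declared PROGMEM holds only 2-byte pointers; where the string literals themselves end up is a separate matter that this sketch does not try to model. The helper names below are hypothetical and this is just host-side arithmetic.

```cpp
#include <assert.h>
#include <stdint.h>

// Sizes as avr-gcc defines them on the ATmega family.
const uint32_t kAvrLongBytes    = 4;  // sizeof(long) on AVR
const uint32_t kAvrPointerBytes = 2;  // sizeof(char*) on AVR

// Flash taken by the array elements alone.
uint32_t array_payload_bytes(uint32_t arrays, uint32_t elems, uint32_t elem_bytes) {
    return arrays * elems * elem_bytes;
}

// Whatever the IDE reports beyond the arrays: core, sketch code, other data.
uint32_t remainder_bytes(uint32_t reported_sketch_bytes, uint32_t payload_bytes) {
    assert(reported_sketch_bytes >= payload_bytes);
    return reported_sketch_bytes - payload_bytes;
}
```

With 10 arrays of 3000 elements, the longs account for 120,000 of the reported 124,996 bytes, while 2-byte pointers account for 60,000 of the reported 64,646 bytes. In both cases a few thousand bytes remain for everything else, which is consistent with the difference between the two sketches being the element size.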

For 20 initialized arrays of 3000 four-byte strings the IDE reports

arduino-1.0.3\hardware\arduino\cores\arduino/main.cpp:11: warning: internal error: out of range error

Binary sketch size: 136,682 bytes (of a 258,048 byte maximum)

If I take out 1 array I get no errors and
Binary sketch size: 130,646 bytes (of a 258,048 byte maximum)

I have not yet tried to load and run these sketches.

Cheers,
Joe

It doesn't totally surprise me you are having these problems. My experience has been that the Mega side of the platform, being used less, hasn't received as much attention (e.g. bootloaders that don't handle the watchdog timer).

Then people compiling large arrays (in itself perfectly sensible) are probably in a minority too.

It would be interesting if, when you solve this, we document it so others can benefit from it.

Meanwhile you could always consider my suggestion of trying to store your dictionary more compactly and make the problem go away. :)

Thanks Nick.
Yes, absolutely, a more intelligent data structure would be more compact and faster too.
Of course, even then there is the bald-faced challenge of trying to use up the whole 256k of the Mega....and once solved I'll also use tries and get even more packed in there!!!
Cheers
Joe

So how can this 'solution' be implemented in context of writing and uploading sketches in the Arduino IDE?

Lefty

Copy 'n' paste.

You cannot address huge variables, even within the lower 64k.
Here is a program that uses every single byte on the Mega with PROGMEM. It is possible, and GET_FAR_ADDRESS will help you read the data.

#define nothing

template< uint64_t C, typename T >
  struct LargeStruct{
    T Data;
    LargeStruct< C - 1, T > Next;
};
template< typename T > struct LargeStruct< 0, T >{ };

typedef LargeStruct< 80, uint64_t > Container; //640 bytes

PROGMEM LargeStruct< 50, Container > l_Struct;  //32k
PROGMEM LargeStruct< 50, Container > l_Struct1;   //32k
PROGMEM LargeStruct< 50, Container > l_Struct2;   //32k
PROGMEM LargeStruct< 50, Container > l_Struct3;   //32k
PROGMEM LargeStruct< 50, Container > l_Struct4;   //32k
PROGMEM LargeStruct< 50, Container > l_Struct5;  //32k
PROGMEM LargeStruct< 50, Container > l_Struct6;   //32k
PROGMEM LargeStruct< 50, Container > l_Struct7;   //32k

PROGMEM LargeStruct< 431, uint16_t > l_Struct8; //862 bytes
void setup()
  {
    volatile int i = ( int ) &l_Struct;
    volatile int i1 = ( int ) &l_Struct1;
    volatile int i2 = ( int ) &l_Struct2;
    volatile int i3 = ( int ) &l_Struct3;
    volatile int i4 = ( int ) &l_Struct4;
    volatile int i5 = ( int ) &l_Struct5;
    volatile int i6 = ( int ) &l_Struct6;
    volatile int i7 = ( int ) &l_Struct7;    
    volatile int i8 = ( int ) &l_Struct8;  
  }

void loop(){}

So I added a blink function to your example sketch after removing some of the 'structure stuff' so as to fit in a 1280 chip and uploaded to a mega.

No compile errors, compile size 98,726 of 130,048 maximum, and the upload proceeds with no errors, though it certainly takes a while. When done, there is no blinking LED on pin 13. Why does the sketch not run? That is the same symptom I was seeing in my code example posted earlier: I could create sketches of any desired size, but after a certain size the blink in loop() doesn't execute.

#define nothing

template< uint64_t C, typename T >
  struct LargeStruct{
    T Data;
    LargeStruct< C - 1, T > Next;
};
template< typename T > struct LargeStruct< 0, T >{ };

typedef LargeStruct< 80, uint64_t > Container; //640 bytes

PROGMEM LargeStruct< 50, Container > l_Struct;  //32k
PROGMEM LargeStruct< 50, Container > l_Struct1;   //32k
PROGMEM LargeStruct< 50, Container > l_Struct2;   //32k
/*PROGMEM LargeStruct< 50, Container > l_Struct3;   //32k
PROGMEM LargeStruct< 50, Container > l_Struct4;   //32k
PROGMEM LargeStruct< 50, Container > l_Struct5;  //32k
PROGMEM LargeStruct< 50, Container > l_Struct6;   //32k
PROGMEM LargeStruct< 50, Container > l_Struct7;   //32k
*/
PROGMEM LargeStruct< 431, uint16_t > l_Struct8; //862 bytes
int led = 13;
void setup()
  {
    pinMode(led, OUTPUT);
    volatile int i = ( int ) &l_Struct;
    volatile int i1 = ( int ) &l_Struct1;
    volatile int i2 = ( int ) &l_Struct2;
 /*   volatile int i3 = ( int ) &l_Struct3;
    volatile int i4 = ( int ) &l_Struct4;
    volatile int i5 = ( int ) &l_Struct5;
    volatile int i6 = ( int ) &l_Struct6;
    volatile int i7 = ( int ) &l_Struct7;    
    volatile int i8 = ( int ) &l_Struct8;
  */  
  }

void loop(){

 digitalWrite(led, HIGH);   // turn the LED on (HIGH is the voltage level)
  delay(1000);               // wait for a second
  digitalWrite(led, LOW);    // turn the LED off by making the voltage LOW
  delay(1000);               // wait for a second
}

Lefty

Good find, I didn't try anything like that. I previously wasn't able to answer the questions as that sketch is a bit obscure, but as I need pgm functionality, I've done a little investigating to work out what is happening.

Firstly, the structures placed into progmem are laid out first, before the functions. As a consequence, the functions' code (main, loop, pinMode, etc.) is placed in a region not accessible by conventional 16-bit pointers. Therefore, to call this code you have to jump to the trampoline, which in turn contains a jump to your code.

From what I was able to digest, functions that sit beyond the 64K-word boundary automatically get an entry in a thing called a 'trampoline'. It is a table of jumps to locations in the higher memory, allowing the entire memory range to be used.

Also, as I understand it, if main is in high memory, it will still be called via the trampoline. The problem with this sketch is not that the 'far' code isn't being called.
It's just that the Arduino libraries do not expect their PGM data to be out of range.

There are a number of ways to get things working. For instance, you could update the core to use far pointers where necessary. The easiest quick fix is to place the structure data after the functions, so the core has full use of the lower address range. There is an overhead to high-range access, though, so critical stuff should remain in the lower section.

This macro will place data after other sections. Found here

#define PROGMEM_FAR  __attribute__((section(".fini7")))

#define nothing

template< uint64_t C, typename T >
  struct LargeStruct{
    T Data;
    LargeStruct< C - 1, T > Next;
};
template< typename T > struct LargeStruct< 0, T >{ };

typedef LargeStruct< 80, uint64_t > Container; //640 bytes

#define PROGMEM_FAR  __attribute__((section(".fini7")))

PROGMEM_FAR LargeStruct< 50, Container > l_Struct;  //32k
PROGMEM_FAR LargeStruct< 50, Container > l_Struct1;   //32k
PROGMEM_FAR LargeStruct< 50, Container > l_Struct2;   //32k
PROGMEM_FAR LargeStruct< 431, uint16_t > l_Struct8; //862 bytes

int led = 13;

void setup()
  {
    pinMode(led, OUTPUT);
    volatile int i = ( int ) &l_Struct;
    volatile int i1 = ( int ) &l_Struct1;
    volatile int i2 = ( int ) &l_Struct2;
  }

void loop(){

 digitalWrite(led, HIGH);   // turn the LED on (HIGH is the voltage level)
  delay(1000);               // wait for a second
  digitalWrite(led, LOW);    // turn the LED off by making the voltage LOW
  delay(1000);               // wait for a second
}
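
The sketch above only takes the addresses of the far data. To actually read it back, something along these lines might work. It is a sketch under assumptions, not tested on a Mega: on AVR it relies on pgm_read_byte_far from avr/pgmspace.h and on a pgm_get_far_address macro, which I believe only newer avr-libc releases provide (if yours does not, the GET_FAR_ADDRESS macro from earlier in this thread fills the same role). The FAR_ADDR and READ_FAR_BYTE wrappers and the copy_far_string helper are hypothetical names; the non-AVR branch exists only so the logic can be compiled and tried on a PC.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __AVR__
  #include <avr/pgmspace.h>
  typedef uint_farptr_t far_addr_t;
  // Assumption: this avr-libc is new enough to provide pgm_get_far_address.
  #define FAR_ADDR(var)     pgm_get_far_address(var)
  #define READ_FAR_BYTE(a)  pgm_read_byte_far(a)
#else
  // Host fallback: "far flash" is just ordinary memory.
  typedef uintptr_t far_addr_t;
  #define FAR_ADDR(var)     (reinterpret_cast<far_addr_t>(&(var)))
  #define READ_FAR_BYTE(a)  (*reinterpret_cast<const uint8_t*>(a))
#endif

// Copy a NUL-terminated string from a far address into a RAM buffer of
// `cap` bytes. The result is always terminated; longer strings are
// truncated. Returns the number of characters copied, excluding the NUL.
size_t copy_far_string(char* dst, far_addr_t src, size_t cap) {
    assert(dst != 0 && cap > 0);
    size_t n = 0;
    while (n + 1 < cap) {
        const char c = static_cast<char>(READ_FAR_BYTE(src + n));
        if (c == '\0') break;
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```

Usage would be along the lines of copy_far_string(buf, FAR_ADDR(someFarString), sizeof buf), where someFarString is a char array declared with PROGMEM_FAR.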

My problem appears to be this problem:

..it looks like the version of g++ used by Arduino will fail whenever the global constructors get pushed beyond the 64k limit, because the global constructor table is only 16bits wide and the code uses ijmp to access it...

Found here: http://code.google.com/p/arduino/issues/detail?id=1067

I'm compiling avr-gcc-4.7.2 right now just for grins, and I will try building outside the Arduino IDE environment to see if I can get it to load.

Ah, I had long forgotten the joys of code.