In my experience the most important resource is SRAM: you are much more likely to run out of SRAM than of Flash memory.
I selected the Due (I have 3 of them) because of the large memory footprint. However, I've read that the binaries produced by the sketches are some 8 to 10x larger.
This is wrong: if you compile Blink, the binary is about 10x the size, but only because it contains Blink plus the ARM initialization code that is common to all sketches. That overhead is a fixed cost and does not grow with your own code.
For a fairer benchmark, start with a plain Blink and note the binary size, then add some code, note the size again, and compute the delta. Do this for each board and compare the deltas rather than the total sizes.
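As a rough sketch of that delta arithmetic, something like the following could be used on a PC once you have written down the sizes reported by the IDE. The function names `flashDelta` and `deltaRatio` are hypothetical, and any numbers you feed in are your own measurements; none are given here as real figures.

```cpp
#include <cstdint>

// Flash cost of the added code on one board: size of the sketch with the
// extra code minus the size of the plain Blink baseline for the same board.
constexpr std::int32_t flashDelta(std::int32_t baselineBytes,
                                  std::int32_t withCodeBytes) {
    return withCodeBytes - baselineBytes;
}

// How much more Flash the same added code costs on board A than on board B.
// Both arguments are deltas from flashDelta(), not total binary sizes.
constexpr double deltaRatio(std::int32_t deltaA, std::int32_t deltaB) {
    return static_cast<double>(deltaA) / static_cast<double>(deltaB);
}
```

The point of comparing deltas is that the fixed startup code cancels out, so the ratio reflects only the code you added.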
But even this is not a fair benchmark, because code optimized for AVR can compile worse on ARM and vice versa. For example, ARM can do a 32-bit operation in one instruction while AVR needs 4, but moving a single byte probably takes one instruction on AVR and may take more than one on ARM because of memory alignment and similar effects.
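To see this effect yourself, you could add small functions like the ones below to a sketch and compare the size deltas on each board. This is only an illustrative sketch: the names `add32`, `add8` and `load32` are made up for this example, and the exact instructions emitted depend on the compiler version and optimization settings.

```cpp
#include <cstdint>
#include <cstring>

// 32-bit addition: typically a single instruction on a 32-bit ARM core,
// while an 8-bit AVR has to add the four bytes one at a time with carry.
std::uint32_t add32(std::uint32_t a, std::uint32_t b) {
    return a + b;
}

// 8-bit addition: natural on AVR; a 32-bit core computes in a full
// register and may need extra work to truncate the result to 8 bits.
std::uint8_t add8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

// Read a 32-bit value from a buffer that may not be 4-byte aligned.
// memcpy is the portable way to do it; the compiler decides whether this
// becomes one load or several byte loads on the target.
std::uint32_t load32(const std::uint8_t* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```

Code that leans heavily on the first kind of function will favor ARM, and code that mostly shuffles single bytes will favor AVR, which is why one number cannot describe the size difference.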
It's not a simple question.