bug in optiboot bootloader

bperrybap: [it's not an avrdude bug] because even when a full chip erase is not done, you still have to erase a page before you write to the page, and when you erase the page it erases that page to all 0xff.

I'm not convinced; the self-programming documentation says:

If only a part of the page needs to be changed, the rest of the page must be stored (for example in the temporary page buffer) before the erase, and then be rewritten.

There is no exception called out for the previous contents being 0xFF...

They are saying that because the entire page is set to 0xFF. Therefore if you want to change part of a page, you save all of it, erase it back to 0xFF, and then copy the wanted stuff back (your new stuff, plus any old stuff from the temporary buffer).

the datasheet for the atmega328 does suggest not writing blocks which are FF

Considerations for Efficient Programming The loaded command and address are retained in the device during programming. For efficient programming, the following should be considered. • The command needs only be loaded once when writing or reading multiple memory locations. • Skip writing the data value 0xFF, that is the contents of the entire EEPROM (unless the EESAVE Fuse is programmed) and Flash after a Chip Erase.

granted, it doesnt say for just page erases, but FLASH only works by being set to FF when erased. this is why it needs to be erased, and not just written. writing can only chang a 1 to a 0.

I think I prefer a patch along the lines of:

      for (bufPtr = buff; bufPtr < buff+SPM_PAGESIZE; bufPtr++)
	  *bufPtr = 0xFF;

      // While that is going on, read in page contents
      bufPtr = buff;

This makes the initialization extremely obvious, doesn’t rely on previous contents, and seems to end up somewhat shorter than a smarter loop. The code can probably spare the 64 us…

It seems to be a bit academic. avrdude DOES seem to write pages with trailing FF’s; It’s only the pages completely filled with FF that are skipped. Here’s a test sketch:

#include <pgmspace.h>

typedef struct {
  unsigned char data[SPM_PAGESIZE];
} 
flashpage_t  __attribute__ ((aligned(128)));

extern const PROGMEM flashpage_t allff, leadff, trailff;

void setup() {
  // put your setup code here, to run once:
  digitalWrite(1, pgm_read_byte(&allff.data[100]));
  digitalWrite(1, pgm_read_byte(&leadff.data[100]));
  digitalWrite(1, pgm_read_byte(&trailff.data[100]));
}

void loop() {
  // put your main code here, to run repeatedly: 
}
#include <arduino.h>
#include <pgmspace.h>

typedef struct {unsigned char data[SPM_PAGESIZE];} flashpage_t  __attribute__ ((aligned(128)));
extern const PROGMEM flashpage_t allff, leadff, trailff;

#define FF8 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF
#define FF16 FF8, FF8
#define FF64 FF16, FF16, FF16, FF16

const PROGMEM flashpage_t allff = {{ FF64, FF64 }};

const PROGMEM flashpage_t leadff = {{ FF64, FF16, FF16,
 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31}};

const PROGMEM flashpage_t trailff = {{ 
  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
 FF16, FF16, FF64 }};

(include both files as tabs in your sketch.)

And here’s the avrdude verbose output (from 1.0.2. Has it changed?)

######avrdude: Send: U [55] . [c0] . [00]   [20] 
avrdude: Recv: . [14] 
avrdude: Recv: . [10] 
avrdude: Send: d [64] . [00] . [80] F [46] . [00] . [01] . [02] . [03] . [04] . [05] . [06] . [07] . [08] . [09] . [0a] . [0b] . [0c] . [0d] . [0e] . [0f] . [10] . [11] . [12] . [13] . [14] . [15] . [16] . [17] . [18] . [19] . [1a] . [1b] . [1c] . [1d] . [1e] . [1f] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff] . [ff]   [20] 
avrdude: Recv: . [14]

Did you really compile those files with the IDE?? :wink:
Attached is the full output using avrdude 5.11SVN which has the new code to track
and skip burning unused pages in the HEX image.

Not sure what your are looking for.

I do remember lots of discussion about this type of speed up a few years ago over on avrfreaks.
There was a desire to speed it up, I seem to recall that there were some issues and the code
to look at empty pages went in and out as there were some issues (more than once).
I’m not sure how it all ended up.
I do know that there is now code that tracks all the empty pages.
It is supposed to be smart enough to track empty pages in HEX image vs pages with 0xff in them.
If it is skipping over valid pages in the HEX image with full 0xff in them when -D is used,
then my guess is that most people don’t use -D since it is slower than page by page erase
and it got overlooked and is a bug.

— bill

avrdude.out (44.8 KB)

Did you really compile those files with the IDE??

Yep. It did seem to be the easiest way to make sure that the parameters and environment all matched the actual bug report. They were short; not to painful to generate! The hard part was figuring out the syntax of that attribute

It looks like the latest avrdude code does everything right, including the page containing nothing but 0xFF's; At least the one in the middle.

The recent AVRDUDE speedup comes from only VERIFYING the code that was in the original .hex file. Older code would notice that when you burnt a bootloader (for instance), the top of your programmed are was 7FFF, and would then verify all the pages between 0 and 7FFF. The new code is supposed to notice that you only programmed from 7E00 to 7F00 (because those were the only pages in the .hex file), and only verifies that (small) amount of memory. I think it will only program pages that are actually PRESENT in the .hex file, which could be a big advantage for "holey" hex files. (The old code assumed that a page that contained only FFFF was empty and didn't upload it, which was the source of this "bootloader misbehavior.")

I should flush out the test sketch so that it actually tells you something useful when you run it, rather than relying on inspection of the upload log...

I asked about the compile because it didn't compile for me. I had to modify the code not to declare the typedef twice.

BTW, the new avrdude only burns and verifies the actual pages with real data in them. The burn/verify time for optiboot using USBASP is less than 1 second now. I've given out dos versions to a few folks that are still using windows and were burning lots of bootloaders.

--- bill

using progmem, and storing blocks of data in FLASH is different, and does not act the same as storing variables. progmem puts the data block at the beginning of the code section, so it is not a set of trailing FF's, at least as far as the entire .hex file is concerned. so entire pages, and the ends of pages with FF, will be written if they are not the last section of the .hex file. variables are written to the end of the .hex file, and full pages and the ends of pages will FF will not be written.

If it is ONLY the block at the very end of the sketch, that's a much less serious problem. IMO.

Huh. How do you even GET trailing 0xFFs in the .data segment? I'm always getting .data from the Arduino libraries beyond everything I put in the user area of the sketch.

Libraries?

008001be d _ZL8mystdout
00800200 D trailff_RAM               ;; structure with trailing FFs
00800280 V _ZTV14HardwareSerial      ;; structure from libraries.
00800290 B __bss_start
00800290 D __data_end

the new avrdude only burns and verifies the actual pages with real data in them. The burn/verify time for optiboot using USBASP is less than 1 second now.

Where we can get that latest avrdude?

Here for example (6.0rc2):

http://www.mikrocontroller.net/topic/296379#3166155

pito: Where we can get that latest avrdude?

You will have to build it yourself. From the repository: http://savannah.nongnu.org/svn/?group=avrdude There are also some tar'ed up source versions available: http://download.savannah.gnu.org/releases/avrdude/

Just get and build whatever version you want for your OS platform. note: there are quite a few tools that must be in place to be able to do native s/w development and for those still using Windows, it isn't quite as easy to get them in place as those running linux OS's.

--- bill

westfw: If it is ONLY the block at the very end of the sketch, that's a much less serious problem. IMO.

this definitely means it will effect fewer people, but also means it will be far more confusing when it does happen. the partial page trailing FF's seems like an easy fix anyways. not sure anything can be done on the optiboot side for full pages at the end of a .hex file.

g_u_e_s_t:

westfw: If it is ONLY the block at the very end of the sketch, that's a much less serious problem. IMO.

this definitely means it will effect fewer people, but also means it will be far more confusing when it does happen. the partial page trailing FF's seems like an easy fix anyways. not sure anything can be done on the optiboot side for full pages at the end of a .hex file.

Couldn't you put a dummy section at the end of the linked image to ensure that this never happened? Yeah, it would require a tweak to the Arduino team controlled code but it's a one time thing.

bperrybap: Couldn't you put a dummy section at the end of the linked image to ensure that this never happened? Yeah, it would require a tweak to the Arduino team controlled code but it's a one time thing.

that would probably work. i am essentially doing that manually with my code by adding a single, non-FF variable after all the FF ones.

westfw:
Huh. How do you even GET trailing 0xFFs in the .data segment?
I’m always getting .data from the Arduino libraries beyond everything I put in the user area of the sketch.

Libraries?

008001be d _ZL8mystdout

00800200 D trailff_RAM              ;; structure with trailing FFs
00800280 V _ZTV14HardwareSerial      ;; structure from libraries.
00800290 B __bss_start
00800290 D __data_end

im not sure i understand the question, but if i do this:

int buff[8] = {0xffff, 0xffff, 0xffff, 0xffff, 0xffff, 0xffff, 0xffff, 0xffff};

void setup() {}

void loop() {
  for (int i = 0; i < 8; i++) {
    PORTD = buff[i];
  }
}

i get this:

sketch_aug03a.cpp.elf:     file format elf32-avr

Contents of section .data:
 800100 ffffffff ffffffff ffffffff ffffffff

which is put at the end of the .hex file

// top snipped off
:1001D000816080838081806880831092C10008954F
:0401E000F894FFCFC1
:1001E400FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF1B
:00000001FF

bperrybap: Couldn't you put a dummy section at the end of the linked image to ensure that this never happened? Yeah, it would require a tweak to the Arduino team controlled code but it's a one time thing.

another nice thing about this method, is that people could just update their arduino ide, rather than burning a new bootloader.

I guess that one of the reasons that this doesn’t seem to cause many problems is that if you use any arduino library that has .data, it will show up AFTER the user variables and prevent the problem…

(Since avrdude does blocks with trailing FF bytes in the middle of the sketch, I’m not sure why it has decided that it doesn’t need to do things that way at the end of the sketch…)

Here’s the sketch…

test_ff_pages.zip (1.75 KB)

For what it's worth, this appears to be the most relevant bug report (Status: Invalid)... http://savannah.nongnu.org/bugs/?func=detailitem&item_id=30061

@g_u_e_s_t (and anyone else who wants the problem fixed at the root), please add your 2¢ to that report.