pert:
I very much disagree. If we could get it down to 2 kB, we could add support for more microcontrollers. It would also free up memory for the user application on the already supported parts. 2 kB extra flash on a chip with only 32 kB total is a big deal! Even if we can't get it small enough to fit in the next smaller boot section, getting the code smaller leaves room to add more features to the bootloader. In the case of avr_boot, that could be serial support.
Of course I'm not saying I expect you to do that, but only that I do think it would be worthwhile for someone to do it.
I keep coming back to this, but then end up with nowhere to go. I spent a good bit of time looking at avr_boot, but since I never became proficient in C, much of it makes no sense to me. But I wonder why the code is so much bigger than my MSP430 version. It could be because it's AVR8, and that just takes more code. Or because it's mostly C instead of assembler, although I'm told C is pretty efficient. Or because of the need to support so many processors and Arduinos, although it seems that would all be taken care of in the build process. Or, my favorite, because avr_boot still makes considerable use of libraries, like for FAT, and inevitably the functions that end up being included do more than you really need, or do it less efficiently.
If getting the size under 2K is important, to me that just cries out for doing it in assembler, using no libraries, but I take your point that it then becomes an opaque blob of code for most people, so not much use to anyone else. However, I don't know how many people would be successful wading their way through Optiboot, even in C.
Attached is a schematic showing modifications to the Arduino, and to the widely available level-shifting microSD module, which would let the bootloader detect the presence of an SD card without sending commands and waiting for a response, and without using another pin. It basically works like my TI version, with a little jumping through hoops to avoid sourcing 5V to the SD card. I guess you wouldn't approve of requiring such modifications, and I would probably agree.
Anyway, I guess the problem for me is that I don't really know C++ or AVR8 assembler, so I don't know what I would do to go forward with this. All I know is the steps needed to deal with SD and SDHC, FAT16 and FAT32, which I think is what should be supported; not MMC or FAT12 anymore, and not SDXC. I should also say that my version of this has no capability to navigate cluster chains, which makes the code a lot shorter, but does introduce certain risks, particularly concerning the root directory in FAT32. And my TI version also requires the HEX file to be pre-converted to binary so the bootloader doesn't have to parse a hex file. So it may be that my version is smaller because it does a lot less.
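To give an idea of what pre-converting to binary saves, here is a rough C sketch of the per-record work a bootloader would need in order to read an Intel HEX file directly. It is not taken from avr_boot or from my TI code; the names `hex_record`, `hex_nibble`, `hex_byte` and `hex_parse_record` are made up for illustration. It assumes each line has already been read into a NUL-terminated string, and it only decodes and checksums one record; handling of extended address records and the actual flash writes are left out.

```c
#include <stdint.h>
#include <stddef.h>

/* One decoded Intel HEX record (hypothetical layout for this sketch). */
typedef struct {
    uint8_t  len;        /* number of data bytes in the record */
    uint16_t addr;       /* 16-bit load offset */
    uint8_t  type;       /* 0x00 = data, 0x01 = end of file */
    uint8_t  data[255];  /* payload, at most 255 bytes */
} hex_record;

/* Convert one ASCII hex digit to 0..15, or return -1 if it is not one. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode two hex digits at s into *out. Returns 0 on success, -1 on error.
   Stops at the first bad character, so it never reads past a NUL. */
static int hex_byte(const char *s, uint8_t *out)
{
    int hi = hex_nibble(s[0]);
    if (hi < 0) return -1;
    int lo = hex_nibble(s[1]);
    if (lo < 0) return -1;
    *out = (uint8_t)((hi << 4) | lo);
    return 0;
}

/* Parse a line of the form ":LLAAAATT<data>CC".
   Returns 0 if the record is well formed and the checksum matches. */
int hex_parse_record(const char *line, hex_record *rec)
{
    uint8_t hdr[4];
    uint8_t b;
    uint8_t sum = 0;     /* running sum of all bytes, modulo 256 */
    const char *p;
    int i;

    if (line[0] != ':') return -1;
    p = line + 1;

    /* Header: length, address high, address low, record type. */
    for (i = 0; i < 4; i++, p += 2) {
        if (hex_byte(p, &hdr[i]) != 0) return -1;
        sum = (uint8_t)(sum + hdr[i]);
    }
    rec->len  = hdr[0];
    rec->addr = (uint16_t)((hdr[1] << 8) | hdr[2]);
    rec->type = hdr[3];

    /* Payload bytes. */
    for (i = 0; i < rec->len; i++, p += 2) {
        if (hex_byte(p, &rec->data[i]) != 0) return -1;
        sum = (uint8_t)(sum + rec->data[i]);
    }

    /* The checksum byte makes the sum of all bytes zero modulo 256. */
    if (hex_byte(p, &b) != 0) return -1;
    sum = (uint8_t)(sum + b);
    return (sum == 0) ? 0 : -1;
}
```

With a binary image none of this is needed: each 512-byte sector read from the card can go more or less straight into the flash page buffer. Whether this costs a lot of flash once compiled for AVR8 I can't say, but it is code that my version simply doesn't have.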