Good information. I'll check out the mega bootloader code. Is 512 bytes the maximum size for 328 chips? What about 1284?
I doubt it's possible to fit an error-checking protocol into a 512-byte bootloader.
I've been thinking about this. It really doesn't take much more code to receive the data as a series of smaller blocks and send back an Ack_byte after each block is received. It's little more than a loop within the existing page-write loop, plus maybe a couple of lines to compute a checksum.
Of course, you still have to get the PC end to comply.
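To make the block-and-ack idea concrete, here's a rough sketch of what the receiving side might look like. It isn't taken from any existing bootloader: the block size, the ACK/NAK byte values, the retry count, the `rx`/`tx` serial hooks and all the function names are assumptions for illustration, and the checksum is a plain 8-bit sum, which is about the cheapest (and weakest) check there is. I haven't measured whether it fits in 512 bytes alongside the flash-write code.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE  128u  /* flash page size in bytes on an ATmega328P */
#define BLOCK_SIZE 16u   /* assumed block size; must divide PAGE_SIZE */
#define ACK_BYTE   0x06u /* assumed reply values (ASCII ACK and NAK) */
#define NAK_BYTE   0x15u

/* Hypothetical serial hooks; on real hardware these would wrap the UART. */
typedef uint8_t (*rx_fn)(void);
typedef void (*tx_fn)(uint8_t);

/* 8-bit additive checksum: the sum of all bytes, modulo 256. */
uint8_t block_checksum(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + data[i]);
    return sum;
}

/* Decide the reply for one block: ACK if the checksum matches, else NAK. */
uint8_t block_reply(const uint8_t *data, size_t len, uint8_t received_sum)
{
    return block_checksum(data, len) == received_sum ? ACK_BYTE : NAK_BYTE;
}

/*
 * Fill one page buffer block by block. Each block on the wire is
 * BLOCK_SIZE data bytes followed by one checksum byte. A NAKed block
 * is expected to be resent by the PC. Returns 0 once every block has
 * been ACKed, or -1 once a single block has failed max_retries + 1 times.
 */
int receive_page(uint8_t *page, rx_fn rx, tx_fn tx, unsigned max_retries)
{
    for (size_t off = 0; off < PAGE_SIZE; off += BLOCK_SIZE) {
        unsigned failures = 0;
        for (;;) {
            for (size_t i = 0; i < BLOCK_SIZE; i++)
                page[off + i] = rx();
            uint8_t reply = block_reply(&page[off], BLOCK_SIZE, rx());
            tx(reply);
            if (reply == ACK_BYTE)
                break;
            if (++failures > max_retries)
                return -1;
        }
    }
    return 0;
}
```

The page would only be written to flash after `receive_page` returns 0, so a corrupted block costs one resend of 16 bytes rather than a bad page. The PC side has to send the same framing and wait for the reply byte after each block.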