I agree with retrolefty, I think it's the missing lock bits. No lock bits = boot loader section not protected from being overwritten. Actually, I'm surprised it can overwrite itself with a new sketch before it stops working.
I'm by no means an expert in this myself. I've only done this once too (on some ATmega328s, one of which I think I bricked), and that was half a year ago already. I made some notes at the time, though. Basically, the fuse and lock bit settings for an ATmega328P at 16 MHz with an external resonator, SPI programming enabled and a 2 kB (1k word) boot loader are:
L fuse: 0xFF
H fuse: 0xDA
E fuse: 0xFD
And the boot loader lock bits: 0xCF
No guarantee of the correctness, though.
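To double-check values like these, here is a small Python sketch that decodes the high fuse and the lock byte. The bit positions are as I understand them from the ATmega328P datasheet (the 48/88/168 lay their fuses out differently), and the helper names are just made up for illustration, so treat it as a sanity check rather than a reference.

```python
# Sketch only: bit layout assumed from the ATmega328P datasheet.
# Remember: a fuse bit that reads 0 is "programmed", 1 is "unprogrammed".

# High fuse bit names, index = bit number (bit 0 first).
HFUSE_BITS = ["BOOTRST", "BOOTSZ0", "BOOTSZ1", "EESAVE",
              "WDTON", "SPIEN", "DWEN", "RSTDISBL"]

# BOOTSZ1:BOOTSZ0 -> boot section size in 16-bit words (ATmega328P).
BOOT_WORDS = {0b11: 256, 0b10: 512, 0b01: 1024, 0b00: 2048}


def programmed(value, names):
    """Return the names of all bits that are programmed (read as 0)."""
    return [name for bit, name in enumerate(names) if not (value >> bit) & 1]


def boot_size_words(hfuse):
    """Boot section size in words, taken from BOOTSZ1:BOOTSZ0 (bits 2:1)."""
    return BOOT_WORDS[(hfuse >> 1) & 0b11]


def boot_section_write_protected(lock):
    """True if BLB11 (lock bit 4) is programmed, i.e. SPM may not write
    to the boot loader section."""
    return not ((lock >> 4) & 1)


if __name__ == "__main__":
    print(programmed(0xDA, HFUSE_BITS))
    print(boot_size_words(0xDA), "words")
    print(boot_section_write_protected(0xCF))
```

With H fuse 0xDA this reports SPIEN and BOOTRST as programmed and a 1024-word (2 kB) boot section, and with lock byte 0xCF it reports the boot section as write-protected, which matches my notes.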
As for the different efuse value, I think it's the same thing: the first 5 bits are not used (0x05 vs 0xFD; also see table 25-6, page 296 of the ATmega48/88/168/328 datasheet). But it might have something to do with the verify errors (which I got, but it seems to work nevertheless). It comes down to "1" meaning "unprogrammed" and "0" meaning "programmed". The unused bits probably read back as "0" (but I'm not sure about that).
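To illustrate why 0x05 and 0xFD would be the same setting, here is a minimal Python sketch that compares two efuse values while ignoring the unused bits. It assumes that only the low three bits of the extended fuse are used on the 328P (that is my reading of the datasheet table); the mask and function name are my own.

```python
# Sketch only: assumes just bits 2..0 of the extended fuse are used.
EFUSE_USED_MASK = 0x07


def same_fuse(a, b, mask=EFUSE_USED_MASK):
    """Compare two fuse bytes, looking only at the bits selected by mask."""
    return (a & mask) == (b & mask)


if __name__ == "__main__":
    print(same_fuse(0x05, 0xFD))  # differ only in the upper five bits
    print(same_fuse(0x05, 0xFC))  # differ in a used bit
```

A verify step that compares the whole byte would flag 0x05 against 0xFD as a mismatch even though the used bits are identical, which would explain the verify errors.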
While I'm at it, I'd like to recommend the modified ADABOOT: http://www.wulfden.org/TheShoppe/freeduino/ADABOOT.shtml
It's really quick and starts sketches almost instantly. It's for the ATmega168/328 and 644 (I've only tested the 328 so far, but I'm going to test the 644 soonish).