OK, this may be taking space optimization a bit far, but it seems like it should be possible to call a function located in the bootloader from application code.
For example, the bootloader already contains a few routines for serial communication (getch, putch) that could conceivably be reused by application code. There may be other functions that could make up a kind of pseudo bootloader OS.
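One way to picture this is a small table of function pointers that the bootloader exports and the application calls through. The sketch below only simulates the idea on a host machine: boot_api_t, BOOT_API_VERSION, boot_api_lookup and app_echo are hypothetical names, the table is an ordinary static where on a real chip it would sit at a fixed, agreed address in the boot section, and sim_getch/sim_putch are memory-backed stand-ins rather than the real bootloader routines.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical API table a bootloader could export to applications. */
typedef struct {
    uint8_t version;              /* lets the app detect a mismatched bootloader */
    void    (*putch)(uint8_t c);  /* send one byte */
    uint8_t (*getch)(void);       /* receive one byte */
} boot_api_t;

#define BOOT_API_VERSION 1u

/* Host-side stand-ins for the UART: plain memory buffers (assumption). */
static uint8_t        tx_log[32];
static size_t         tx_len;
static const uint8_t *rx_data;
static size_t         rx_len, rx_pos;

static void sim_putch(uint8_t c)
{
    if (tx_len < sizeof tx_log)
        tx_log[tx_len++] = c;
}

static uint8_t sim_getch(void)
{
    /* Returns 0 once the simulated input is exhausted. */
    return rx_pos < rx_len ? rx_data[rx_pos++] : 0;
}

/* On real hardware this table would live at a fixed boot-section address. */
static const boot_api_t boot_api = { BOOT_API_VERSION, sim_putch, sim_getch };

/* Returns the table, or NULL if its version is not the one we were built for. */
const boot_api_t *boot_api_lookup(void)
{
    return boot_api.version == BOOT_API_VERSION ? &boot_api : NULL;
}

/* Application code: echo n bytes using only the bootloader's routines. */
size_t app_echo(const boot_api_t *api, size_t n)
{
    size_t i;
    if (api == NULL)
        return 0;
    for (i = 0; i < n; i++)
        api->putch(api->getch());
    return i;
}
```

The version field is there so that an application can refuse to run against a bootloader whose table layout it does not recognise, instead of jumping into the wrong code.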
The pros? Maybe 1K of flash reclaimed through reuse, on average.
The drawbacks? Application code would be tightly coupled to the bootloader, so you have to get the bootloader right the first time and get it deployed. ISP would only be used for burning bootloaders and not entire sketches (which is OK imho).
This has probably been discussed thoroughly before; I'm just thinking out loud. I don't really think 1K is worth the effort with the 328 picking up speed, but I'd like to know it was considered, if only briefly.
I do, however, see some value in a purpose-built bootloader that the IDE can talk to directly to reflash the application code, without external tools like avrdude. Having to emulate an avr-isp sets up the expectation that it will work with any avr-isp software, and it ties you to existing toolchains and standards. A custom protocol shared only by the bootloader and the IDE makes it black and white: these are a matched set, and can be optimized together without those constraints.
Note also that the term "bootloader" does not necessarily mean you are limited to 2K. The routine that actually writes flash pages probably has to live in the boot section (on the ATmega168/328 the SPM instruction only takes effect when executed from there), but the code that drives it can live anywhere. So you could put a resident virtual machine of arbitrary size on the chip that reflashes the bytecode, or just installs native code from a hex file. As long as the rest of the VM functions are placed at predictable locations in flash, it would be seamless too. There may be significant flash savings with a VM bytecode arrangement, at the cost of some low-level performance.
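To give a feel for what a resident VM might look like, here is a toy stack-machine interpreter. It is only a sketch: the opcodes (OP_PUSH, OP_ADD, OP_MUL, OP_HALT), the 8-entry stack and the vm_run function are all made up for illustration, and it says nothing about how the bytecode would get into flash.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode set for a toy stack machine. */
enum { OP_HALT = 0, OP_PUSH = 1, OP_ADD = 2, OP_MUL = 3 };

#define VM_STACK_SIZE 8

/* Runs the bytecode. Returns 0 and stores the top of stack in *result when
 * OP_HALT is reached; returns -1 on any error (bad opcode, stack overflow or
 * underflow, truncated program, missing OP_HALT). */
int vm_run(const uint8_t *code, size_t len, int16_t *result)
{
    int16_t stack[VM_STACK_SIZE];
    size_t sp = 0;   /* number of values currently on the stack */
    size_t pc = 0;   /* index of the next byte to execute */

    while (pc < len) {
        uint8_t op = code[pc++];
        switch (op) {
        case OP_HALT:
            if (sp == 0)
                return -1;
            *result = stack[sp - 1];
            return 0;
        case OP_PUSH:
            /* One-byte immediate operand follows the opcode. */
            if (pc >= len || sp >= VM_STACK_SIZE)
                return -1;
            stack[sp++] = (int16_t)code[pc++];
            break;
        case OP_ADD:
        case OP_MUL: {
            int16_t a, b;
            if (sp < 2)
                return -1;
            b = stack[--sp];
            a = stack[--sp];
            stack[sp++] = (op == OP_ADD) ? (int16_t)(a + b) : (int16_t)(a * b);
            break;
        }
        default:
            return -1;
        }
    }
    return -1; /* ran off the end without OP_HALT */
}
```

With these opcodes, (2 + 3) * 4 is the nine-byte program PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT. Every operation goes through the dispatch loop, which is where the low-level performance cost comes from.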
Also, if the bootloader is the first point of contact for serial I/O, it can sniff for a pre-arranged sequence of bytes that tells it to go into reprogram mode, so that all bootups are immediate. You could possibly disable the external reset function and use that pin as another I/O pin, and have the software decide whether to listen for software updates based on the state of that pin (or some other pin). The chip wouldn't actually reset when the signal is low (or high); the pin state only enables listening for the "ok, I want to reload the application code" sequence. The application would keep running as normal until it receives that sequence.
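The sniffing part could be as simple as comparing the last few received bytes against the agreed sequence. A minimal sketch, assuming a made-up four-byte MAGIC sequence and hypothetical names (sniffer_t, sniffer_init, sniffer_feed); the listen_enabled argument stands in for the state of the enable pin:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAGIC_LEN 4

/* Made-up "please reprogram me" sequence; any agreed value would do. */
static const uint8_t MAGIC[MAGIC_LEN] = { 0xA5, 0x5A, 'R', 'P' };

typedef struct {
    uint8_t window[MAGIC_LEN]; /* the most recent bytes, oldest first */
    size_t  seen;              /* bytes received since last reset, capped at MAGIC_LEN */
} sniffer_t;

void sniffer_init(sniffer_t *s)
{
    memset(s, 0, sizeof *s);
}

/* Feed one received byte. Returns true when the last MAGIC_LEN bytes equal
 * MAGIC. While listening is disabled, any partial match is discarded. */
bool sniffer_feed(sniffer_t *s, uint8_t byte, bool listen_enabled)
{
    if (!listen_enabled) {
        sniffer_init(s);
        return false;
    }
    /* Slide the window left by one byte and append the new one. */
    memmove(s->window, s->window + 1, MAGIC_LEN - 1);
    s->window[MAGIC_LEN - 1] = byte;
    if (s->seen < MAGIC_LEN)
        s->seen++;
    return s->seen == MAGIC_LEN && memcmp(s->window, MAGIC, MAGIC_LEN) == 0;
}
```

A sliding window is used instead of a simple match counter so that a repeated first byte in the stream (0xA5 0xA5 0x5A ...) still matches. The bytes would still be passed on to the application as usual; only a true return would trigger the jump into reprogram mode.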
OK, maybe I'm just bored.