I have just put an ATmega328P-AU (brand new, no bootloader, no program) on a PCB, added a 16 MHz crystal and an ISP header, and successfully programmed it with the "blink" example using an AVRISP mkII (I tried both the Uno and Duemilanove 328 board profiles). To my surprise, the LED blinked far too slowly. Timing it with a stopwatch, I concluded that the 328 is running at 1 MHz instead of the expected 16 MHz.
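For anyone wanting to repeat the stopwatch check, here is a rough sketch of the arithmetic. It assumes the sketch was compiled for 16 MHz and that the delay loop simply scales with the real clock; the function name and the 16 second reading are made up for illustration.

```cpp
// Estimate the real CPU clock from a stopwatch reading.
// assumed_hz   : the clock the sketch was compiled for (F_CPU of the board profile)
// requested_ms : the value passed to delay(), e.g. 1000 in the stock Blink example
// measured_ms  : how long that delay actually took on the stopwatch
// Assumption: the delay stretches in proportion to assumed_hz / real_hz.
constexpr double effective_clock_hz(double assumed_hz,
                                    double requested_ms,
                                    double measured_ms) {
    return assumed_hz * requested_ms / measured_ms;
}
```

With a hypothetical reading of 16 seconds for each `delay(1000)`, `effective_clock_hz(16e6, 1000.0, 16000.0)` comes out at 1 MHz, i.e. a 16x slowdown.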
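The solution I use is to burn the standard bootloader first, then upload your program (whether or not it uses the bootloader). Burning the bootloader also sets the fuses for the selected board, including the clock fuses. A brand-new ATmega328P ships configured for the internal 8 MHz oscillator with the divide-by-8 fuse set, which is why yours runs at 1 MHz and ignores the crystal.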
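To see why the fuses matter, here is a small sketch that decodes the clock-related bits of the ATmega328P low fuse byte. The helper names are mine, not from any library. The two values are the factory default (0x62) and the low fuse the Uno board profile writes (0xFF), as far as I know; check them against the datasheet and your boards.txt before relying on them.

```cpp
#include <cstdint>

// ATmega328P low fuse byte. A "programmed" fuse bit reads as 0.
constexpr unsigned CKDIV8_BIT = 7;          // bit 7: divide the system clock by 8
constexpr std::uint8_t CKSEL_MASK = 0x0F;   // bits 3..0: clock source select

// True when the divide-by-8 prescaler fuse is programmed (bit reads 0).
constexpr bool ckdiv8_enabled(std::uint8_t lfuse) {
    return (lfuse & (1u << CKDIV8_BIT)) == 0;
}

// Extract the CKSEL3..0 clock source field.
constexpr std::uint8_t cksel(std::uint8_t lfuse) {
    return static_cast<std::uint8_t>(lfuse & CKSEL_MASK);
}

// Resulting system clock for a given source frequency and low fuse value.
constexpr std::uint32_t system_clock_hz(std::uint32_t source_hz, std::uint8_t lfuse) {
    return ckdiv8_enabled(lfuse) ? source_hz / 8 : source_hz;
}

constexpr std::uint8_t FACTORY_LFUSE = 0x62; // internal 8 MHz RC, CKDIV8 programmed
constexpr std::uint8_t UNO_LFUSE     = 0xFF; // external crystal, CKDIV8 unprogrammed
```

With the factory value, `system_clock_hz(8000000, FACTORY_LFUSE)` gives 1000000, matching the 1 MHz you measured. With the Uno value and a 16 MHz crystal, `system_clock_hz(16000000, UNO_LFUSE)` stays at 16000000.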