It most definitely uses more power to run faster, less power to run slower.
Looking in the ATMEL doc (ATMEL ATmega doc8271.pdf), I see many different power options that use less and less power just by running at lower and lower clock speeds. That's not the only way to save power, either. You can run from the 128kHz internal oscillator or even an external 32kHz watch crystal if desired, but don't forget that there is also a clock divider, the prescaler, which can divide by powers of 2 from 1 to 256.
Factory default is the 8MHz internal oscillator with the prescaler dividing by 8, which gives 1MHz. That is how the chips are shipped, and they can certainly be ISP'd at that speed.
Your oscillator may be running at 16MHz, but unless the default divider was changed, your clock is 2MHz (16MHz / 8).
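To make the arithmetic concrete, here is a minimal sketch in plain C (it runs on a PC, not on the chip). The function name system_clock_hz and the idea of passing the prescaler as an exponent are my own for illustration and are not taken from the datasheet.

```c
#include <stdint.h>

/* Effective system clock for a given oscillator frequency and prescaler
 * setting. clkps is the exponent of the division factor (factor = 2^clkps),
 * so clkps = 0 divides by 1 and clkps = 8 divides by 256.
 * Returns 0 for exponents outside the 0..8 range mentioned above.
 * (Hypothetical helper, for illustration only.) */
uint32_t system_clock_hz(uint32_t osc_hz, uint8_t clkps)
{
    if (clkps > 8) {
        return 0; /* not a valid prescaler setting */
    }
    return osc_hz >> clkps; /* shifting right by n divides by 2^n */
}
```

With the shipped settings, system_clock_hz(8000000UL, 3) gives 1000000, and a 16MHz oscillator with the same divide-by-8 gives 2000000.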
Silicon may vary even across the same wafer. Some chips may get away with what others won't. The specs are written with margins in mind. But figure on this: temperature matters, so the edge you are able to skirt now might give you unreliable performance later.
9.2.1 Default Clock Source
The device is shipped with internal RC oscillator at 8.0MHz and with the fuse CKDIV8 programmed,
resulting in 1.0MHz system clock. The startup time is set to maximum and time-out
period enabled. (CKSEL = "0010", SUT = "10", CKDIV8 = "0"). The default setting ensures that
all users can make their desired clock source setting using any available programming interface.
9.11 System Clock Prescaler
The ATmega48A/PA/88A/PA/168A/PA/328/P has a system clock prescaler, and the system
clock can be divided by setting the "CLKPR – Clock Prescale Register" on page 387. This feature
can be used to decrease the system clock frequency and the power consumption when the
requirement for processing power is low. This can be used with all clock source options, and it
will affect the clock frequency of the CPU and all synchronous peripherals. clkI/O, clkADC, clkCPU,
and clkFLASH are divided by a factor as shown in Table 29-12 on page 324.
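As a rough sketch of how the divider setting relates to the division factor: the prescaler bits hold log2 of the factor, so divide-by-8 is the value 3. The helper below is plain C that runs on a PC, and its name clkps_for_division is hypothetical. On the real chip the write is, as far as I recall, a timed two-step sequence (set CLKPCE, then write the new bits within a few cycles), and avr-libc wraps it as clock_prescale_set() in <avr/power.h>. Check the datasheet before relying on that.

```c
#include <stdint.h>

/* Map a division factor (1, 2, 4, ... 256) to the prescaler bit value,
 * which is log2 of the factor. Returns -1 if the factor is not a power
 * of two in that range. (Hypothetical helper, for illustration only.) */
int clkps_for_division(uint16_t factor)
{
    int bits;
    for (bits = 0; bits <= 8; bits++) {
        if (factor == (uint16_t)(1u << bits)) {
            return bits; /* 2^bits matches the requested factor */
        }
    }
    return -1; /* not a supported division factor */
}
```

So clkps_for_division(8) returns 3, clkps_for_division(1) returns 0 (no division), and something like clkps_for_division(3) returns -1.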
What the Arduino board needs, OTOH, is the bootloader on the chip, and perhaps the oscillator and divider set for 16MHz, to run properly?