Here is a sample sketch that generates a 25 kHz PWM signal on Pin 3. You set the duty cycle by loading 0 through 79 into OCR2B. Don't change OCR2A: it sets TOP and therefore the pulse rate (16 MHz / 8 / 80 = 25 kHz, assuming a 16 MHz board). Even with OCR2B at 0, the timer generates a 500 ns "glitch" pulse, one timer tick wide, every period. At 100% duty cycle (OCR2B = 79), it is glitchless.
const int PWMPin = 3;

void setup() {
  // generate 25 kHz PWM pulse rate on Pin 3 (OC2B)
  pinMode(PWMPin, OUTPUT);
  // Set up Fast PWM on Pin 3
  TCCR2A = 0x23; // COM2B1, WGM21, WGM20
  // Set prescaler
  TCCR2B = 0x0A; // WGM22, CS21 (prescaler = /8)
  // Set TOP and initialize duty cycle to zero (0)
  OCR2A = 79; // TOP, DO NOT CHANGE, SETS PWM PULSE RATE
  OCR2B = 0; // duty cycle for Pin 3 (0-79), still generates one 500 ns pulse per period even when 0 :(
}

void loop() {
  unsigned int x;
  // ramp up fan speed by increasing duty cycle every 200 ms, takes 16 seconds
  for (x = 0; x < 80; x++) {
    OCR2B = x; // set duty cycle
    delay(200);
  }
}
EDIT: Oops, fixed comments. This is pretty much the best way I know to do it. You load one register (OCR2B) to change the duty cycle whenever you want. No blocking, no glitches from library or application code running with interrupts disabled. Absolutely zero overhead for the CPU, since the timer hardware does all the work. Even Timer0 can't muck it up.
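If you would rather think in percent than in raw 0-79 values, something like the helper below might be handy. It is only a sketch: `percentToOcr2b` and `PWM_TOP` are names I made up, not part of the sketch above, and it assumes OCR2A stays at 79. It relies on the pin being high for OCR2B + 1 timer ticks out of 80, which is also why 0 still gives one 500 ns tick.

```cpp
#include <stdint.h>

// Hypothetical helper, not part of the original sketch.
// Assumes Timer2 is set up as above: TOP = OCR2A = 79, so one period is 80 ticks
// and the pin is high for OCR2B + 1 ticks.
const uint8_t PWM_TOP = 79; // must match OCR2A

uint8_t percentToOcr2b(uint8_t percent) {
  if (percent > 100) {
    percent = 100; // clamp out-of-range requests
  }
  // Number of ticks the pin should be high, rounded to the nearest tick.
  uint16_t ticks = ((uint16_t)percent * (PWM_TOP + 1) + 50) / 100;
  if (ticks == 0) {
    return 0; // lowest setting available; still one 500 ns tick per period
  }
  return (uint8_t)(ticks - 1);
}
```

In the sketch you would then write, for example, `OCR2B = percentToOcr2b(50);` for half speed.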

Code improvements welcome.
One big con though: even with OCR2B set to 0, it generates a 500 ns pulse at 25 kHz. That's the way the timer hardware works, and the only way to get a true 0% duty cycle is to disable the timer or set the pin to input mode, but that will make the fan run at full speed.
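A possible variation on "disable the timer", untested on a real fan: leave the timer running but disconnect OC2B from the pin by clearing COM2B1, so Pin 3 falls back to its normal port value and is held low while still being an output. The function below is a hypothetical sketch of that idea. `setFanTicks` and `COM2B1_MASK` are my own names, and the registers are passed in by reference only so the logic can be checked off the board. It assumes the sketch never writes Pin 3 HIGH, so the port value stays LOW.

```cpp
#include <stdint.h>

// Hypothetical helper, not from the original sketch.
// ticksHigh: how many 500 ns ticks per 80-tick period the pin should be high (0..80).
// tccr2a / ocr2b: pass the real TCCR2A and OCR2B registers on the Arduino.
const uint8_t COM2B1_MASK = 0x20; // COM2B1 is bit 5 of TCCR2A

void setFanTicks(volatile uint8_t &tccr2a, volatile uint8_t &ocr2b, uint8_t ticksHigh) {
  if (ticksHigh == 0) {
    // Disconnect OC2B from Pin 3. The pin then follows its PORT bit,
    // which is assumed to be LOW, so no 500 ns pulse gets out.
    tccr2a = (uint8_t)(tccr2a & ~COM2B1_MASK);
    return;
  }
  if (ticksHigh > 80) {
    ticksHigh = 80; // clamp to 100%
  }
  ocr2b = (uint8_t)(ticksHigh - 1);          // pin is high for OCR2B + 1 ticks
  tccr2a = (uint8_t)(tccr2a | COM2B1_MASK);  // reconnect OC2B, non-inverting PWM
}
```

On the board the call would look like `setFanTicks(TCCR2A, OCR2B, 0);` for a steady low, or `setFanTicks(TCCR2A, OCR2B, 40);` for 50%. Whether the fan actually stops or just runs at its minimum speed with a steady low is up to the fan.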