While reading through the Arduino core code I came across the tone library, in particular this function:
void tone(uint8_t _pin, unsigned int frequency, unsigned long duration)
This function tries to find an optimal prescaler for the timer. To do that it calculates OCR = F_CPU / frequency / 2 / SomePrescaler - 1, repeating the calculation for each prescaler it tries.
Because division is expensive, I introduced an extra variable, uint32_t ocrRaw = F_CPU / frequency / 2;
which holds the repeated part of the formula,
and used it in place of that expression in the rest of the function.
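To illustrate the idea, here is a minimal sketch of a prescaler search with the common division hoisted out. It is not the actual core code: the names pickTimerSetting, TimerSetting and kCpuHz are made up for this example, and the 16 MHz clock, the prescaler list {1, 8, 64, 256, 1024} and the 8-bit limit of 255 are assumptions that match a typical 8-bit AVR timer.

```cpp
#include <cassert>
#include <cstdint>

// Assumed clock speed; on a real board F_CPU is supplied by the build.
constexpr uint32_t kCpuHz = 16000000UL;

// Hypothetical result type: the chosen divider and the compare value.
struct TimerSetting {
    uint16_t prescaler;
    uint32_t ocr;
};

// Try each prescaler in turn and stop at the first compare value
// that fits in an 8-bit register.
TimerSetting pickTimerSetting(unsigned int frequency) {
    assert(frequency > 0);  // avoid dividing by zero

    // Assumed prescaler options of an 8-bit timer.
    static const uint16_t kPrescalers[] = {1, 8, 64, 256, 1024};

    // The part of the formula shared by every candidate: computed once.
    const uint32_t ocrRaw = kCpuHz / frequency / 2;

    TimerSetting s = {1, 0};
    for (uint16_t p : kPrescalers) {
        s.prescaler = p;
        s.ocr = ocrRaw / p - 1;  // one division per candidate instead of three
        if (s.ocr <= 255) {
            break;  // fits in 8 bits, so use this prescaler
        }
    }
    // If nothing fits, the largest prescaler is returned with ocr > 255;
    // a real implementation would have to handle that case.
    return s;
}
```

Since the original expression is evaluated left to right with integer division, ocrRaw / p - 1 gives exactly the same value as F_CPU / frequency / 2 / p - 1, so the result does not change. With the assumed 16 MHz clock, pickTimerSetting(440) yields prescaler 256 and a compare value of 70.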
I compiled the ToneKeyboard.pde sample sketch to see the impact on the size.
before patch => 3614 bytes
after patch => 3526 bytes
improvement == 88 bytes!
I did not test the speed, but I expect it to be slightly faster (no free Arduino nearby to test).