fungus:
If the chip has a hardware multiplier which takes two clock cycles, why doesn't the compiler use it for shift operations instead of generating a loop of single-bit shifts?
I don't know. Possibly we don't have the latest version of the compiler. Possibly that particular optimization was simply left out of the code generator.
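For anyone wondering what the two approaches look like, here is a rough sketch in C. It assumes 8-bit values and a shift count of 0 to 7; the function names and the lookup table are made up for illustration and are not what any particular compiler emits. A left shift by n is the same as multiplying by 2^n and keeping the low byte, so in principle one multiply can replace the loop.

```c
#include <stdint.h>

/* Loop of single-bit shifts, roughly the pattern described in the question.
   Cost grows with the shift count n. */
uint8_t shl_loop(uint8_t x, uint8_t n)
{
    while (n--) {
        x <<= 1;
    }
    return x;
}

/* Hypothetical alternative: one multiply by 2^n.
   Only valid for n in 0..7 (assumption); the cast keeps the low 8 bits,
   which matches what the shift loop leaves in an 8-bit value. */
uint8_t shl_mul(uint8_t x, uint8_t n)
{
    static const uint8_t pow2[8] = {1, 2, 4, 8, 16, 32, 64, 128};
    return (uint8_t)(x * pow2[n]);
}
```

Whether the multiply version is actually faster depends on the chip and on how cheaply it can get 2^n (the table lookup is not free), so treat this as an illustration of the idea rather than a guaranteed win.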