#define versus const and other software performance issues

There is no "LSL A, 4" instruction. The instruction is "LSL A", and it shifts one bit per machine instruction.

The swap-plus-bitwise-and is two machine instructions. Shifting left by four bits is four machine instructions, one per bit.
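
To see that the two approaches give the same result, here is a rough sketch in C, assuming an 8-bit register. The function names are made up for illustration, and the C only models the arithmetic; it says nothing about which instructions a particular compiler will actually emit.

```c
#include <stdint.h>

/* Hypothetical helper: exchange the high and low nibbles of a byte,
   modelling what a single nibble-swap instruction does. */
static uint8_t swap_nibbles(uint8_t x)
{
    return (uint8_t)((x << 4) | (x >> 4));
}

/* Shift left by four as four separate one-bit shifts,
   modelling four "LSL A" instructions in a row. */
static uint8_t shl4_by_single_shifts(uint8_t x)
{
    for (int i = 0; i < 4; i++) {
        x = (uint8_t)(x << 1);
    }
    return x;
}

/* Swap the nibbles, then clear the low nibble with a bitwise AND,
   modelling the two-instruction sequence. */
static uint8_t shl4_by_swap_and_mask(uint8_t x)
{
    return (uint8_t)(swap_nibbles(x) & 0xF0);
}
```

The AND with 0xF0 is needed because the swap moves the old high nibble into the low four bits, where a true shift would have left zeros.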