Any modulo operation with a power of two can be optimized by the compiler to a simple AND operation which is much faster than a division (the Atmel 8bit instruction set doesn't have a division operator, so a modulo is a sequence of subtractions) and in this case probably even more important: it's not timing constant (it doesn't need the same time to execute for different input values).