What is the fastest way to read/write GPIOs on SAMD21 boards?

Now there's an idea! Byte access! Slight reduction in code length, and not really that difficult if you are willing to modify CMSIS/device/ATMEL/SAMD21/Include/component/port.h (not really recommended!).

The types for the port registers are already defined as unions, so I just added uint8_t breg[4] alongside the existing uint32_t reg. It works using breg; for instance, your example would be:

PORT->Group[0].OUTSET.breg[21 >> 3] = (1 << (21 & 0x07)); 

It generates the same code you just posted.
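
For anyone who wants to see the index/mask arithmetic without touching port.h, here is a minimal sketch that can be compiled on a PC. The type name port_reg32_t and the helper functions are made up for illustration (they are not part of CMSIS), and it assumes a little-endian target, which is what the SAMD21 is:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one PORT register type from port.h,
   with a byte view added next to the existing 32-bit view. */
typedef union {
    uint32_t reg;      /* original whole-register access */
    uint8_t  breg[4];  /* added byte access (breg[0] = bits 0..7 on little-endian) */
} port_reg32_t;

/* Index of the byte that holds the given pin (0..31). */
static inline unsigned pin_byte_index(unsigned pin)
{
    return pin >> 3;
}

/* Mask of the pin within its own byte. */
static inline uint8_t pin_byte_mask(unsigned pin)
{
    return (uint8_t)(1u << (pin & 0x07u));
}

/* Write the pin's bit with a single byte store. */
static inline void pin_write_byte(volatile port_reg32_t *r, unsigned pin)
{
    assert(pin < 32);
    r->breg[pin_byte_index(pin)] = pin_byte_mask(pin);
}
```

For pin 21 this selects byte 2 and mask 0x20, which is the same bit as (1 << 21) in the 32-bit view. Note that an ordinary variable just stores the byte; on the real OUTSET register, writing a 1 sets the pin and the zero bits are ignored.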