What is the fastest way to read/write GPIOs on SAMD21 boards?

Two buses can reach the GPIO registers; the device headers expose them as PORT and PORT_IOBUS. PORT_IOBUS is the fast path (the Cortex-M0+ single-cycle I/O port): it can be accessed while the next instruction is being fetched over the other bus.

The documentation you reference calls a function (or perhaps a macro) port_pin_set_output_level, and you would have to find its definition to see how it is implemented. I cannot find that function in any of the files distributed by Arduino.

I do know that a function that takes the level as an argument is potentially less efficient than one that strictly sets or clears the bit, since a different register is used for setting and clearing, so a variable level has to be tested at run time to choose between them. What I have implemented is a set of macros, one for each pin, that encode the group and bit position. For instance, digital pin 4 is bit 7 of the first group:

#define digital_4 (7)

For pins in the second group, I just add 32 to the bit position.

Then I define two macros to decode the group index and bit mask from the earlier constant value for the pin:

#define PBMask(_x) (1ul << ((_x) & 0x1f))
#define PGrp(_x) ((_x) >> 5)
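
As a quick check of the encoding, here is a minimal sketch that verifies the two decode macros at compile time. The name pin_pb08 is hypothetical (it just stands for bit 8 of the second group), and the mask is written with an unsigned constant so that bit 31 shifts cleanly.

```c
#include <assert.h>   /* static_assert (C11); a keyword in C++11 and later */

/* Pin encoding: bits 0-4 hold the bit position, bit 5 selects the group. */
#define PBMask(_x) (1ul << ((_x) & 0x1f))
#define PGrp(_x)   ((_x) >> 5)

#define digital_4  (7)        /* group 0, bit 7, as in the text             */
#define pin_pb08   (32 + 8)   /* hypothetical name: group 1, bit 8          */

/* These fail the build if the encoding or the decode macros are wrong. */
static_assert(PGrp(digital_4) == 0,        "digital_4 decodes to group 0");
static_assert(PBMask(digital_4) == 0x80ul, "bit 7 gives mask 0x80");
static_assert(PGrp(pin_pb08) == 1,         "adding 32 selects group 1");
static_assert(PBMask(pin_pb08) == 0x100ul, "bit 8 gives mask 0x100");
```

Because both macros reduce to constants when the pin is a constant, the compiler can fold the group index and mask into the store instruction's operands.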

Then I define macros to set, clear, or toggle the bit:

#define OutSet(_x) (PORT_IOBUS->Group[PGrp(_x)].OUTSET.reg = PBMask(_x))
#define OutClr(_x) (PORT_IOBUS->Group[PGrp(_x)].OUTCLR.reg = PBMask(_x))
#define OutTgl(_x) (PORT_IOBUS->Group[PGrp(_x)].OUTTGL.reg = PBMask(_x))

So I can set and then clear digital pin 4 with:

OutSet(digital_4);
OutClr(digital_4);

If the value I want to output is a variable, then I have to use this macro:

#define OutPin(_x, _val) do { if (_val) OutSet(_x); else OutClr(_x); } while (0)
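
To try the macros off-target, the pieces above can be put together roughly as follows. This is only a sketch: the stand-in types (FakeReg, FakeGroup, FakePort, fake_iobus) and the helper set_digital_4 are hypothetical names, not part of any device header. It assumes that on a real SAMD21 build the device header (pulled in by Arduino.h) has already defined PORT_IOBUS as a macro, in which case the stand-in section is skipped.

```c
#include <assert.h>   /* static_assert */
#include <stdint.h>

#ifndef PORT_IOBUS
/* Host-side stand-in (assumption): models only the fields used below and
   simply records the last value written to each register. */
typedef struct { uint32_t reg; } FakeReg;
typedef struct { FakeReg OUTSET, OUTCLR, OUTTGL; } FakeGroup;
typedef struct { FakeGroup Group[2]; } FakePort;
static FakePort fake_iobus;
#define PORT_IOBUS (&fake_iobus)
#endif

#define digital_4  (7)
#define PBMask(_x) (1ul << ((_x) & 0x1f))
#define PGrp(_x)   ((_x) >> 5)

/* Each of these is a single store of a one-bit mask; no read-modify-write. */
#define OutSet(_x) (PORT_IOBUS->Group[PGrp(_x)].OUTSET.reg = PBMask(_x))
#define OutClr(_x) (PORT_IOBUS->Group[PGrp(_x)].OUTCLR.reg = PBMask(_x))
#define OutTgl(_x) (PORT_IOBUS->Group[PGrp(_x)].OUTTGL.reg = PBMask(_x))

/* Wrapped in do { } while (0) so the macro behaves as one statement. */
#define OutPin(_x, _val) do { if (_val) OutSet(_x); else OutClr(_x); } while (0)

static_assert(PGrp(digital_4) < 2, "pin id must select group 0 or 1");

/* Run-time level: a test and branch, then one store to OUTSET or OUTCLR. */
static inline void set_digital_4(int level) {
    OutPin(digital_4, level);
}
```

When the level is known at compile time, calling OutSet or OutClr directly avoids the branch.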

There are other macros that work similarly for configuration, direction and input. Not everything is accessible via PORT_IOBUS; for the rest, PORT must be used instead. This is all significantly more complicated than the AVR microcontrollers!
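
As an illustration of the direction and input side, here is a minimal sketch in the same style. The names DirOut, DirIn, InGet and input_init are hypothetical, as are the stand-in types. It assumes, from my reading of the datasheet, that the direction and input registers are reachable over PORT_IOBUS while PINCFG and CTRL have to be written through PORT, and that continuous sampling must be enabled in CTRL for a pin before IN is read over the IOBUS.

```c
#include <assert.h>
#include <stdint.h>

#ifndef PORT_IOBUS
/* Host-side stand-in (assumption): only the fields used below. Both names
   point at the same storage, since on hardware they are two routes to the
   same registers. */
typedef struct { uint32_t reg; } FakeReg32;
typedef struct { uint8_t reg; } FakeReg8;
typedef struct {
    FakeReg32 DIRSET, DIRCLR, IN, CTRL;
    FakeReg8  PINCFG[32];
} FakeGroup;
typedef struct { FakeGroup Group[2]; } FakePort;
static FakePort fake_port;
#define PORT             (&fake_port)
#define PORT_IOBUS       (&fake_port)
#define PORT_PINCFG_INEN (1u << 1)    /* input buffer enable bit */
#endif

#define digital_4  (7)
#define PBMask(_x) (1ul << ((_x) & 0x1f))
#define PGrp(_x)   ((_x) >> 5)

/* Hypothetical direction and input macros, same pattern as OutSet/OutClr. */
#define DirOut(_x) (PORT_IOBUS->Group[PGrp(_x)].DIRSET.reg = PBMask(_x))
#define DirIn(_x)  (PORT_IOBUS->Group[PGrp(_x)].DIRCLR.reg = PBMask(_x))
#define InGet(_x)  ((PORT_IOBUS->Group[PGrp(_x)].IN.reg & PBMask(_x)) != 0)

/* One-time setup through PORT: enable the input buffer and continuous
   sampling, then make the pin an input. */
static inline void input_init(unsigned pin) {
    assert(pin < 64);   /* two groups of 32 pins */
    PORT->Group[PGrp(pin)].PINCFG[pin & 0x1f].reg |= PORT_PINCFG_INEN;
    PORT->Group[PGrp(pin)].CTRL.reg |= PBMask(pin);
    DirIn(pin);
}
```

On hardware, the idea would be to call input_init(digital_4) once during setup and then poll InGet(digital_4) in the fast path.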