If you really want to know your Due's pin flipping capability, google "arm bit banding".
I wouldn't expect bit-banding to be any faster than the parallel port "set" and "reset" registers. Either way you can modify a single bit with a single store instruction. (Now, whether the compiler produces equally good code for both cases is a separate question.)
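For anyone curious what bit-banding actually does, here is a rough sketch of the address arithmetic, assuming the standard Cortex-M3 memory map (peripheral bit-band region at 0x40000000, alias region at 0x42000000). The helper name bitband_alias is made up for illustration, and this is not taken from the Arduino core.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: standard Cortex-M3 map. 1 MB peripheral bit-band region
   and the alias region that exposes each of its bits as a word. */
#define PERIPH_BASE     0x40000000u
#define PERIPH_BB_BASE  0x42000000u

/* Hypothetical helper: return the alias address for one bit of a
   peripheral word. Each byte of the region takes 32 bytes of alias
   space, and each bit takes one 4-byte word. */
static uint32_t bitband_alias(uint32_t reg_addr, uint32_t bit)
{
    assert(reg_addr >= PERIPH_BASE && reg_addr < PERIPH_BASE + 0x100000u);
    assert(bit < 32u);
    return PERIPH_BB_BASE + (reg_addr - PERIPH_BASE) * 32u + bit * 4u;
}
```

On the target you would then write 0 or 1 through a volatile uint32_t pointer to that alias address, which is one store. The set/reset route is also one store, something like PIOB->PIO_SODR = 1u << 27; to raise the pin and PIOB->PIO_CODR = 1u << 27; to lower it (pin 13 on the Due, if I have the mapping right), so neither has an obvious advantage.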
Any discrepancies you see in the "square wave" shape that are due to the scope probe are also going to be present in any other "wire" that you connect to a pin toggling at that frequency. Putting a ~20 MHz signal down a wire is not as trivial as it sounds (consider the old "thick Ethernet", 10BASE5, which had a 20 MHz signal rate, sort of: 10 Mbit/s, Manchester-encoded.)