Have I made this hardware SPI transfer as fast as possible?

This. It shaves 62.5 ns off the gap (as you'd expect), but the data is rubbish: