Arduino Due digital read speed

Hi,
take a look at my solution in the thread "Parallel <--> Byte value" (Arduino Due section of the Arduino Forum, Reply #9).

There's a scheme there for input and output that takes about 0.4 µs. You can surely make it a little faster. And instead of 8 bits you can, with some limitations, use 16 or more bits without taking significantly longer, because the bus is 4 bytes (32 bits) wide.
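To illustrate the idea (this is only a rough sketch, not the code from the linked reply): if the parallel lines sit on contiguous bits of one 32-bit port, a single register read gives you all of them at once, and a shift and mask pulls out the 8 or 16 bits you need. The helper names extract_field and insert_field, and the bit offset used in the examples, are my own assumptions for illustration. On the Due the snapshot would typically come from a port's pin data status register, e.g. PIOC->PIO_PDSR, but here it is just a plain integer so the logic can be checked off the board.

```cpp
#include <stdint.h>

// Hypothetical helper: pull `width` contiguous bits, starting at bit `shift`,
// out of one 32-bit port snapshot. Assumes shift < 32 and shift + width <= 32.
// On a Due the snapshot would be one read of a port register (assumption:
// something like PIOC->PIO_PDSR); here it is an ordinary value.
static inline uint32_t extract_field(uint32_t snapshot, unsigned shift, unsigned width) {
    const uint32_t mask = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (snapshot >> shift) & mask;
}

// Hypothetical helper for the output direction: replace only the chosen
// field in the current port word and leave all other bits untouched.
static inline uint32_t insert_field(uint32_t current, uint32_t value,
                                    unsigned shift, unsigned width) {
    const uint32_t low  = (width >= 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    const uint32_t mask = low << shift;
    return (current & ~mask) | ((value << shift) & mask);
}
```

For example, extract_field(snapshot, 1, 8) returns the byte sitting on bits 1 to 8 of the port, and changing the width to 16 costs the same single read, which is why going wider does not take noticeably longer. The limitation is that the bits have to be on the same port and ideally next to each other.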
Tom