Micros function implementation

The Arduino Due core I am using defines micros() as follows in wiring.c:

uint32_t micros( void )
{
    uint32_t ticks ;
    uint32_t count ;

    SysTick->CTRL;
    do {
        ticks = SysTick->VAL;
        count = GetTickCount();
    } while (SysTick->CTRL & SysTick_CTRL_COUNTFLAG_Msk);

    return count * 1000 + (SysTick->LOAD + 1 - ticks) / (SystemCoreClock/1000000) ;
}

What is the purpose of reading SysTick->CTRL before the do-while loop?

Additionally, as far as I can tell, this implementation waits for the SysTick Current Value Register (SysTick->VAL) to count down to zero and then works out the elapsed microseconds from that. This seems to lower the resolution of the returned value. Is it possible to write a function similar to micros() that returns the elapsed nanoseconds and doesn't wait for SysTick->VAL to count down, without sacrificing much performance? I understand the resolution wouldn't be overly impressive, as there is ~12 ns per clock tick at the Due's 84 MHz.
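To make the question concrete, here is a rough sketch of the arithmetic I have in mind. It is untested on hardware. The name ticks_to_ns and its parameters are my own invention, not part of the Arduino core; the register values are passed in as plain arguments so the conversion can be checked on a PC. It keeps the same LOAD + 1 - VAL expression that micros() uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Arduino core).
 * ms_count : millisecond counter, as returned by GetTickCount()
 * load     : SysTick reload value (SysTick->LOAD)
 * val      : SysTick current value (SysTick->VAL), counts down from load
 * core_hz  : core clock in Hz (SystemCoreClock), 84000000 on the Due
 */
uint64_t ticks_to_ns(uint32_t ms_count, uint32_t load, uint32_t val,
                     uint32_t core_hz)
{
    assert(core_hz > 0);   /* avoid dividing by zero */
    assert(val <= load);   /* the counter never exceeds the reload value */

    /* Clock ticks elapsed in the current millisecond,
       using the same expression as micros(). */
    uint64_t elapsed_ticks = (uint64_t)load + 1u - val;

    /* Whole milliseconds plus the sub-millisecond part, both in ns.
       64-bit math keeps elapsed_ticks * 1e9 from overflowing. */
    return (uint64_t)ms_count * 1000000u
         + (elapsed_ticks * 1000000000u) / core_hz;
}
```

On the Due I imagine calling it as ticks_to_ns(GetTickCount(), SysTick->LOAD, SysTick->VAL, SystemCoreClock), presumably still inside whatever guard is needed so that the counter and SysTick->VAL are read consistently. The integer division truncates, so the result only moves in steps of roughly one clock tick.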

It appears that many programmers are allergic to comments.

Is it necessary for this while loop to be there at all?