delayMicroseconds() is off by roughly a factor of 2

I searched the forums and found a few posts about the delay() and delayMicroseconds() functions not being perfectly accurate under certain conditions, but none of them pointed me in the direction I need.

What I am trying to do is programmatically turn on a square wave on a given I/O pin at different frequencies and for different durations.

I am using the tone() function to create the square wave and the delay functions to set how long the wave is output (see code below).

The problem I have run into is that if I delay for some number of microseconds and then measure the square wave on my digital oscilloscope, the time the square wave is output is usually about twice as long as the delay I set (regardless of whether the delay is 500 microseconds or 8000 microseconds).

Here's an example:

          int pin = 13;
          long frequencyIn = 38028;
          int onDelayMicros = 8918;

          tone(pin, frequencyIn);
          delayMicroseconds(onDelayMicros);
          noTone(pin);

From what I understand, this should produce a square wave at the given frequency for roughly 9 milliseconds, but I get a wave that lasts right around 19 milliseconds.

Next I tried changing to the following:

          int pin = 13;
          long frequencyIn = 38028;
          int onDelayMillis = 8;

          tone(pin, frequencyIn);
          delay(onDelayMillis);
          noTone(pin);

Using the millisecond delay, I was able to get closer to what I needed. This produced a square wave that lasted roughly 8.15 milliseconds. Not perfect, but close enough.

The problem is that I need finer control over the duration than this gives me.

So next I tried this:

          int pin = 13;
          long frequencyIn = 38028;
          int onDelayMillis = 8;
          int onDelayMicros = 918;

          tone(pin, frequencyIn);
          delay(onDelayMillis);
          delayMicroseconds(onDelayMicros);
          noTone(pin);

This should have added only about 0.9 milliseconds to the timing, putting me around 9 milliseconds total, but I end up at right around 10.1 milliseconds.

So the delayMicroseconds() call is again delaying a little more than twice as long as I am asking it to.

From what I've tried, the problem does seem to center on the delayMicroseconds() function and not the tone() or noTone() functions.
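As a rough check of the numbers above, here is the arithmetic written out. It is only a sketch: the measured values are approximate scope readings, stretchFactor is a name made up for illustration, and the second calculation assumes the delay(8) portion of the third test took the same 8.15 milliseconds as in the second test.

```cpp
// Ratio of the measured on-time to the requested on-time (both in milliseconds).
// stretchFactor is a hypothetical helper name used only for this check.
constexpr double stretchFactor(double measuredMs, double requestedMs) {
  return measuredMs / requestedMs;
}

// First test: delayMicroseconds(8918) alone, measured at about 19 ms.
constexpr double firstTest = stretchFactor(19.0, 8.918);

// Third test: subtract the ~8.15 ms assumed for delay(8) from the 10.1 ms total,
// leaving about 1.95 ms for the requested 918 microseconds.
constexpr double thirdTest = stretchFactor(10.1 - 8.15, 0.918);
```

Both ratios come out at about 2.1, which is why I say the microsecond delay runs a little more than twice as long as requested.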

Is there an alternative method for setting how long the square wave is produced, with timing resolution in microseconds? The tone() function does have a duration parameter, but according to the reference it is in milliseconds, not microseconds.

Is it possible that something else in this is causing the increase in delay?

Thanks, in advance, for your help.

Search for "piano tones micros", where I made 13 tones simultaneously using blink-without-delay style coding.
Instead of playing while a button is pushed, modify the code to turn on & off based on time.
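
A minimal sketch of that time-based approach, assuming the pin, frequency, and on-time from the question. It is not the piano code itself; intervalElapsed, burstStart, and burstActive are names made up for illustration. The on-time is checked against micros() instead of calling delayMicroseconds(), and on 16 MHz boards micros() only has a resolution of 4 microseconds, so expect the burst length to land within a few microseconds of the target rather than exactly on it.

```cpp
#include <stdint.h>

// Rollover-safe check: true once at least durationMicros have passed since
// startMicros. Unsigned subtraction keeps this correct when micros() wraps.
bool intervalElapsed(uint32_t startMicros, uint32_t nowMicros, uint32_t durationMicros) {
  return (uint32_t)(nowMicros - startMicros) >= durationMicros;
}

#ifdef ARDUINO
const uint8_t pin = 13;
const unsigned int frequencyIn = 38028;  // Hz, from the question
const uint32_t onTimeMicros = 8918;      // desired burst length, from the question

uint32_t burstStart = 0;   // micros() reading when the burst began
bool burstActive = false;  // true while the square wave is running

void setup() {
  tone(pin, frequencyIn);  // start the square wave
  burstStart = micros();
  burstActive = true;
}

void loop() {
  // Blink-without-delay style: poll the clock instead of blocking.
  if (burstActive && intervalElapsed(burstStart, micros(), onTimeMicros)) {
    noTone(pin);           // stop the square wave once the on-time has passed
    burstActive = false;
  }
}
#endif
```

To send repeated bursts, the same check can be reused for the off-time: record a new start time when the tone stops and call tone() again once the gap has elapsed.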