The version of pulseIn I sent up for 0008 loses that funny fudge factor. It uses functions that aren't in 0007, but I think you can still see the timing factors. It is accurate to within about 1 microsecond.
unsigned long pulseIn(uint8_t pin, uint8_t state)
{
    // cache the port and bit of the pin in order to speed up the
    // pulse width measuring loop and achieve finer resolution. calling
    // digitalRead() instead yields much coarser resolution.
    uint8_t bit = digitalPinToBitMask(pin);
    uint8_t port = digitalPinToPort(pin);
    uint8_t stateMask = (state ? bit : 0);
    unsigned long width = 0; // keep initialization out of the time-critical area

    // wait for the pulse to start
    while ((*portInputRegister(port) & bit) != stateMask)
        ;

    // wait for the pulse to stop, counting loop iterations as we go
    while ((*portInputRegister(port) & bit) == stateMask)
        width++;

    // convert the reading to microseconds. The loop has been determined
    // to be 10 clock cycles long and to have about 12 clocks between the
    // edge and the start of the loop. There will be some error introduced
    // by the interrupt handlers.
    return clocksToMicroseconds(width * 10 /* clocks per loop */ +
                                12 /* approximate clocks from edge to loop */);
}
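To make the conversion at the end concrete, here is a minimal host-side sketch of the same arithmetic. It assumes a 16 MHz clock and assumes that clocksToMicroseconds() is a plain integer division by the number of clocks per microsecond; clocks_to_us and loop_count_to_us are hypothetical names for illustration, not part of the Arduino core.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for clocksToMicroseconds(). Assumption: the clock
 * is a whole number of MHz and the conversion is an integer division, so
 * fractional microseconds are truncated. */
static uint32_t clocks_to_us(uint32_t clocks, uint32_t f_cpu_hz)
{
    assert(f_cpu_hz >= 1000000u);
    return clocks / (f_cpu_hz / 1000000u);
}

/* Same formula as the return statement in pulseIn above:
 * 10 clocks per loop iteration plus about 12 clocks of fixed overhead. */
static uint32_t loop_count_to_us(uint32_t width, uint32_t f_cpu_hz)
{
    return clocks_to_us(width * 10u + 12u, f_cpu_hz);
}
```

Under those assumptions each loop iteration is 10/16 = 0.625 microseconds, so a count of 1600 works out to (16000 + 12) / 16 = 1000 microseconds after truncation.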
The loop timing in pulseIn was calibrated by counting instruction cycles and then verified experimentally. As for a square wave signal generator, you can program the timer 1 and timer 2 channels to put out square waves on digital pins 9, 10, and 11. Very handy for testing.
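As a rough sketch of the timer idea, the following assumes an ATmega168-class chip where digital 9 is the OC1A output, and uses timer 1 in CTC mode with "toggle on compare match", for which the datasheet gives f_out = f_cpu / (2 * prescaler * (1 + OCR1A)). The helper names ctc_compare_value and start_square_wave_pin9 are made up for illustration; the register setup is only compiled when building for AVR.

```c
#include <assert.h>
#include <stdint.h>

/* Compare value for a toggle-on-match square wave:
 *   f_out = f_cpu / (2 * prescaler * (1 + compare))
 * Integer division means the actual frequency may be slightly off
 * when f_cpu is not an exact multiple. */
static uint16_t ctc_compare_value(uint32_t f_cpu_hz, uint32_t prescaler,
                                  uint32_t f_out_hz)
{
    assert(f_out_hz > 0 && prescaler > 0);
    uint32_t ticks = f_cpu_hz / (2u * prescaler * f_out_hz);
    assert(ticks >= 1u && ticks <= 65536u); /* must fit the 16-bit timer */
    return (uint16_t)(ticks - 1u);
}

#ifdef __AVR__
#include <avr/io.h>

/* Assumption: OC1A is on PB1 (digital 9) and F_CPU is defined by the build. */
static void start_square_wave_pin9(uint32_t f_out_hz)
{
    DDRB |= _BV(PB1);                 /* OC1A pin as output */
    TCCR1A = _BV(COM1A0);             /* toggle OC1A on compare match */
    TCCR1B = _BV(WGM12) | _BV(CS10);  /* CTC mode, prescaler 1 */
    OCR1A = ctc_compare_value(F_CPU, 1, f_out_hz);
}
#endif
```

With a 16 MHz clock and a 1 kHz target this gives a compare value of 7999, and each high phase lasts 500 microseconds, so a pulseIn reading taken from that pin should come back close to 500.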