Jitter in the main() loop?

I've been running this program:

unsigned short buf[200];  // histogram buckets; note: never cleared, so counts accumulate across dumps
unsigned long prev = 0;   // micros() returns unsigned long
unsigned int n = 0;

void setup() {
  pinMode(13, OUTPUT);
  digitalWrite(13, HIGH);
  Serial.begin(115200);
  delay(50);
  digitalWrite(13, LOW);
}

void loop() {
  unsigned long us = micros();
  unsigned long diff = us - prev;  // unsigned arithmetic handles micros() rollover
  if (diff > 199) {
    diff = 199;
  }
  buf[diff] += 1;
  prev = us;
  n += 1;
  if (n == 50000) {
    Serial.print("us,N\r\n");
    for (int i = 0; i < 200; ++i) {
      if (buf[i]) {
        Serial.print(i);
        Serial.print(",");
        Serial.print(buf[i]);
        Serial.print("\r\n");
      }
    }
    delay(1000);
    prev = micros();
    n = 0;
  }
}

This calculates a distribution of latency of executing loop() as a histogram. I would expect this to either put all the data into one bucket, or to distribute the samples across two buckets (if there is some granularity in the counter and you get time aliasing).

However, I get something really weird:

us,N
8,237
12,50468
16,1871
20,815
199,1

This tells me that the loop() function, and/or the counter, may jitter a fair bit, for some reason. Note that I wait for the serial port to drain after dumping the data, so I don't think it's the Serial "driver."
If there is timing aliasing (say, the micros() counter is quantized to 4 us), then I would expect the samples to be spread across only two buckets.

Microcontrollers are supposed to be deterministic! This is on an Uno R2, with no peripherals. Why is this happening?

Microcontrollers are supposed to be deterministic!

The AVR8 processor is perfectly deterministic. Everything is synchronized to the system clock and each machine instruction takes a precise number of clock cycles to execute.

jwatte:
If there is timing aliasing (say, the micros() counter is quantized to 4 us), then I would expect the samples to be spread across only two buckets.

There is code in the core and code in your sketch that depends on the previous state of the hardware (previous runs of loop). So long as those dependencies exist, it will be extremely difficult for you to test the "determinism". Some examples...

  • There is a branch in the timer 0 overflow handler. Branch taken means loop runs just a bit longer than expected.

  • The delay(1000); in your sketch can leave loop at a different synchronization point relative to timer 0 than previous runs of loop.

  • The Serial calls in your sketch can leave loop at a different synchronization point relative to timer 0 than previous runs of loop.


To make the situation even more complicated, if the number of clock cycles your code takes and the timer 0 period are relatively prime (share no common factor), the combined pattern only repeats after their least common multiple, which can be extremely long. In other words, it could take hours, days, months, years, forever for your test to produce two identical runs.

I was about to test your program with interrupts off, but of course then you couldn't time anything!

Well, as one of the regular posters here has in his signature "measurement changes behaviour".

Isn't this something to do with quantum theory? Or someone's cat? You would get more precise behaviour if you didn't measure the behaviour.

The point of the delay() is to remove the jitter from the Serial port -- it was jittering up to 40 microseconds before I inserted that!

I had assumed that millis() and micros() would be wrappers to bare CPU instructions that read internal counters. It sounds like this is not the case. Instead, the Arduino library sounds like a thin veil on top of the hardware, with undocumented side effects that I have to guess at.
I don't like guessing :slight_smile:
I suppose I can go look at the source or something...

I almost wish there was a 100 MHz part that came in DIP so I could throw cycles at the problem without breaking the bank :slight_smile:

You don't have to guess. The source for millis() and micros() is there somewhere. I usually do a "find in files" to find them.

There is a timer interrupt that catches the overflow of the timer used by millis() and micros(). That adds to a counter and mucks around a bit. If the timer fires at the "wrong" time your loop will jitter.

You can remove the jitter by disabling interrupts, if your code will work adequately with that done.

Here, I found it in wiring.c:

SIGNAL(TIMER0_OVF_vect)
{
	// copy these to local variables so they can be stored in registers
	// (volatile variables must be read from memory on every access)
	unsigned long m = timer0_millis;
	unsigned char f = timer0_fract;

	m += MILLIS_INC;
	f += FRACT_INC;
	if (f >= FRACT_MAX) {
		f -= FRACT_MAX;
		m += 1;
	}

	timer0_fract = f;
	timer0_millis = m;
	timer0_overflow_count++;
}

So not only does the interrupt fire periodically, causing jitter, but the "if" test inside it, when taken, executes extra instructions and adds still more jitter on top of that.

jwatte:
I almost wish there was a 100 MHz part that came in DIP so I could throw cycles at the problem without breaking the bank :slight_smile:

"the problem"? What problem is that?

12,50468
16,1871
20,815

It takes around 3.5 us to enter an ISR and about the same to leave it. So I read into that that some of the time you entered, or left, the ISR during the timing period (4 us), and some of the time you did both (8 us). That sounds about right.

Correct me if I'm wrong, but wouldn't it be impossible to just enter or leave the ISR? Unless there are two execution cores, that is :slight_smile: If you take the interrupt, you have to complete it before the main code gets its time back, so the minimum disruption would be 7 us. (Push and pop take two cycles each, and there's half a dozen of each in an ISR -- grumble!) How hard would it be to provide a second set of registers for ISRs, anyway? :wink:

Anyway, I'm re-thinking my design, working towards a loop that can run with interrupts off in the timing sensitive parts. Also, I'm replacing digitalWrite (a whopping 78 instructions!) with some port banging myself, too. Now, where did I put that avr-gcc intrinsics reference ...

In your timing portion, it might have already entered the ISR, and leave it in the middle. Or enter it towards the end. So yes, I think it is completely possible for an ISR to partially affect your timing.

I want to be able to toggle output pins, driving communications hardware, ideally with microsecond precision (but I can tolerate a handful of microseconds of jitter) based on some input that arrives asynchronously. With the Atmega328p, I can do the driving when turning off interrupts, but not receive data at the same time. For prototyping, I can probably live with this.
At 100 MHz, there would be enough cycles to do everything polled at the same time -- output and input, polled. Or run the input on interrupts, and the output polled but with interrupts on, and take a small amount of jitter when interrupts arrive.

This guy runs at 96 MHz and is price competitive with an Uno, but there is no DIP version :frowning:
http://parts.digikey.com/1/parts/1950073-board-lpcxpresso-lpc1768-1769-om13000.html
I could conceivably even use the built-in SPI DMA hardware, ignoring the clock, and just using the bit-out signal...

Hmm, the 328 does have some SPI capability, but not DMA. I wonder if it at least has a FIFO? (... goes off reading data sheets some more)

some input that arrives asynchronously

From?

I wonder if it at least has a FIFO

No it hasn't.

The problem here is that you have a system with all sorts of asynchronous events taking place, which makes precise timing impossible.
There are three internal timers, and for the best precision you should use those. But this is the price you pay for having a system and not just a raw processor.

jwatte:
I want to be able to toggle output pins, driving communications hardware, ideally with microsecond precision (but I can tolerate a handful of microseconds of jitter) based on some input that arrives asynchronously.

Perhaps if you describe the actual requirement, rather than "nice to have"? Most comms stuff is tolerant of some delays, as it works in real-life situations. Even quite fast processors (e.g. modern Macs, PCs, Linux boxes) running at 3 GHz still have to service interrupts. If you said you wanted to work with fast USB, yes, you probably can't get that to work on the bare board. But then there are USB interface chips, so that isn't a particular worry. Ditto for Ethernet.

Argh! Couldn't they have spared the few dozen gates for a one-byte FIFO for that SPI interface?

In my target system, the controller decodes IR input on a wide range of bands (16 kHz through 60 kHz carriers), and sends out the serial port.
At the same time, it needs to also receive commands from a serial port (where I can control the command rate, so I can manage the interrupt load) and hard-wired buttons.
While sending commands with high precision, I can let the hardware serial back up, and delay processing button presses (but I'll probably want to OR together the input bits received during the time.)
The end result is automation of a number of different IR remote control protocols.

And it's funny you should mention high-speed PCs -- ten to fifteen years ago, I spent years working on an operating system that drove interrupt latencies on then-standard PC hardware with a general-purpose GUI down below milliseconds (for media production.) Even with modern hardware, neither Windows, nor Linux, nor MacOS will get to those levels. That's because they do many things at once, and use "cheapest possible" design instead of dedicated circuits for many things.

In effect, I want to use a microcontroller as the dedicated circuit for what I want to do. If that SPI interface had at least one byte of FIFO, then I probably could do it just fine (mashing in another output byte when the FIFO runs dry) but as it is, I have to take a 4 us interrupt every 8 bits, which means that every 8th bit I send out essentially gets extended by 4 us or so.

The reason I have such strict tolerances is because I need to generate the carrier wave for the IR modulation, at between 16 and 60 kHz. If I want to stick with an Atmega328P, I may have to use a separately programmable timer to generate the carrier, and then use the Arduino only as a gate for that carrier (it's all Manchester coded AM -- at least I don't have to do FM in software) :slight_smile:

So, a 555, with a variable timing resistor, might get me there. But then the external circuitry is looking a lot hairier, and maybe I should go with some of the bigger boys that have DMA to SPI for seamless modulation.

I can build a state machine to do exactly what I need, and count cycles from interrupt handlers to figure out what my budget is -- for example, I can bang a byte to the SPI, then enable interrupts, then immediately disable interrupts; as long as the longest interrupt handler is shorter than the time to send one byte out the SPI, I'm good. With 4 us pulse width (not ideal), this means < 30 us interrupt latency, which can be done on the current board. However, that's 4 us pulse width, not 1 us. With a device that runs faster, and has better circuitry for generating the pulse forms I care about (DMA, say), my target would probably be easier to reach.

Maybe the solution really is a LPC1768 for communications and smarts, and a 328P that just does pulse generation, using SPI for receive (which has a one-byte buffer).

Argh! Couldn't they have spared the few dozen gates for a one-byte FIFO for that SPI interface?

Page 167 of the data sheet:

The system is single buffered in the transmit direction and double buffered in the receive direction. This means that bytes to be transmitted cannot be written to the SPI Data Register before the entire shift cycle is completed. When receiving data, however, a received character must be read from the SPI Data Register before the next character has been completely shifted in. Otherwise, the first byte is lost.

So there is a two-byte receive buffer, and a one-byte send buffer. :slight_smile:

There's also an interrupt "SPI Interrupt Flag". It looks like you could use that to stuff the next byte into the SPI buffer when the previous one has been sent.

You could look into the Kemani CPLD Key or his similar products to use a CPLD as a high-speed interface.

http://majolsurf.net/wordpress/?page_id=1302

But what I want is a one byte FIFO to feed that one byte buffer!
Also note I described the interrupt solution. It essentially extends every eighth bit by a handful microseconds because of the time taking the interrupt and starting the next byte.
Really, a variable frequency timer might be best here...

So here's where I'm at.

Spinning up an LPC1768 is a significant new effort, what with a new toolchain, harder final production (surface mount only), etc. I'll go with a carrier generator modulated by the Arduino, as I have less stringent requirements for the modulator (10 us jitter is OK).

Trying to control R2 of a 7555 to generate a stable carrier requires more than just a JFET or BJT -- expensive analog stuff. I want a programmable timer!

Something like an 8254 would work. But those are expensive and require wide parallel data interfacing.
The cheapest programmable timer in DIP I can find is an ATtiny for $1.19. But that's a new toolchain again.

A 328 is less than $4. Add a crystal or resonator, some resistors/caps, and a socket. I now have a programmable timer I can talk SPI or I2C or UART to! I might even be able to share the crystal between the two in the final implementation.

ATtiny ... But that's a new toolchain again.

How so?