Should I outsource my counting to something external? Micros is too slow.

Hello!

I have an application where a lot of counting is involved. I'm measuring the timing of an event similar to that of a chronograph that measures the speed of a bullet.

I have about an 8 microsecond delay between each of these steps in my code.

		InitMicros = micros();
		Nano2_data.CH1TrigTime_Rising = 0;
		while(digitalRead(CH1_TRIG) == HIGH && !Nano2_data.TrigTimeOut);
		Nano2_data.CH1TrigTime_Falling = (unsigned int)(micros() - InitMicros);
		while(digitalRead(CH2_TRIG) == LOW && !Nano2_data.TrigTimeOut);
		Nano2_data.CH2TrigTime_Rising = (unsigned int)(micros() - InitMicros);
		while(digitalRead(CH2_TRIG) == HIGH && !Nano2_data.TrigTimeOut);
		Nano2_data.CH2TrigTime_Falling = (unsigned int)(micros() - InitMicros);
		while(digitalRead(CH3_TRIG) == LOW && !Nano2_data.TrigTimeOut);
		Nano2_data.CH3TrigTime_Rising = (unsigned int)(micros() - InitMicros);
		while(digitalRead(CH3_TRIG) == HIGH && !Nano2_data.TrigTimeOut);
		Nano2_data.CH3TrigTime_Falling = (unsigned int)(micros() - InitMicros);

I realize there is A LOT of extra work I'm putting between things and I can probably minimize that 8 microseconds considerably by doing all the math and unit conversion at the end (just store the long integers and process them later). Still, I'm looking for something on the order of 1us precision in timing. I also don't like that the reads have to happen in sequence, and that if the first "HIGH" on CH1_TRIG doesn't happen, none of it happens.

I'm thinking about using some external counters for this application. I'm finding a few I2C ones that MAXIM makes. They will HAVE to be I2C because I don't think I could support a basic binary counter that has 16+ pins in my project. However none of the ones I've seen have the features I'm looking for.

Ideally I would like to be able to set up 6 of these binary counters, each one looking for a different event (Channel 1 - 3, HIGH and LOW). There would be a "START" pin and "STOP" pin. The Arduino would transmit a "START" signal to each counter simultaneously. Each of my input signals would then be tied to corresponding "STOP" pins. At the end of an "event" the Arduino will download (via I2C) the data from each counter.

Can anyone give me pointers to hardware that might be useful for this? Thank you!

This looks like something similar to what I'm looking for, but it is too slow.

http://datasheets.maximintegrated.com/en/ds/DS1683.pdf

I'm looking for something on the order of 1us precision in timing.

Over what duration? Second? Minute? Hour? Day?

Good question. Over the duration of about 10 milliseconds :).

Board?

I'm sorry? I don't follow what you are saying. Board?

Looking at my measurement, my whole event should be between 200 - 1200us.

Which kind of Arduino do you have? (Though it probably won't matter.)

I'm partial to the Arduino Nanos and will probably use one in this application. But I also have an Arduino UNO and a Seeduino Mega.

I can probably spring for one of the newer boards if you think it would make a difference.

There are two obstacles...

  1. The granularity of micros() is 4us on 16 MHz boards. You will need something with finer granularity.

  2. At 16 MHz, there are only 16 clock cycles (~16 machine instructions) available between timing marks.
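To put numbers on both obstacles, here is a rough sketch of the arithmetic. It assumes a 16 MHz clock and that the stock Arduino core runs Timer0 with a prescaler of 64, which is where the 4us step in micros() comes from. The macro and function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed: 16 MHz system clock, as on the Uno, Nano and Mega. */
#define F_CPU_HZ 16000000UL

/* Assumed: the stock Arduino core runs Timer0 with a prescaler of 64,
   and micros() is derived from that timer. */
#define TIMER0_PRESCALER 64UL

/* CPU clock cycles that fit in one microsecond. */
static uint32_t cycles_per_us(void)
{
    return F_CPU_HZ / 1000000UL;
}

/* Smallest step micros() can report, in microseconds. */
static uint32_t micros_step_us(void)
{
    return TIMER0_PRESCALER / cycles_per_us();
}

/* Cycle budget for a polling loop that must finish in period_us. */
static uint32_t cycle_budget(uint32_t period_us)
{
    return period_us * cycles_per_us();
}
```

So cycle_budget(1) is the 16 cycles mentioned above, and micros_step_us() is the 4us granularity.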

I suspect what you want to do will work but it will very likely require either three input-capture peripherals (available on 16 bit timers) or hand-coded assembly. The ATmega2560 processor family has enough timers but I believe only two input-capture pins are brought out to a header... yup...

35 PL0 ( ICP4 ) Digital pin 49
36 PL1 ( ICP5 ) Digital pin 48

Ready for some assembly?

Can I reduce the task down to assembly but have the rest of my code in C? Or will I need to reduce (or rather, expand) my codebase into all assembly?

How do you propose I do the assembly for this application?

One thing that I do not like about my current approach is that if for some reason the first "HIGH" on my "channel 1" is not observed, then the whole thing is missed. I was hoping to come up with a solution that still gives some measurement data in the event that something is missed. That is what led me to start thinking that a peripheral solution would be the best one for this application, but I am coming up short on my Google-Fu for parts...

Hmm...maybe I shouldn't be looking at multiple pins, but bring everything through an OR gate. Meh..that would work if my spacing between channels were the same, but it isn't..

I spent my last semester in college doing nothing but assembly on a PIC Microcontroller. I wrote a library of I2C code. So I have some familiarity. But, I do love my high-level languages.

On second read: Ah, I see. You are suggesting using the 16-bit timers that are part of the mega and writing some assembly to use them properly.

That's probably the best solution. The fact that there are only two inputs shouldn't be too much of a problem. I could juggle inputs around with some gate logic, and tbh the first channel is not as important (timing wise) as channels 2 and 3 are.

Yeah, that's not going to work too well if you really care about repeatability and accuracy. You can achieve better than 1us resolution, with great accuracy and repeatability, if you use the Input Capture Facility of Timer1. I have posted a previous example that captures the time between two rising edges (the full cycle time), but it could easily be modified to capture the time between a rising and a falling edge by changing the edge detected in the interrupt handler.

The ICF is the only truly accurate way to measure pulse timings since it's all done in hardware. This means that you won't get variations caused by interrupt handlers running or other things that might be going on.
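For illustration only (this is not the previously posted example mentioned above), a minimal sketch of timing one pulse with Timer1 input capture. It assumes an ATmega328P (Uno/Nano), the signal on ICP1 (digital pin 8), and Timer1 running with no prescaler so one tick is 62.5 ns. All names other than the AVR registers are hypothetical. The tick arithmetic is plain C; the register code only compiles for the AVR.

```c
#include <stdint.h>

/* Assumed: Timer1 clocked at 16 MHz with no prescaler (62.5 ns per tick). */
#define TICKS_PER_US 16u

/* Ticks between two captures. Unsigned 16-bit subtraction stays correct
   across one counter wraparound, so this holds for intervals shorter
   than 65536 ticks (about 4 ms at 16 MHz). */
static uint16_t elapsed_ticks(uint16_t start, uint16_t end)
{
    return (uint16_t)(end - start);
}

/* Whole microseconds, rounded down. */
static uint16_t ticks_to_us(uint16_t ticks)
{
    return (uint16_t)(ticks / TICKS_PER_US);
}

#if defined(__AVR_ATmega328P__)
#include <avr/io.h>
#include <avr/interrupt.h>

static volatile uint16_t rising_stamp;  /* capture value at the rising edge */
static volatile uint16_t pulse_ticks;   /* rising-to-falling width in ticks */
static volatile uint8_t  pulse_ready;   /* set once pulse_ticks is valid */

static void capture_init(void)
{
    TCCR1A = 0;                        /* normal mode, no PWM */
    TCCR1B = _BV(ICES1) | _BV(CS10);   /* capture rising edge, clk/1 */
    TIFR1  = _BV(ICF1);                /* clear any stale capture flag */
    TIMSK1 = _BV(ICIE1);               /* enable the capture interrupt */
}

ISR(TIMER1_CAPT_vect)
{
    uint16_t stamp = ICR1;             /* latched by hardware at the edge */

    if (TCCR1B & _BV(ICES1)) {
        rising_stamp = stamp;
        TCCR1B &= (uint8_t)~_BV(ICES1);   /* next capture: falling edge */
    } else {
        pulse_ticks = elapsed_ticks(rising_stamp, stamp);
        pulse_ready = 1;
        TCCR1B |= _BV(ICES1);             /* next capture: rising edge */
    }
    TIFR1 = _BV(ICF1);   /* changing the edge select can set the flag */
}
#endif
```

With no prescaler the counter wraps every 4.096 ms. To cover a window of around 10 ms with a single subtraction, a prescaler of 8 (0.5us per tick, about 32 ms range) would be one option.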

If you're interested, I could post a modified version to time between each edge instead of the total pulse length.

EDIT: I see you want to watch three different pins now, that makes things a little tougher. An external MUX might solve the problem easy enough though.

Nickerbocker:
Can I reduce the task down to assembly but have the rest of my code in C?

Yes.

Or will I need to reduce (or rather, expand) my codebase into all assembly?

Only if you really want to do that.

How do you propose I do the assembly for this application?

This is what I would do...

• Initialize for collection
• Disable interrupts
• Structure the collection loop so one pass is exactly 1us
• Loop for exactly 1200us
• Use "port manipulation" to read all six inputs at once
• If the current port value is different from the previous port value, record the value and the loop count (which is conveniently the number of microseconds)
• Enable interrupts
• Sift through the collected data to determine what happened
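The last step, sifting through the collected data, might look something like the following. This is a minimal sketch, assuming each record holds a loop count and the masked port value, that unused records are filled with 0xFF, and that channels 1 to 3 sit on bits 0 to 2 of the port. The record layout, bit assignments and names are all hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define NUM_CHANNELS 3
#define NO_EDGE 0xFFFFu   /* marks "never seen" and "unused record" */

typedef struct {
    uint16_t stamp;   /* loop count at the change, in microseconds */
    uint8_t  value;   /* masked port value after the change */
} sample_t;

typedef struct {
    uint16_t rising[NUM_CHANNELS];
    uint16_t falling[NUM_CHANNELS];
} edges_t;

/* Walk the records in order and note the first rising and first falling
   edge on each channel. 'initial' is the port value read before the loop.
   Stops at the first unused record (stamp == NO_EDGE). */
static void sift(const sample_t *buf, size_t count, uint8_t initial,
                 edges_t *out)
{
    uint8_t prev = initial;

    for (int ch = 0; ch < NUM_CHANNELS; ++ch) {
        out->rising[ch] = NO_EDGE;
        out->falling[ch] = NO_EDGE;
    }

    for (size_t i = 0; i < count && buf[i].stamp != NO_EDGE; ++i) {
        uint8_t changed = (uint8_t)(buf[i].value ^ prev);

        for (int ch = 0; ch < NUM_CHANNELS; ++ch) {
            uint8_t bit = (uint8_t)(1u << ch);
            if (!(changed & bit))
                continue;
            if (buf[i].value & bit) {
                if (out->rising[ch] == NO_EDGE)
                    out->rising[ch] = buf[i].stamp;
            } else {
                if (out->falling[ch] == NO_EDGE)
                    out->falling[ch] = buf[i].stamp;
            }
        }
        prev = buf[i].value;
    }
}
```

Because every channel is decoded independently, a missed edge on channel 1 leaves NO_EDGE in its slot but does not stop channels 2 and 3 from being reported.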

The advantage to this strategy is that it is simple, will work on any AVR processor running at high enough speed, and just may work.

The disadvantage is that nothing else can happen during collection (hence disabling interrupts).

But, I do love my high-level languages.

Most of us do. Which is why you code only the very critical bit in assembly.

Edit: spelling

Nickerbocker:
On second read: Ah, I see. You are suggesting using the 16-bit timers that are part of the mega and writing some assembly to use them properly.

No. If the Mega brought out three (or four) Input Capture pins then it would be a viable solution (and probably would not require assembly) but only two Input Capture pins are available.

That's probably the best solution. The fact that there are only two inputs shouldn't be too much of a problem. I could juggle inputs around with some gate logic, and tbh the first channel is not as important (timing wise) as channels 2 and 3 are.

The risk is when two events occur close to the same moment. Input Capture works best when there is about 6 microseconds (or more) between events.

afremont - Thanks for your post. I'll do a search for your code. I'll post back if I have questions. I'm currently using Timer1 via the TimerOne library to cause a timeout interrupt, but this is probably more important. I should probably be using a watchdog for that anyway.

Coding Badly - Thanks for the pseudo code. That might work well for me because I currently have a 2 Arduino setup on my prototype PCB. 1 Arduino is handling the application (talking to the LCD screen and the PC if it is connected) and the other Arduino just stares at input pins. I would like to avoid deviating much from my current board layout so that I can get something other than a barebones board made soon.

There is a potential that the difference in time between the rising and falling edge would be less than 6 microseconds, but I estimate the time between channels to be around 200-300 microseconds. Capturing both the rising and falling edges would be nice but I could choose one or the other and still meet the requirements of my application. I'd like to be in the sub 1% for this measurement (just the Arduino itself, not taking into account other uncertainties). That's my goal anyway.
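As a rough check on the 1% goal: if the worst-case timing error is taken to be one timer step (an assumption that ignores clock tolerance and sensor delays), the relative error is just the step divided by the interval. The function name is made up.

```c
/* Worst-case quantization error as a percentage of the measured interval.
   Assumes the error is at most one timer step; clock tolerance and
   sensor delays are not included. */
static double quantization_error_percent(double step_us, double interval_us)
{
    return 100.0 * step_us / interval_us;
}
```

For the shortest 200us spacing, a 4us step works out to 2%, a 1us step to 0.5%, and a 62.5 ns timer tick to about 0.03%.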

Nickerbocker:
There is a potential that the difference in time between the rising and falling edge would be less than 6 microseconds...

Polling the Input Capture flag (versus using an interrupt) may get the overhead low enough for collection to work.

I would first pursue the solution in Reply #12. If it works it is trivial to prove that it will always work (really, the only way it can fail is a buffer overrun). As the time between transitions becomes smaller it becomes increasingly difficult to prove that Input Capture is reliable. If you don't get the captured value out of the register in time the value is lost and there is no way to detect the failure. Or, if you cannot switch from rising to falling fast enough you will miss the transition.

I was going to suggest throwing hardware at it. ATmegas are inexpensive enough, use one for each of the three channels. They could communicate to a master over I2C or whatever.

I still would have a couple concerns. @CB, where does the 6µs number come from? That is nearly 100 clocks, could the input capture ISR be that short? That seems like it should be enough, not sure whether that might mandate assembler though.

Also have a concern about the following statement, especially if very short intervals (6µs or less, but what is the minimum expected?) need to be measured. @Nickerbocker, were you saying 1% accuracy for the measurement?

I'd like to be in the sub 1% for this measurement (just the Arduino itself, not taking into account other uncertainties).

A quick pass in C...

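// One record per observed change on the inputs.
typedef struct
{
  uint16_t    stamp;   // loop count when the change was seen
  uint8_t     value;   // masked port value after the change
}
mydata_t, *mydata_p;

static mydata_t buffer[12];

static void Capture( void )
{
  mydata_p    p;
  uint8_t     v0;   // previous port value
  uint8_t     v1;   // current port value
  uint16_t     i;
  
  // Unused records stay 0xFF so they can be told apart later.
  memset( &buffer[0], 0xFF, sizeof(buffer) );
  p = &buffer[0];
  v0 = PINB & 0b00111111;
  for ( i=0; i < 1200; ++i )
  {
    v1 = PINB & 0b00111111;
    if ( v1 != v0 )
    {
      // No bounds check: more than 12 changes would overrun buffer.
      p->stamp = i;
      p->value = v1;
      v0 = v1;
      ++p;
    }
  }
}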

void setup( void )
{
}

void loop( void )
{
  Capture();
}

The relevant generated code is (Capture was inlined)...

000000b6 <loop>:
  b6:	80 e0       	ldi	r24, 0x00	; 0
  b8:	91 e0       	ldi	r25, 0x01	; 1
  ba:	6f ef       	ldi	r22, 0xFF	; 255
  bc:	70 e0       	ldi	r23, 0x00	; 0
  be:	44 e2       	ldi	r20, 0x24	; 36
  c0:	50 e0       	ldi	r21, 0x00	; 0
  c2:	0e 94 53 00 	call	0xa6	; 0xa6 <memset>
  c6:	23 b1       	in	r18, 0x03	; 3
  c8:	2f 73       	andi	r18, 0x3F	; 63
  ca:	e0 e0       	ldi	r30, 0x00	; 0
  cc:	f1 e0       	ldi	r31, 0x01	; 1
  ce:	80 e0       	ldi	r24, 0x00	; 0
  d0:	90 e0       	ldi	r25, 0x00	; 0

  d2:	33 b1       	in	r19, 0x03	; 3
  d4:	3f 73       	andi	r19, 0x3F	; 63
  d6:	32 17       	cp	r19, r18
  d8:	21 f0       	breq	.+8      	; 0xe2 <loop+0x2c>
  da:	91 83       	std	Z+1, r25	; 0x01
  dc:	80 83       	st	Z, r24
  de:	32 83       	std	Z+2, r19	; 0x02
  e0:	33 96       	adiw	r30, 0x03	; 3
  e2:	01 96       	adiw	r24, 0x01	; 1
  e4:	24 e0       	ldi	r18, 0x04	; 4
  e6:	80 3b       	cpi	r24, 0xB0	; 176
  e8:	92 07       	cpc	r25, r18
  ea:	11 f0       	breq	.+4      	; 0xf0 <loop+0x3a>
  ec:	23 2f       	mov	r18, r19
  ee:	f1 cf       	rjmp	.-30     	; 0xd2 <loop+0x1c>

  f0:	08 95       	ret

This is the important bit with clock cycles...

  d2:	33 b1       	in	r19, 0x03	; 3		1
  d4:	3f 73       	andi	r19, 0x3F	; 63		1
  d6:	32 17       	cp	r19, r18			1
  d8:	21 f0       	breq	.+8      	; 0xe2 <loop+0x2c>		2
  da:	91 83       	std	Z+1, r25	; 0x01		2
  dc:	80 83       	st	Z, r24			2
  de:	32 83       	std	Z+2, r19	; 0x02		2
  e0:	33 96       	adiw	r30, 0x03	; 3		2
  e2:	01 96       	adiw	r24, 0x01	; 1		2
  e4:	24 e0       	ldi	r18, 0x04	; 4		1
  e6:	80 3b       	cpi	r24, 0xB0	; 176		1
  e8:	92 07       	cpc	r25, r18			1
  ea:	11 f0       	breq	.+4      	; 0xf0 <loop+0x3a>		2
  ec:	23 2f       	mov	r18, r19			1
  ee:	f1 cf       	rjmp	.-30     	; 0xd2 <loop+0x1c>		2

Summing the cycle column gives 23 against a budget of 16, so we're over budget by 7 cycles and the code will need at least one more rjmp. I think two instructions can be moved or eliminated but that only shaves 2 cycles.

As an added insult, I forgot about Nyquist. Input Capture is the correct choice.

Excellent idea! An ATtiny84 for each of the three lines (it has a 16 bit timer with input capture [nearly?] identical to the ATmega328).

I still would have a couple concerns. @CB, where does the 6µs number come from?

Testing and recollection. The testing was very likely done well. The recollection may be an issue. I remember being able to just barely get something almost useful out of 2us and decided I was wasting my time. :)

That is nearly 100 clocks, could the input capture ISR be that short?

Doesn't need to be very long. Basically, the captured value is saved (in my case, to a ring buffer) and the edge detection is toggled. The time intensive stuff is done outside the interrupt service routine.
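Roughly what that looks like, as a minimal sketch rather than the code I actually used. It assumes an ATmega328P with Timer1 already set up for input capture. The ring buffer type and function names are made up; only the AVR register names are real. The ring buffer itself is plain C, and the ISR only compiles for the AVR.

```c
#include <stdint.h>

#define RING_SIZE 16u   /* power of two so the index wraps with a mask */

typedef struct {
    volatile uint8_t  head;             /* written only by the ISR */
    volatile uint8_t  tail;             /* written only by the main loop */
    volatile uint16_t slot[RING_SIZE];
} ring_t;

/* Called from the ISR. Returns 0 and drops the value if the ring is full.
   One slot is kept free, so the ring holds RING_SIZE - 1 values. */
static uint8_t ring_put(ring_t *r, uint16_t v)
{
    uint8_t next = (uint8_t)((r->head + 1u) & (RING_SIZE - 1u));
    if (next == r->tail)
        return 0;
    r->slot[r->head] = v;
    r->head = next;
    return 1;
}

/* Called from the main loop. Returns 0 if there is nothing to read. */
static uint8_t ring_get(ring_t *r, uint16_t *v)
{
    if (r->tail == r->head)
        return 0;
    *v = r->slot[r->tail];
    r->tail = (uint8_t)((r->tail + 1u) & (RING_SIZE - 1u));
    return 1;
}

#if defined(__AVR_ATmega328P__)
#include <avr/io.h>
#include <avr/interrupt.h>

static ring_t captures;

ISR(TIMER1_CAPT_vect)
{
    (void)ring_put(&captures, ICR1);   /* save the hardware timestamp */
    TCCR1B ^= _BV(ICES1);              /* look for the opposite edge next */
    TIFR1 = _BV(ICF1);                 /* clear the flag the edge change may set */
}
#endif
```

The main loop then drains the ring with ring_get() and does the subtraction and unit conversion there, outside the interrupt.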

That seems like it should be enough...

It's going to have to be. Polling isn't going to work.

...not sure whether that might mandate assembler though.

Let the hardware do the dirty work (timestamping) so we don't have to (assembly).