Making the compiler comply

I have this input capture ISR:

// Interrupt Capture vector

ISR(TIMER1_CAPT_vect) {

  if ( bit_is_set(TCCR1B , ICES1)) {    // was rising edge detected ?
      TCNT1 = 0; // set start time to zero. 
      TCCR1B ^= _BV(ICES1);                 // toggle bit value to trigger on the other edge
  }
  else {                                // falling edge was detected
    TCCR1B ^= _BV(ICES1);                 // toggle bit value to trigger on the other edge
    eVal = ICR1;                        // save the end time
  }
}

As you can see, the code for each edge of the input contains TCCR1B ^= _BV(ICES1); to flip the edge sense in preparation for the next input change of state. I’m basing this on the datasheet statement that edge sense should be adjusted as soon as possible in the ISR. An assembly listing shows that the compiler thinks differently about how this should be done - address 506.

 // Interrupt Capture vector

ISR(TIMER1_CAPT_vect) {
 4d0:	1f 92       	push	r1
 4d2:	0f 92       	push	r0
 4d4:	0f b6       	in	r0, 0x3f	; 63
 4d6:	0f 92       	push	r0
 4d8:	11 24       	eor	r1, r1
 4da:	8f 93       	push	r24
 4dc:	9f 93       	push	r25

  if ( bit_is_set(TCCR1B , ICES1)) {    // was rising edge detected ?
 4de:	80 91 81 00 	lds	r24, 0x0081	; 0x800081 <__TEXT_REGION_LENGTH__+0x7e0081>
 4e2:	86 ff       	sbrs	r24, 6
 4e4:	05 c0       	rjmp	.+10     	; 0x4f0 <__vector_10+0x20>
//    sVal = ICR1;                        // save the start time
    TCNT1=0;  // set start time to zero.  
 4e6:	10 92 85 00 	sts	0x0085, r1	; 0x800085 <__TEXT_REGION_LENGTH__+0x7e0085>
 4ea:	10 92 84 00 	sts	0x0084, r1	; 0x800084 <__TEXT_REGION_LENGTH__+0x7e0084>
 4ee:	08 c0       	rjmp	.+16     	; 0x500 <__vector_10+0x30>
    // requires 8 clocks.  ICR will latch the pulse width (in cycles)
    // accumulated starting from zero without requiring later subtraction.

  }
  else {                                // falling edge was detected
    eVal = ICR1;                        // save the end time
 4f0:	80 91 86 00 	lds	r24, 0x0086	; 0x800086 <__TEXT_REGION_LENGTH__+0x7e0086>
 4f4:	90 91 87 00 	lds	r25, 0x0087	; 0x800087 <__TEXT_REGION_LENGTH__+0x7e0087>
 4f8:	90 93 33 01 	sts	0x0133, r25	; 0x800133 <eVal+0x1>
 4fc:	80 93 32 01 	sts	0x0132, r24	; 0x800132 <eVal>
  }
  TCCR1B ^= _BV(ICES1);                 // toggle bit value to trigger on the other edge
 500:	90 91 81 00 	lds	r25, 0x0081	; 0x800081 <__TEXT_REGION_LENGTH__+0x7e0081>
 504:	80 e4       	ldi	r24, 0x40	; 64
 506:	89 27       	eor	r24, r25
 508:	80 93 81 00 	sts	0x0081, r24	; 0x800081 <__TEXT_REGION_LENGTH__+0x7e0081>
}
 50c:	9f 91       	pop	r25
 50e:	8f 91       	pop	r24
 510:	0f 90       	pop	r0
 512:	0f be       	out	0x3f, r0	; 63
 514:	0f 90       	pop	r0
 516:	1f 90       	pop	r1
 518:	18 95       	reti

What goes down to the chip has only one EOR instruction. How do I force the compiler to assemble this as written?

The compiler factored out the XOR, common to both branches. Why do you think this won’t work?

You can turn off optimization, at significant cost to execution efficiency and code size.

What does this get you?

ISR(TIMER1_CAPT_vect) {
  TCCR1B ^= 1 << ICES1;

  if (TCCR1B & (1 << ICES1)) {
    eVal = ICR1;
  }
  else {
    TCNT1 = 0;
  }
}

jremington:
The compiler factored out the XOR, common to both branches. Why do you think this won't work?

I believe it will work. I was just trying to (misguidedly?) squeeze a little more responsiveness out of the ISR - to be able to catch a shorter pulse by toggling the edge select sooner than it would have been done otherwise.

gfvalvo:
What does this get you?

I was considering something along that line. Let me try it.

This bit is the crucial part of a small test program. Anyway, the lowest reported pulse count previously was eleven. Shortening the pulse generator, TC1, below ~41 (clocks) would bring up varying large numbers. Like 58xxx. Effects of overflow, I’m guessing. Putting the toggle at the front I can now adjust the pulse width down to 36 clocks and reduce that number to zero. A big improvement! I don’t know exactly what this means though. I’m going to count clock cycles to see if *that * number has any relation to the values I’m seeing and using now. I may have to think about an o’scope. :slightly_frowning_face:

Excluding all the ISR push/pop overhead I am left with this, which more closely conforms to the C++.

 TCCR1B ^= 1 << ICES1;
 4de:	90 91 81 00 	lds	r25, 0x0081	; 0x800081 <__TEXT_REGION_LENGTH__+0x7e0081>
 4e2:	80 e4       	ldi	r24, 0x40	; 64
 4e4:	89 27       	eor	r24, r25
 4e6:	80 93 81 00 	sts	0x0081, r24	; 0x800081 <__TEXT_REGION_LENGTH__+0x7e0081>

  if (TCCR1B & (1 << ICES1)) {
 4ea:	80 91 81 00 	lds	r24, 0x0081	; 0x800081 <__TEXT_REGION_LENGTH__+0x7e0081>
 4ee:	86 ff       	sbrs	r24, 6
 4f0:	09 c0       	rjmp	.+18     	; 0x504 <__vector_10+0x34>
    eVal = ICR1;
 4f2:	80 91 86 00 	lds	r24, 0x0086	; 0x800086 <__TEXT_REGION_LENGTH__+0x7e0086>
 4f6:	90 91 87 00 	lds	r25, 0x0087	; 0x800087 <__TEXT_REGION_LENGTH__+0x7e0087>
 4fa:	90 93 33 01 	sts	0x0133, r25	; 0x800133 <eVal+0x1>
 4fe:	80 93 32 01 	sts	0x0132, r24	; 0x800132 <eVal>
 502:	04 c0       	rjmp	.+8      	; 0x50c <__vector_10+0x3c>
  }
  else {
    TCNT1 = 0;
 504:	10 92 85 00 	sts	0x0085, r1	; 0x800085 <__TEXT_REGION_LENGTH__+0x7e0085>
 508:	10 92 84 00 	sts	0x0084, r1	; 0x800084