Is this the best way to read rotary encoder?

Hi,

I am using a rotary encoder by reading the pin with attachInterrupt(pin, ISR, CHANGE) instead of RISING or FALLING, satisfactorily. But I would like to share and discuss it with the forum. Let me post my code first (I am using Arduino Mega 2560):

volatile long encoderPos = 0;

//both pin 21 and 20 is in PORTD. pin 21 is PIND>>PD0, pin 20 is PIND>>PD1
attachInterrupt( digitalPinToInterrupt(21), encode, CHANGE );

void encode() // interrupt service routine
{
if( ((PIND>>PD0) & 1) ^ ((PIND>>PD1) & 1) )
encoderPos++;
else
encoderPos--;
}

I am using both rising and falling edges as one step. First, it doubles the resolution. Second, in my opinion, it simplifies the code (speeds up the execution), but does it?

First I don't have to do all that if-and-else-if thing when only detecting rising or falling edges, because whenever pin 21 state changes (rising or falling), all I have to do is test the difference (the exclusive OR) with pin 20, if different, one direction; if not, the other direction. Here with only one logic test, I get increase or decrease. Although the code looks very neat, one thing I am not sure about is, in this one logic test I have two PORT reads, two binary shifts, and two binary AND, do those all count as single steps or can I lump them all in one step?

I do think using direct port reading speeds things up, BUT is there a better way to do this, instead of reading, shifting ANDing? Or at least I don't want to read the port twice, should read it once and use it twice?

The reason I pick these pins is that they have the highest priority on the data sheet; I don't ever want to lose an encoder pin reading, right?

Appreciate any input.

Hugh

I think I don't have to do the shifting, should be able to access one pin directly. So it should be:

void encode() // interrupt service routine
{
if( ((PIND0) & 1) ^ ((PIND1) & 1) )
encoderPos++;
else
encoderPos--;
}

That makes it even better.

That's incorrect, you cannot process an encoder using only the current state.

The way to process is this:

read the two pins (simultaneously if possible).

Compare the current state of the two pins to the previous state of the two pins.
this will indicate if the count should increase, decrease, or stay the same (no changes).

Record the new pin values as the state for next time.

The state transitions go 00->01->11->10->00 for one direction,
00->10->11->01->00 for the other.

Why? What is this extreme optimization for?

If this is a high speed shaft encoder that is giving you kiloHertz pulses and the ISR is using an unacceptable amount of your CPU time then this may be a good idea. If it is a knob turned by hand then use Paul Stoffregen's Encoder library. It is already highly optimized for all the major Arduino architectures.

This code is highly non-portable. It is difficult even to change pins. When you find that you need to change to a different Arduino for any reason then you have to come back here and fix this part.

Skipping every second transition by only interrupting on one of the two pins is a valid optimization if you don't need the resolution and you don't have enough interrupt pins. Many knob encoders will work well with this because they have two electrical transitions between each mechanical detent. But some won't. Do you know which type you have?

  if( ((PIND0) & 1) ^ ((PIND1) & 1) )This will never work.

XiaopingHugh:
That makes it even better.

Yes! You don't even need to spend money on an encoder using that code.

MorganS: I am using it for a high resolution rotary encoder and doing control stuff, therefore I do want to make as efficient as possible, but not sure how efficient this piece of code is.

boolrules: I now don't think the PIND0 and PIND1 thing will work (MarkT: It does work with the original code, without this change). But I would like to ask this: what is PIND0? What does it do if I get the value for this pin?
(it's not the reading of one pin, and you have to read one port at time, I guess?)

Thanks for all inputs.

Hugh

"As efficient as possible" isn't a specification.

What specific performance problems are you having which require you to do this extreme optimization here? Maybe you can re-factor a floating-point calculation elsewhere in the code to give this section enough time to do its job?

Remember programmer time is much more expensive than CPU time. If this code works then stop trying to optimize it and move on to the next project.

XiaopingHugh:
boolrules: I now don't think the PIND0 and PIND1 thing will work (MarkT: It does work with the original code, without this change). But I would like to ask this: what is PIND0? What does it do if I get the value for this pin?

Exactly. What is PIND0? Somewhere there should be some code that declares what PIND0 is.
I can only suppose that PIND0 might be the D0 pin on some arduino board.
If so, it could be declared as something like:

#define PIND0  5
// or maybe
#define PIND0  D0

I just depends. But given that it is a pin number, then your code should be something like:

if( digitalRead( PIND0 ) ^ digitalRead( PIND1 ) ) {
    // code
}

But it's a guess.

Bool, you need to read up on "direct port manipulation".

Then forget what you read because it is confusing and very rarely required in any Arduino program.

MorganS:
Then forget what you read because it is confusing and very rarely required in any Arduino program.

Can't do that. Currently working on this:

    TC4->COUNT8.CTRLA.bit.MODE = TC_CTRLA_MODE_COUNT8_Val;          // 2-> Set MODE: 8-bit counter
    while( REG_TC4_STATUS & 0x80 )
        wait++;

    TC4->COUNT8.CTRLA.bit.WAVEGEN = TC_CTRLA_WAVEGEN_NPWM_Val;      // 3-> Set WAVEGEN: PWM 
    while( REG_TC4_STATUS & 0x80 )
        wait++;

    TC4->COUNT8.CTRLA.bit.PRESCSYNC = TC_CTRLA_PRESCSYNC_PRESC_Val; // 4-> Set Presync: load counter
    while( REG_TC4_STATUS & 0x80 )
        wait++;

    TC4->COUNT8.CTRLA.bit.PRESCALER = TC_CTRLA_PRESCALER_DIV1024_Val;// 5-> Set Prescaler: DIV256
    while( REG_TC4_STATUS & 0x80 )
        wait++;

MarkT,

I am reading two pins at a time, in the code: if( ((PIND>>PD0) & 1) ^ ((PIND>>PD1) & 1) ):

PIND is the byte containing the reading of PORTD, in which bit 0 is pin 21, and bit 1 is pin 20, by the ATMEL 2560 data sheet. So I shift those bits to bit 0, AND with 1 (masking out all high bits), and then do an exclusive bitwise OR. What the code did was each time pin 21 has changed (interrupt triggered), I compare this pin (21) with the other pin (20), as long as they are different, increase encoder count, otherwise decrease. This code works. If you look at the quadrature signal, whenever a change occurs (be it rising or falling), if the two signals are different, it's one direction, otherwise the other direction. It should work by its theory. I just want to make sure it IS as efficient as I think is. But as Morgan pointed out, efficiency should not concern me until it doesn't work. My concern actually is: what if I do have issue with CPU time, is this code as efficient as it could be?

XiaopingHugh:
MarkT,

I am reading two pins at a time, in the code: if( ((PIND>>PD0) & 1) ^ ((PIND>>PD1) & 1) ):

PIND is the byte containing the reading of PORTD, in which bit 0 is pin 21, and bit 1 is pin 20, by the ATMEL 2560 data sheet. So I shift those bits to bit 0, AND with 1 (masking out all high bits), and then do an exclusive bitwise OR. What the code did was each time pin 21 has changed (interrupt triggered), I compare this pin (21) with the other pin (20), as long as they are different, increase encoder count, otherwise decrease. This code works. If you look at the quadrature signal, whenever a change occurs (be it rising or falling), if the two signals are different, it's one direction, otherwise the other direction. It should work by its theory. I just want to make sure it IS as efficient as I think is. But as Morgan pointed out, efficiency should not concern me until it doesn't work. My concern actually is: what if I do have issue with CPU time, is this code as efficient as it could be?

Yes.

No, because you're using the default attachInterrupt() and regular C interrupt processor. There are a few more cycles you can squeeze out of it. Basically when an interrupt occurs, it has to save the values of the current registers inside the processor, just in case your interrupt code changes one of them. But you know which ones you're going to damage, so you could process the interrupt in your own assembler code which doesn't save all registers.

That kind of optimisation is rarely needed. But you might need it 10 years from now, so it's good to learn about the possibility.

Conversely, a regular C function call knows which registers are used by the calling function and which ones are going to be damaged by the callee, so the optimising compiler does this anyway for most function calls.

Honestly, I've never needed to look into it this deeply. Even if your hobby time is free, you quickly get to the point that buying the $30 Arduino which is faster than the $20 Arduino is the "most efficient" way to solve this problem.