A better RPM Algorithm - accurate from slow to fast RP2040

I wrote a version of this code ~25 years ago. I recently needed some RPM code for an Adafruit Adalogger RP2040 and I was not too excited about the examples I found. So I recreated the code and thought I would post it for others. I did not come across other code like this – maybe something similar is out there.

This is a better way to measure RPM over a wide frequency range, with good accuracy at both low and high speeds. This code will measure from 60 RPM to over 1 million RPM on the Adafruit Adalogger RP2040 I tested it on.

The "counting pulses over a given time" method can work well enough for fast signals, but goes to pot for slow signals. It also has a jitter issue since the sample window is not synchronized with the incoming pulses.

The "time between edges" method works well for slow signals, but has increasing quantization errors for fast signals, because the timer resolution becomes a larger fraction of each period.

This method is a hybrid that counts edges and measures average period between edges with microsecond-ish accuracy.

The idea is to have a given sample window. This window should be long enough to guarantee that at least two of your lowest RPM rising edges can be captured.

For instance, I want to capture 60 RPM with 4 PPR. The edge frequency is:

(60RPM/60)*4 = 1Hz*4 = 4Hz (250mS period)

To guarantee that I will capture at least two edges, I need to sample for at least 500 mS.
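The same arithmetic as a helper, in case you want to derive the window from your own minimum RPM and PPR. This is only a sketch: the function name is hypothetical, and it assumes both inputs are non-zero.

```cpp
#include <cstdint>

// Shortest sample window (in microseconds) that spans two full edge
// periods at the lowest RPM of interest.
// Assumes min_rpm and ppr are both non-zero.
uint32_t minSampleWindowUs(uint32_t min_rpm, uint32_t ppr)
  {
  // Edges per minute = min_rpm * ppr, so one edge period is
  // 60,000,000 us divided by that. Round up to be safe.
  uint64_t edges_per_minute = (uint64_t)min_rpm * ppr;
  uint64_t edge_period_us = (60000000ULL + edges_per_minute - 1) / edges_per_minute;
  return (uint32_t)(2 * edge_period_us);
  }
```

For 60 RPM at 4 PPR this returns 500000, matching the 500 mS worked out above.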

This algorithm captures the time (using "micros()", which may not be ideal) of the first rising edge (if there is one) within the sampling window, and any subsequent rising edges within the sampling window.

When the sampling window closes, the number of edges, first edge time and last edge time are captured for the foreground to decode:

If edge_count is 0 or 1 then the RPM is below what can be read within your sample window.

If edge count is 2 or more, the RPM can be calculated:

RPM = (captured_edge_count-1)*(60*1000000)/((captured_last_edge_uS-captured_first_edge_uS)*PPR)

There can be some big numbers involved, so you might need some type casting.

Written / tested on the Adafruit AdaLogger RP2040

Peace.

-Baxsie

Here is the code

RPM_Test.ino (7.5 KB)

Thanks for sharing!

Thanks for sharing - may be worth moving to showcase or tutorials ?

Do you want it moved to the showcase category ?

That is perfectly fine with me.

I made some posts here many, many years ago but I haven't been active much since then.

Wherever it's appropriate.

REQUEST FOR SOME HELP MAKING THIS EVEN BETTER

Ideally, I would like to have a hardware timer that could be read inside the ISR in a deterministic manner to timestamp the edges.

Any free-running timer being clocked at 1MHz or more (as long as it does not roll over within the sample window) would help with removing noise from the readings.

Right now I am using micros() to time stamp the edges.

On the good side, micros() on the RP2040 seems to not have the 4uS granularity issue that the old AVR architecture had.

On the bad side, the micros() call and value saving is not consistent or fast. Its execution time varies, and will regularly take up to 1.76us to execute. This is more than the timer period !

I'm a little surprised that micros() was not implemented as a direct hardware read of an RP2040 timer register rather than the complex calculations which I think are holdovers from the old AVR architecture*.

Apparently this can be done with some of the generous peripherals on the RP2040 -- but what I had is working and several hours of searching did not produce a clear example of how to do it on my own.

If you have the knowledge how to do that, could you point me in the correct direction?
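One possible direction, offered as an untested sketch: the pico-sdk header hardware/timer.h exposes the RP2040's free-running 1 MHz timer, and timer_hw->timerawl is a plain 32-bit read of its low word (the SDK's time_us_32() reads the same register). Whether that header is reachable from your Arduino core is an assumption on my part, as is the ARDUINO_ARCH_RP2040 guard. The helper names are made up, and the #else branch is only a stand-in so the code compiles off-target.

```cpp
#include <cstdint>

#if defined(ARDUINO_ARCH_RP2040)
  // pico-sdk header; assumed to be reachable from the Arduino core in use.
  #include "hardware/timer.h"
  // One 32-bit read of the free-running 1 MHz timer (low word, unlatched).
  static inline uint32_t edgeTimestampUs(void)
    {
    return timer_hw->timerawl;
    }
#else
  #include <chrono>
  // Host-side stand-in so the sketch can be compiled off-target.
  static inline uint32_t edgeTimestampUs(void)
    {
    using namespace std::chrono;
    return (uint32_t)duration_cast<microseconds>(
      steady_clock::now().time_since_epoch()).count();
    }
#endif

// Elapsed microseconds between two timestamps. Unsigned wraparound keeps
// this correct across one 32-bit rollover (roughly every 71.6 minutes).
static inline uint32_t elapsedUs(uint32_t start_us, uint32_t end_us)
  {
  return end_us - start_us;
  }
```

In the ISR, edgeTimestampUs() would take the place of micros(). This only removes the variability of the call itself; it does nothing about interrupt entry latency.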


* No shade on the AVR. I used one of what would later be marketed as a MEGA (128 kbytes flash) way back in 1997~1998. One of the first samples I had was an "X" part that had a long list of errata. The AVR was such an amazing processor compared to what else was available.

Ah. As you ask a question it’s no longer fit for showcase and better here for discussions / contributions

Good work!

35 years ago a speedo was built for a car fueled by petrol. It was a 4 cylinder, 4 stroke engine, so the lowest RPM, around 1000, gave 2000 ignition pulses per minute, i.e. 2000/60 pulses per second.

The decision was a 3 Hz display update. That was perfect for the human eye and the human brain. The resolution also didn't have to be 1 RPM. I don't remember what it was.

You use 4 Hz, perfect!

Too often beginners go for as many updates per second as they can create. The result is that the least significant digits read as 8 on an LED display, and likely the same on an LCD.

For a speed control, PID controlled, the demands for precise sampling will be different.

No worries. If I (or the community) can come up with a solution to the micros() fiddly bit then I could make a cleaned up version for the showcase later.

I agree. Also, with this code, you can easily change the sampling window. The only thing it impacts is the lowest speed you can detect.

I spotted a dumb thing I did in the ISR.

I make a decision -- first -- if I should store the micros() reading in the start pulse time or the end pulse time, then I branch -- then -- I call micros().

Better is to capture the time critical micros() first thing, then I branch and store it in the correct variable. D'oh.

//==================================================================================
// This ISR gets hit when a rising edge is received from the shaft rotation encoder.
void ENCODER_RISING_EDGE_ISR(void)
  {
//digitalWrite(DEBUG_PIN, HIGH);
  ISR_micros_snapshot=micros();
  if(0==ISR_edge_count)
    {
    //first sample
    //using micros() seems awkward. It seems like there should
    //be a hardware register I could read directly to get a
    //faster/higher resolution/lower overhead reading from.
    //On the good side, the RP2040 seems to not have the 4uS
    //granularity that the old AVR architecture had.
    //On the bad side, the micros() call and value saving will
    //regularly take up to 1.76us to execute (!!)
    //ISR_first_edge_uS=micros();
    ISR_first_edge_uS=ISR_micros_snapshot;
    }
  else
    {
    //subsequent samples
    //ISR_last_edge_uS=micros();
    ISR_last_edge_uS=ISR_micros_snapshot;
    }
  //remember we captured a sample
  ISR_edge_count++;
//digitalWrite(DEBUG_PIN, LOW);
  }
//==================================================================================

Old code here for reference:

//==================================================================================
// This ISR gets hit when a rising edge is received from the shaft rotation encoder.
void ENCODER_RISING_EDGE_ISR(void)
  {
//digitalWrite(DEBUG_PIN, HIGH);
  if(0==ISR_edge_count)
    {
    //first sample
    //using micros() seems awkward. It seems like there should
    //be a hardware register I could read directly to get a
    //faster/higher resolution/lower overhead reading from.
    //On the good side, the RP2040 seems to not have the 4uS
    //granularity that the old AVR architecture had.
    //On the bad side, the micros() call and value saving will
    //regularly take up to 1.76us to execute (!!)
    ISR_first_edge_uS=micros();
    }
  else
    {
    //subsequent samples
    ISR_last_edge_uS=micros();
    }
  //remember we captured a sample
  ISR_edge_count++;
//digitalWrite(DEBUG_PIN, LOW);
  }
//==================================================================================

and…..in global variables

  volatile uint32_t ISR_micros_snapshot;

LOLz I was raised on assembly language.

The point is to get the algorithm across.

One of the current generation can port it into C++ if it turns out to be useful.

Although -- using a static variable may have performance benefits, since no stack needs to be allocated and no pointers followed. I have not measured that though, so it might not be a factor.

Peace.

I like the algorithm and appreciate your sharing it, as I have found the RPM / tachometer type scripts generally un-entertaining. I have yet to physically try it on an ESP32, for which I've taken the liberty of converting (porting?) it into Arduino type code (I don't have RasPi stuff, and I'm too old to learn/change my meagre knowledge).

cheers and thanks ….Ian

This sounds quite high, around 17kHz. Is this a theoretical target, or do you have some way of testing it at that end of the range? Also, what sensor are you using, and have you had a look at the output on a scope to see how clean the edges are?

Yes, 1 million RPM is quite ridiculous for any normal macro size object. I have used a signal generator to test the code though:

66,666.667 Hz * 60 / 4 PPR = 1,000,000 RPM

The crystal on the Adalogger and the time base of the sig gen must be slightly off :slight_smile:

In any case, the 125MHz RP2040 on the AdaLogger appears to operate fine with 66.7KHz ISR hit rate :zany_face:

The fastest I can test with actual spinny things is my ~30K RPM Dremel tool. I have two pieces of reflective tape on the collet. I am using the Seeed Infrared Reflective Sensor:

The waveform seems clean:

I did not get my tape strips perfect, so every other pulse is a different width.

You can get higher pulse rate by using multiple smaller strips, e.g. 1/4th or maybe even 1/8th of the circumference.

Also optimize the dark parts by “paint them black” - black 4.0 would do

(just thinking out loud)

OK, I tested a couple different places for ISR_micros_snapshot and none seemed to make a performance difference. Here is what I have now --- and it is not a (gasp!) global:

// This ISR gets hit when a rising edge is received from the shaft rotation encoder.
// Execution time (inside the brackets - not including call interrupt and return
// overhead) is less than 500nS.
// Latency from the rising edge to inside the brackets is ~1.63uS (there can be
// errors due to digitalWrite() overhead).
void ENCODER_RISING_EDGE_ISR(void)
  {
  register uint32_t
    ISR_micros_snapshot;    
  ISR_micros_snapshot=micros();
  if(0==ISR_edge_count)
    {
    //first sample
    //using micros() seems awkward. It seems like there should
    //be a hardware register I could read directly to get a
    //faster/higher resolution/lower overhead reading from.
    //On the good side, the RP2040 seems to not have the 4uS
    //granularity that the old AVR architecture had.
    //On the bad side, the micros() call and value saving will
    //regularly take up to 1.76us to execute (!!)
    //ISR_first_edge_uS=micros();
    ISR_first_edge_uS=ISR_micros_snapshot;
    }
  else
    {
    //subsequent samples
    //ISR_last_edge_uS=micros();
    ISR_last_edge_uS=ISR_micros_snapshot;
    }
  //remember we captured a sample
  ISR_edge_count++;
  }  

Apparently the register keyword is largely ignored by the compiler -- which likely already puts local variables in registers anyway.

I was thinking of old school architecture where a stack frame would have to be set up if I used a local. New hardware is great :slight_smile:

Let's take a look at performance.

void ENCODER_RISING_EDGE_ISR(void)
  {
  digitalWrite(DEBUG_PIN, HIGH);
  digitalWrite(DEBUG_PIN, LOW);
  }

Yellow line is the tach signal in (from the signal generator at 66.7KHz).
Blue line is the DEBUG_PIN pulse.

Most of the time the interrupt ISR latency is reasonable and repeatable at ~1.6uS:

But some times it is quite delayed -- up to ~15uS (!):

I assume something on the chip thinks it is more important than my lowly RPM interrupt. Interestingly, this will only impact the reading if it happens on the first or last pulse sampled within the window. Any other pulses will not matter. Even so, the difference (~13.5uS) is still quite small compared to the ~500,000uS sample window. Yay for a robust algorithm.

Execution time. So the empty DEBUG_PIN pulse is ~410nS wide:

Compare this to the entire ISR inside the DEBUG_PIN pulse:

void ENCODER_RISING_EDGE_ISR(void)
  {
  digitalWrite(DEBUG_PIN, HIGH);
  register uint32_t
    ISR_micros_snapshot;    
  ISR_micros_snapshot=micros();
  if(0==ISR_edge_count)
    {
    ISR_first_edge_uS=ISR_micros_snapshot;
    }
  else
    {
    ISR_last_edge_uS=ISR_micros_snapshot;
    }
  ISR_edge_count++;
  digitalWrite(DEBUG_PIN, LOW);
  }

Which is ~725nS wide:

So reading micros(), a conditional assignment, and an increment only costs on the order of 315nS. Wow, I love this processor.

As long as I'm in the weeds, let's look at only the micros() call duration. First let's verify the empty DEBUG_PIN pulse again:

void ENCODER_RISING_EDGE_ISR(void)
  {
  register uint32_t
    ISR_micros_snapshot;    
  digitalWrite(DEBUG_PIN, HIGH);
  digitalWrite(DEBUG_PIN, LOW);
  ISR_micros_snapshot=micros();
  if(0==ISR_edge_count)
    {
    ISR_first_edge_uS=ISR_micros_snapshot;
    }
  else
    {
    ISR_last_edge_uS=ISR_micros_snapshot;
    }
  ISR_edge_count++;
  }

Comes out at about the ~410nS as expected:

Now let's capture the micros() call:

void ENCODER_RISING_EDGE_ISR(void)
  {
  register uint32_t
    ISR_micros_snapshot;    
  digitalWrite(DEBUG_PIN, HIGH);
  ISR_micros_snapshot=micros();
  digitalWrite(DEBUG_PIN, LOW);
  if(0==ISR_edge_count)
    {
    ISR_first_edge_uS=ISR_micros_snapshot;
    }
  else
    {
    ISR_last_edge_uS=ISR_micros_snapshot;
    }
  ISR_edge_count++;
  }

Most of the time it is quite reasonable. 640nS - 410nS is a snappy ~230nS:

But some times micros() wanders off into the woods as bad as this post (1.91uS - 410nS = ~1.5uS):

And that is why I was looking for a hardware timer I could read deterministically. Although, compared to the interrupt latency and sample window, it may not be worth going after.

That is enough navel gazing for me. Today.

I've done something similar with my Dremel but I had just a white electrical tape all around the collet and drew multiple lines with a black sharpie so I would get actually lots of on/off for every round.