pulseIn Bug

From the picture (more than good enough) I see the following: correct me if I'm wrong

  • The blue signal looks like it really takes time to get up to its max => looks like a "big capacitor effect". Are you using long/thick cables?
  • The blue signal is 3V. IIRC this is in the undefined range of TTL levels. It just could be that the HIGH level is not seen as HIGH

robtillaart:
From the picture (more than good enough) I see the following: correct me if I'm wrong

  • The blue signal looks like it really takes time to get up to its max => looks like a "big capacitor effect". Are you using long/thick cables?
  • The blue signal is 3V. IIRC this is in the undefined range of TTL levels. It just could be that the HIGH level is not seen as HIGH
  1. You can ignore the ramp up effects...that's a hardware issue in my custom shield that I'm aware of.

  2. Yep, I'm playing with a "black box" that uses 3V logic. I've checked the datasheet for the MEGA2560 (Page 267), and anything over .6*Vcc (3V) is HIGH, so it should be fine, though it is pretty close to the edge. I've got a hardware fix I'm working on to do proper level shifting (would also fix issue in #1), but I haven't bothered with that yet because I've seen this hardware work.

So yes, you are correct on both, but neither actually matter. I've proven, to myself at least, that going from Arduino 1.0 to 0021 fixes the problem, at least at a quick glance. I encourage you to try a similar setup, at 5V. I'd like to try that, but I don't have time to do that right now.

Does that trace show working or non-working code?

The behavior shown there looks correct to me.


Rob

Graynomad:
Does that trace show working or non-working code?

The behavior shown there looks correct to me.

That image shows it working, when using Arduino 0021. If you clicked on the link above, you would see the original post (before I thought this was a bug) that shows the following image of it not working.

The code is the same. The only difference is what compiler I'm using. The bug does not appear to be in the pulseIn function itself, though there are some slight modifications between versions. I suspect the problem lies in interrupt handlers, perhaps for Serial communications, but I don't have the time, or experience, to dig in and find the solution myself.

I hope that the developers responsible for the Arduino IDE see this (or, tell me where their bug tracker is, because I couldn't find one linked on this website), because this seems like a pretty clear bug to me. Behavior should not change depending on the compiler used, especially without giant warnings to let users know of possible issues.

I suspect the problem lies in interrupt handlers,

An interrupt at the wrong moment can indeed "ruin" measurements, but is your code receiveing bytes while measuring? If no then there are no interrupts...

robtillaart:

I suspect the problem lies in interrupt handlers,

An interrupt at the wrong moment can indeed "ruin" measurements, but is your code receiveing bytes while measuring? If no then there are no interrupts...

Nothing else is happening when I am trying to take these measurements, but there is an open serial port (using the Serial commands).

For all intents and purposes, this appears to be resolved on my part by moving to an older Arduino IDE. I do apologize for being a "this isn't my problem" type of person, but I simply don't have time to troubleshoot this. All I can say is I've found what appears to be a regression bug, and I want to make sure the right people know.

iwoloschin:
For all intents and purposes, this appears to be resolved on my part by moving to an older Arduino IDE. I do apologize for being a "this isn't my problem" type of person, but I simply don't have time to troubleshoot this. All I can say is I've found what appears to be a regression bug, and I want to make sure the right people know.

But until someone else can reproduce the issue you have reported, nothing will happen.

Iain

sixeyes:

iwoloschin:
For all intents and purposes, this appears to be resolved on my part by moving to an older Arduino IDE. I do apologize for being a "this isn't my problem" type of person, but I simply don't have time to troubleshoot this. All I can say is I've found what appears to be a regression bug, and I want to make sure the right people know.

But until someone else can reproduce the issue you have reported, nothing will happen.

Iain

I've encouraged others to do so. I believe it should be easy enough to do, though you may need two Arduinos (or a function generator) since you won't be able to read a pulse while triggering another one, though you might be able to do something with interrupts, but that might have unintended results. Voltage shouldn't matter, so long as your LOW is under 0.1Vcc (500 mV) and your HIGH is over 0.6Vcc (3 V). I suspect it won't take long to do a quick test, but it would take some effort to properly regression test this. I've already done my "quick" testing, and I don't have the time, or knowledge of the Arduino IDE's guts, to do proper regression testing.

iwoloschin:
I just started marching backwards through the Arduino IDEs, starting at 1.0. I noticed this error in everything newer than 0021.

From the Release Notes:
ARDUINO 0022 - 2010.12.24

  • Applying the timeout parameter of pulseIn() during measurement of the pulse, not just while waiting for it.

What value is your timeout (usLong)? Perhaps you are encountering timeout errors.

I thought that might have something to do with it too, but I don't know why that would matter. My timeout (usLong) is set to 100 mS, or 10,000 uS. According to the reference page*, it needs to be between 10 uS and 3 minutes (1.8 x 108 uS), and it also says the following:

Reads a pulse (either HIGH or LOW) on a pin. For example, if value is HIGH, pulseIn() waits for the pin to go HIGH, starts timing, then waits for the pin to go LOW and stops timing. Returns the length of the pulse in microseconds. Gives up and returns 0 if no pulse starts within a specified time out.

I don't see why that change should be affecting this, as the pulses I'm measuring are 2 orders of magnitude below my timeout period.

*pulseIn() - Arduino Reference

My bugreport - Google Code Archive - Long-term storage for Google Code Project Hosting. - included math problems due to macro expansion with timeout too ... But the effect was that "long" timeouts went off too early due to overflow, not that it gave too large values....

robtillaart:
My bugreport - Google Code Archive - Long-term storage for Google Code Project Hosting. - included math problems due to macro expansion with timeout too ... But the effect was that "long" timeouts went off too early due to overflow, not that it gave too large values....

It looks like pulseIn is reporting the correct pulse length, just not returning right away which causes it to miss the next (closely, but not too closely) spaced pulse.

That could be due to unoptimized math happening at the end, I haven't really looked into that though.

When I look at your scope captures, I do not see a bug in pulsein() at all.
It sure looks to me like pulsein() is working exactly as it is supposed to
in both the "working" picture and the "non working" pictures.
(the problem appears to be elsewhere)

From what I can tell it looks like the Red wave is being set high just before pulsein() is called
and lowered just after pulsein() returns and the blue one is the signal being timed with pulsein().

Note: Given that there a different number of red waves in the two pictures,
and the timing in between red pulses (calls to pulsein() ), is substantially different,
it leads me to question whether the two pictures were actually taken
with identical sketches.
Ignoring that for a moment.......

You can see that in the "non working" picture, the red wave rises later - much later
than in the "working" picture.
So that in the "non working" case, the 2nd pulsein() appears to start when the wave to be timed
is low and since it has been told to time a LOW pulse, it must now wait for it to go high and then low again.
And if you look at when the RED wave drops, I don't not see a very long amount of time
after the waves goes high after the timing of the low pulse ends.
It looks to be about the overhead of digitalWrite().
And the amount of time seems to be consistent with the first pulsein().

BTW. For any serious type of timing you don't want to use digitalWrite() as it is WAY too slow.
Use direct port i/o. The overhead will reduce from close to 6us to 125ns to strobe a pulse.

But overall, it looks to me like pulsein() is working just fine.

The question is what is the code doing after the first pulsein() returned which must happen
in order to set the red wave low and the setting of the red wave high which is done
just before the second call to pulsein()?
This is code that is outside pulsein().

From what I see in the pictures, it appears that the code used to create the waves
in the two pictures is not the same, either that or something kicks in after the first pulse and is
hammering the processor with interrupts and slowing it down.

--- bill

I think you are correct bperrybap, it appears that the digitalWite() is taking too long and the next pulseIn() occurs in a LOW and therefore has to wait.

Has digitalWrite() changed between versions?

Using direct port writes should prove that one way or another.


Rob

Graynomad:
I think you are correct bperrybap, it appears that the digitalWite() is taking too long and the next pulseIn() occurs in a LOW and therefore has to wait.

Has digitalWrite() changed between versions?

Using direct port writes should prove that one way or another.


Rob

Yes digitalWrite() has changed.
In fact I pushed the Arduino team to change it because as it was it would corrupt data bits in a port
when pins were shared between an ISR and normal code on the same port.
It took about a year to convince them that they needed to make the change.
http://code.google.com/p/arduino/issues/detail?id=146
In 0019 digitalWrite() and pinMode() were changed to save the SR, mask interrupts and then restore the SR
to set/clear bits.

But I don't believe that is the issue here.
If you look at the graphs, they do not seem to indicate this.
In both the working and non working times, the first pulsein() works just fine.
It is on the "non working" graph that you see a very long pause between the dropping of Y/RED
and the rising of Y/RED. It looks like several hundred microseconds.
That is way longer than the few hundred nanoseconds that was added to mask and restore interrupts.

--- bill

Again, for those of you who doubt me, I ask you to try playing with this yourself. As I've said, I'm pretty sure the voltage doesn't matter, a HIGH is anything over 3V.

The sketch was used for both images, some stuff may have changed elsewhere, but not this part. Additionally, 6uS for digitalWrite is still fast enough that it pulseIn should be triggering properly, as there's at least 40 uS between pulses.

The only thing that changed, that I can find a difference for, is going from Arduino 1.0 back to Arduino 0021. Again, I urge someone with more experience with the Arduino environment to regression test this. If you're unwilling, well, that's kind of poor form. If you're unable, at least say so. If you're convinced I'm doing something wrong, I ask that you "prove it" to me by actually running a test. I've done my testing, on several different Arduinos and found a workable, though unexplained solution. The engineering side of me would love to know why there's a difference like this, but on the other hand, if the Arduino community doesn't seem to want to test this...I'm not going to waste any more of my time on it.

At the moment I am unable (resources) to test. To confirm the bug we need two sketches - one generating a pulsetrain - the other measuring it - to confirm the bug. Something like this:

simplest pulse generating sketch

#define PULSEPIN 5

unsigned long timeHigh = 100;  // to be tweaked to take digitalWrite() into account
unsigned long timeLow = 100;

void setup()
{
  Serial.begin(115200);
  Serial.println("Arduino Pulse Generator V0.1");
  Serial.print("micros  LOW: ");
  Serial.println(timeLow);
  Serial.print("micros HIGH: ");
  Serial.println(timeHigh);

  pinMode(PULSEPIN, OUTPUT);
  while(1)
  {
    digitalWrite(PUSLEPIN, HIGH);
    delayMicroseconds(timeHigh);
    digitalWrite(PUSLEPIN, LOW);
    delayMicroseconds(timeLow);  
  }
}
void loop(){}

simplest pulse measuring sketch

#define PULSEPIN 5

unsigned long duration; 
unsigned long timeLow;

void setup()
{
  Serial.begin(115200);
  Serial.println("Arduino Pulse Measurement V0.1");

  pinMode(PULSEPIN, OUTPUT);
  while(1)
  {
    duration = pulseIn(PUSLEPIN, HIGH);
    Serial.print("Duration: ");
    Serial.println(duration);
  }
}
void loop(){}

iwoloschin:
My timeout (usLong) is set to 100 mS, or 10,000 uS.

Which is it? 100 mS (100,000 uS) or 10,000 uS (10 mS). Perhaps you should show the code.

Ok, so I'll admit that I initially misunderstood the transitions of your Y/RED wave based on the sample
code you showed us.
And that the transitions are on each call to pulsein() vs measuring the time around each pulsein(),
my bad on that, however, I don't think it changes the interpretation of what is happening.

I guess where I'm at is that the graphs in the pictures you've shown so far do not seem indicate a bug with pulsein().
Those graphs appear to show pulsein() working just as it should - but.....
do seem to indicate that there are certain uses of pulsein() that worked in the past that may have
have issues with the new pulsein() now that the timeout mechanism was added to the pulse measurement in 0022.

My suspicion at this point it that, that are are 2 things that are contributing to your measurement
code not working as anticipated in the newer releases, and while it may make
your previously working implementation fail to work, unfortunately I wouldn't necessarily call it a bug in pulsein().
It is more of case where behavior has changed.

Take a look at the pictures you've provided.
While its a bit hard to see given the scale,
would you agree that the "non working" pulesin() example appears to show that the 2nd pulsein()
(as indicated but the first falling edge of Y) was just a bit later than
than the 2nd pulsein() in the "working" example based?

The width of the first red pulse is just a bit wider in the "non working" case.

In the "working" picture you can see that X is high is when Y/RED drops and that it is near
the front part of the high part of the X pulse.
In the "non working" picture, it appears that X is already low or just about to be low
when Y/RED drops.

If you don't agree, I'd like to understand the different interpretation.

Since there is some additional overhead between Y/RED dropping and pulsein() being called much
less being able to time the desired pulse, it appears in the "non working" example that the call to pulsein()
is not soon enough to to catch the first low pulse so it must wait for the 2nd low pulse.

So while the "non working" example isn't measuring the pulse you want, the picture
seems to show pulsein() measuring the first pulse it could given when pulsein() is being called.

So the real question becomes, why is the second call to pulsein() delayed in the non working case?

I can think of 2 off hand:

  • using digitalWrite()
  • pulsein() changed between 0021 and 0022 (comment shows granularity of width measurement doubled)

Lets look at each one:


digitalWrite()

Like I said before using digitalWrite() is not a good thing to use for any type of critical timing measurement
as it inserts overhead. LOTS of overhead vs doing direct port i/o.
Just because the digitalWrite() overhead is much shorter than your 40us period between pulses means nothing.
It could be that there is already 35 us of overhead before the pulsein() call is actually to the point in its internal code
of being able to start watching the signal line. In that case if digitalWrite() adds more than 5us of additional
overhead, your measuring code "breaks".
Keep in mind that when you see the output signal change, that there is still additional timing
overhead before the digitalWrite() returns.
digitalWrite() changed in 0019 to prevent port bit corruption when using digitalWrite() in ISRs
(I verfied that it was in 0019), That change makes digitalWrite() a little bit slower and also adds additional
overhead between the signal changing and the actual return of digitalWrite().
So after 0018 not only did digitalWrite() get a bit slower, but the latency between when a signal changes
and when the digitalWrite() returns has increased slightly.

My advice: NEVER, NEVER, NEVER use it for things like this.

pulsein() code changing

This is more than likely the biggest contributor to the failure.
The measurement code in pulsein() was changed in between 0021 and 0022
From this:

	// wait for the pulse to stop
	while ((*portInputRegister(port) & bit) == stateMask)
		width++;

	// convert the reading to microseconds. The loop has been determined
	// to be 10 clock cycles long and have about 16 clocks between the edge
	// and the start of the loop. There will be some error introduced by
	// the interrupt handlers.
	return clockCyclesToMicroseconds(width * 10 + 16);

to this:

	// wait for the pulse to stop
	while ((*portInputRegister(port) & bit) == stateMask) {
		if (numloops++ == maxloops)
			return 0;
		width++;
	}

	// convert the reading to microseconds. The loop has been determined
	// to be 20 clock cycles long and have about 16 clocks between the edge
	// and the start of the loop. There will be some error introduced by
	// the interrupt handlers.
	return clockCyclesToMicroseconds(width * 21 + 16);

It was to add the timeout to the pulse measurement itself, to prevent the code
from hanging during pulse measurement.

While that seems innocent enough, it does affect how quickly pulsein() "notices" the pulse changing
which affects the granularity of the measurement and the latency before the pulsein() call returns.
One thing that concerns me is that the comment does not match the code.
comment says 20 and code uses 21 this extra +1 is not consistent with the previous loop
count of 10.
Regardless, the impact of the additional code is that it takes longer for the code to recognize a
transition. If it really is an additional 11 cycles over the previous code,
that is 11*62.5ns (using a 16mz clock) or 687.5ns is added to the granularity.
So while the old loop had a granularity of 625ns, the new loop has a granularity of 1312ns.
This means that you could potentially get stuck for up to an additional ~ 1.3us longer than
the real pulse length.


My guess as to what is happening here is that the new pulsein() granularity due to the timeout
capability combined with the dititalWrite() overhead, is creating just enough latency for the the 2nd pulsein()
call to just barely miss the start of desired pulse.

While technically, it isn't a "bug" it sure can cause previously working code to fail.

The documentation for pulseIn() really doesn't say how it handles the timeout of the
pulse itself. In fact the documentation on pulsein() currently clearly states that the timeout
is for waiting for the pulse to start and not for the pulse itself.

Having a granularity of 21 cycles seems like a lot and seems like it has the potential to break things,
like it seems to be doing in this case.
At a minimum, it creates additional error in the measurements.

Somebody thought that having a timeout for the actual pulse measurement was a good
thing and it got added in. However, it came with the cost of reducing the accuracy
of the measurements and everyone is now forced to live with this even if they
intentionally don't specify a timeout.

I guess the pulsein() code could be smarter when not using a timeout
so that it branched around to a different set of code that didn't do the timeout checks
vs just creating a really long timeout.
This would give better granularity and lower latencies with added risk of getting stuck
waiting for the signal to change.

pulseIn() could also be extended to have two timeouts,
one for the amount of time waiting for the pulse and one for the pulse itself to allow
the user to have better control over the timeouts.
i.e.

pulseIn(pin, value);
pulseIn(pin, value, waitforpulsetimeout);
pulseIn(pin, value, waitforpulsetimeout, pulselengthtimeout);

and then be smart enough to use code that does not do the checks when timeouts are not specified
so that better granularity and accuracy could be achieved.

I think going with double timeouts would offer the best flexibility and preserve the old
(pre 0022) behavior while adding the option for a timeout on the actual pulse measurement
all at the option of the user.
But the code has to be re-written to be smarter or potentially simply have 3 different routines
so that when timeouts are not specified, the pin monitoring loops are tighter.

--- bill

100 mS, I must've missed a 0 when typing in the uSeconds.

I'm not sure I can actually post the code as it's not mine.