While loop preventing ISR from working

I am having some difficulty identifying the problem with this library I am developing… It appears to me that when this code in the library gets executed

while (WDT_POOL_COUNT < 1)
    waiting += 1; // Just a dummy until ISR creates entropy pool

it never exits, which it should when the ISR puts some data back into the entropy pool.

Using this sketch, which triggers the above while loop immediately, it never sends any data to the PC,

#include <Entropy.h>

void setup()
{
  Serial.begin(115200);
  Entropy.Initialize();
}

void loop()
{
    Serial.println(Entropy.random());
}

However, the following sketch, which ensures that the entropy pool contains data before calling the random function and therefore never enters the while loop works fine, reporting a series of random values…

#include <Entropy.h>

void setup()
{
  Serial.begin(115200);
  Entropy.Initialize();
}

void loop()
{
   if(Entropy.avalable()>1)
    Serial.println(Entropy.random());
}

I have created a standalone c++ file that includes the library and replicates the function of the first sketch and run it with the simulator in AVR studio, and get the same result… For some reason when the while loop is executed, the ISR never gets triggered… Any ideas?

Here is the code for the library…

Entropy.h

#ifndef Entropy_h
#define Entropy_h

#include <stdint.h>
#include <avr/interrupt.h>
#include <avr/wdt.h>

const uint32_t WDT_RETURN_BYTE=256;
const uint32_t WDT_RETURN_WORD=65536;

union ENTROPY_LONG_WORD 
{
  uint32_t int32;
  uint16_t int16[2];
  uint8_t int8[4];
};

class EntropyClass
{
public:
  void Initialize(void);
  uint32_t random(void);
  uint8_t available(void);
 private:
  ENTROPY_LONG_WORD share_entropy;
  uint32_t retVal;
};
extern EntropyClass Entropy;
#endif

Entropy.cpp

#include <Entropy.h>

const uint8_t WDT_BUFFER_SIZE=32;
const uint8_t WDT_POOL_SIZE=8;
uint8_t WDT_BUFFER[WDT_BUFFER_SIZE];
uint8_t WDT_BUFFER_POSITION;
uint8_t WDT_POOL_START;
uint8_t WDT_POOL_END;
uint8_t WDT_POOL_COUNT;
uint8_t WDT_LOOP_COUNTER;
uint32_t WDT_ENT_POOL[WDT_POOL_SIZE];

// This function initializes the global variables needed to implement the circular entropy pool and
// the buffer that holds the raw Timer 1 values that are used to create the entropy pool.  It then
// Initializes the Watch Dog Timer (WDT) to perform an interrupt every 2048 clock cycles, (about 
// 16 ms) which is as fast as it can be set.
void EntropyClass::Initialize(void)
{
  WDT_BUFFER_POSITION=0;
  WDT_POOL_START = 0;
  WDT_POOL_END = 0;
  WDT_POOL_COUNT = 0;
  cli();                         // Temporarily turn off interrupts, until WDT configured
  MCUSR = 0;                     // Use the MCU status register to reset flags for WDR, BOR, EXTR, and POWR
  WDTCSR |= _BV(WDCE) | _BV(WDE);// WDT control register, This sets the Watchdog Change Enable (WDCE) flag, which is  needed to set the 
  WDTCSR = _BV(WDIE);            // Watchdog system reset (WDE) enable and the Watchdog interrupt enable (WDIE)
  sei();                         // Turn interupts on
}

// This function returns a uniformly distributed random integer in the range
// of [0,0xFFFFFFFF] as long as some entropy exists in the pool and a 0
// otherwise.  To ensure a proper random return the available() function
// should be called first to ensure that entropy exists.
//
// The pool is implemented as an 8 value circular buffer
uint32_t EntropyClass::random(void)
{
  uint8_t waiting;
  while (WDT_POOL_COUNT < 1)
    waiting += 1; // Just a dummy until ISR creates entropy pool
  retVal = WDT_ENT_POOL[WDT_POOL_START];
  WDT_POOL_START = (WDT_POOL_START + 1) % WDT_POOL_SIZE;
  --WDT_POOL_COUNT;
  return(retVal);
}


// This function returns a unsigned char (8-bit) with the number of unsigned long values
// in the entropy pool
uint8_t EntropyClass::available(void)
{
  return(WDT_POOL_COUNT);
}

// This interrupt service routine is called every time the WDT interrupt is triggered.
// With the default configuration that is approximately once every 16ms, producing 
// approximately two 32-bit integer values every second. 
//
// The pool is implemented as an 8 value circular buffer
ISR(WDT_vect)
{
  WDT_BUFFER[WDT_BUFFER_POSITION] = TCNT1L; // Record the Timer 1 low byte (only one needed) 
  WDT_BUFFER_POSITION++;                    // every time the WDT interrupt is triggered
  if (WDT_BUFFER_POSITION >= WDT_BUFFER_SIZE)
  {
    WDT_POOL_END = (WDT_POOL_START + WDT_POOL_COUNT) % WDT_POOL_SIZE;
    // The following code is an implementation of Jenkin's one at a time hash
    // This hash function has had preliminary testing to verify that it
    // produces reasonably uniform random results when using WDT jitter
    // on a variety of Arduino platforms
    for(WDT_LOOP_COUNTER = 0; WDT_LOOP_COUNTER < WDT_BUFFER_SIZE; ++WDT_LOOP_COUNTER)
      {
	WDT_ENT_POOL[WDT_POOL_END] += WDT_BUFFER[WDT_LOOP_COUNTER];
	WDT_ENT_POOL[WDT_POOL_END] += (WDT_ENT_POOL[WDT_POOL_END] << 10);
	WDT_ENT_POOL[WDT_POOL_END] ^= (WDT_ENT_POOL[WDT_POOL_END] >> 6);
      }
    WDT_ENT_POOL[WDT_POOL_END] += (WDT_ENT_POOL[WDT_POOL_END] << 3);
    WDT_ENT_POOL[WDT_POOL_END] ^= (WDT_ENT_POOL[WDT_POOL_END] >> 11);
    WDT_ENT_POOL[WDT_POOL_END] += (WDT_ENT_POOL[WDT_POOL_END] << 15);
    WDT_ENT_POOL[WDT_POOL_END] = WDT_ENT_POOL[WDT_POOL_END];
    WDT_BUFFER_POSITION = 0; // Start collecting the next 32 bytes of Timer 1 counts
    if (WDT_POOL_COUNT == WDT_POOL_SIZE) // The entropy pool is full
      WDT_POOL_START = (WDT_POOL_START + 1) % WDT_POOL_SIZE;  
    else // Add another unsigned long (32 bits) to the entropy pool
      ++WDT_POOL_COUNT;
  }
}

EntropyClass Entropy;

Any global variable referenced within an ISR must be declared "volatile":

volatile uint8_t WDT_POOL_COUNT;

Otherwise the compiler's optimizer will break your code. It will end up as:

while(1)
    waiting += 1; // Just a dummy until ISR creates entropy pool

because it will think that WDT_POOL_COUNT never changes (it doesn't know what the ISR does or when it does it).

Specifying "volatile" tells the compiler not to optimize this variable, and to never trust what is cached in a register.

Yes, it makes it a bit slower, but it's necessary.

majenko: Any global variable referenced within an ISR must be declared "volatile":

...and access must be protected with a critical section.

Can you expound on what you mean with this? When I made the appropriate global variables volatile, my code seems to function as expected. Given the nature of the library (produces random numbers) it doesn’t
seem to matter if any of the global variables (or portions thereof) get altered while the library is accessing them.

wanderson: Can you expound on what you mean with this?

http://arduino.cc/forum/index.php?topic=45239.0 http://gammon.com.au/interrupts http://www.nongnu.org/avr-libc/user-manual/group__avr__interrupts.html https://www.google.com/search?q=avr+interrupts

When I made the appropriate global variables volatile, my code seems to function as expected.

Are you claiming critical sections are not necessary?

Given the nature of the library (produces random numbers) it doesn't seem to matter if any of the global variables (or portions thereof) get altered while the library is accessing them.

That kind of thinking will get you: corrupt memory, lock-ups, "random" behaviour, and a long list of other undesirable "features".

Thank you I will read these links ASAP.

No, I said exactly what I meant to say, which is that the code seems to be running correctly without “critical sections”.

Since I will need to run this code for continuously for several days, if I get any “random” behavior, I will add the “critical sections” you mention; however, what you didn’t answer is how
the lack of “critical sections” would affect THIS library, which has a ISR that isn’t really using pointers (mostly) and where it doesn’t matter if in reading a multi-byte value, one or more of the bytes
being read change between reads. The individual bytes are not pointers to anything, they are simply random… therefore they don’t matter if they change mid-read.

Further if a different ISR is called, it shouldn’t be accessing the same globals, since they are not declared outside of the c++ file of the library.

If I am wrong about these assumptions, please provide either an explanation or some direction as to what I can read to obtain that explanation (other than the links you already provided).

wanderson: however, what you didn't answer is how the lack of "critical sections" would affect THIS library

And I will not be. Failure analysis on interrupt service routines can be very time consuming. It is also unnecessary. The solution is trivial: add the critical sections.

If I am wrong about these assumptions...

At least one assumption you made is wrong.

please provide either an explanation...

No. You have the solution. It's trivial to add and the cost is negligible.

[quote author=Coding Badly link=topic=108108.msg812101#msg812101 date=1338495217]

wanderson: however, what you didn't answer is how the lack of "critical sections" would affect THIS library

And I will not be. Failure analysis on interrupt service routines can be very time consuming. It is also unnecessary. The solution is trivial: add the critical sections. [/quote]

Your proposed solution may well be trivial, but I was asking because I was curious about WHY it was necessary for this situation... If you don't wish to answer that, fine.

[quote author=Coding Badly link=topic=108108.msg812101#msg812101 date=1338495217]

If I am wrong about these assumptions...

At least one assumption you made is wrong. [/quote]

Not a particularly helpful answer. Like the suggestions that people asking questions provide complete questions, the same principal applies to answers. I fully understand the lack of interest in spending time answering what you perceive to be a stupid question; however, this particular answer would have been better left unmade (and that certainly would have been more polite)

[quote author=Coding Badly link=topic=108108.msg812101#msg812101 date=1338495217]

please provide either an explanation...

No. You have the solution. It's trivial to add and the cost is negligible. [/quote]

As I said I will read the links you provided; however, a quick perusal indicated that they were describing a general set of circumstances, which may (or may not) affect the specific conditions here. As I said, if this sketch runs for several days, without incidence, I am less than convinced that the cost you propose is needed... At the very least I am interested in learning what circumstances trigger the need for the solution you proposed.

If I am wrong about these assumptions... At least one assumption you made is wrong. ... Not a particularly helpful answer.

The answer was meant to force you to consider the assumptions you made. If you could not determine which one is wrong then you would be (theoretically) encouraged to include critical sections. Obviously, from what you later wrote, the attempt failed.

Like the suggestions that people asking questions provide complete questions, the same principal applies to answers.

I gave you the complete answer: add critical sections.

I fully understand the lack of interest in spending time answering what you perceive to be a stupid question;

It's not a stupid question. But it is a question that has been asked and answered hundreds of times before by people much smarter than you and I. Except in extremely rare cases the answer is always the same.

As I said I will read the links you provided; however, a quick perusal indicated that they were describing a general set of circumstances, which may (or may not) affect the specific conditions here. As I said, if this sketch runs for several days, without incidence, I am less than convinced that the cost you propose is needed...

After a "quick perusal", you are convinced that you are right and I am wrong. If that's the amount of effort you are going to exert before reaching a conclusion I give up. Do what you want @wanderson. This conversation is over for me.

No I am not convinced you are wrong and I am right. I just don’t believe that you are undoubtedly right, which you seem to believe. I have been involved in one other thread with you where you made a similar blanket claim, which further research revealed to be unfounded–or at least without the certainty which you claimed… Further, many of your answers are at best terse, and at worst, not particularly salient. I fully understand how irritating some of the questions asked on a board like this can be. Indeed, I should have been able to find the need for the volatile keyword, not sure why I didn’t find it when I searched–probably because I was looking for the wrong key phrases. But, terse answers, while justifiable, are no more helpful that short incomplete questions. For instance, instead of stating that one of my assumptions is wrong an equally terse "search for " would actually have been helpful.

I have now read all of the links you provided and am still of the opinion that your “solution” may not be necessary in this case. Specifically, all but one of the accesses are atomic (though I will need to look at the generated asm to confirm this), and the one that isn’t simply doesn’t matter if portions of the value change in the middle of access. Further, even if you are correct, my interest is not in simply producing a working library that uses an interrupt, but rather in learning the specifics of interrupt handling on AVR’s which is an entirely different kettle of fish. In other words the WHY something needs to be done, not simply the HOW.

Children, please, take this outside.

It's not a question of who is right and who is wrong.

It's a case of erring on the side of caution.

wanderson, it may well be the case that your library will not have any noticeable problems by omitting the safety measures. However, you cannot be 100% certain, even after it has been running for a few days. Have it run for a few decades and then you are almost certain. Yes, by the way you describe your library it is highly likely that there won't be any noticeable problems.

However, it is a good habit to get in to - be safe rather than be sorry. Your next ISR may not be as forgiving, and if you are already in the habit of disabling interrupts when accessing ISR modified variables outside the ISR, then you won't get bitten by it.

Coding Badly, I understand your frustration, but remember, a huge number of people in this community aren't native English speakers, so misunderstandings can occur.

Also, remember that no one solution is ever 100% correct. There are other ways of avoiding variable corruption with ISRs, such as mutex locks. So although yes, a lot of the time this is an issue, it isn't always an issue, and the disabling of interrupts isn't always the right way to go about it.

majenko: Children, please, take this outside.

It's not a question of who is right and who is wrong.

It's a case of erring on the side of caution.

wanderson, it may well be the case that your library will not have any noticeable problems by omitting the safety measures. However, you cannot be 100% certain, even after it has been running for a few days. Have it run for a few decades and then you are almost certain. Yes, by the way you describe your library it is highly likely that there won't be any noticeable problems.

However, it is a good habit to get in to - be safe rather than be sorry. Your next ISR may not be as forgiving, and if you are already in the habit of disabling interrupts when accessing ISR modified variables outside the ISR, then you won't get bitten by it.

Coding Badly, I understand your frustration, but remember, a huge number of people in this community aren't native English speakers, so misunderstandings can occur.

Also, remember that no one solution is ever 100% correct. There are other ways of avoiding variable corruption with ISRs, such as mutex locks. So although yes, a lot of the time this is an issue, it isn't always an issue, and the disabling of interrupts isn't always the right way to go about it.

majenko,

I understand that running a few days does not guarantee I will not have a problem. However, I am not interested in simply making a "bulletproof" library. As I said I am interested in the WHY such code is needed. Simply including it because it is safe, is not what I am looking for. I am interested in what conditions trigger the need for it--then I can decide if it is needed for this library. And barring such an understanding, leaving the code as is and running it, hoping to generate the errors that I can then diagnose will be the only way I can gain that understanding--at least it appears to be.

Oh, and as an aside I am a native English speaker (American English). I understood what CB said, at least where he made complete statements, my problem was that the partial statements weren't helpful. I can fully understand the frustration with the nature of questions asked on this board; however, I believe a better solution is not to answer rather than making terse, unspecific answers. In the case of topics that may have been covered in detail before, it is help to provide (an equally terse) "Search for keywords". Many times the OP is not lazy, in that they have searched, but rather they have not been using the correct keywords. That was certainly my situation since I was focusing on searching for why the while loop wasn't changing values and knew that it was related to optimization, but not that it was the global variable/interrupt that was causing the problem.

volatile uint8_t WDT_POOL_COUNT;
...
while (WDT_POOL_COUNT < 1)
    waiting += 1; // Just a dummy until ISR creates entropy pool

I don’t see why, in this particular case, a critical section is needed. The access to a byte variable will be atomic, and we aren’t trying to change it.

Turning off interrupts has its own cost (the next ISR might be delayed).

I agree with Coding Badly that you need to carefully consider access to variables which are updated by ISRs, and for multi-byte variables, or variables which you are planning to change, a critical section (ie. turning off interrupts) will almost certainly be necessary.

However not in this case.

wanderson, if you haven't read the links provided above, you really owe it to yourself to do that. I'll add one more: http://www.nongnu.org/avr-libc/user-manual/group__util__atomic.html

But here's the deal.

When sharing a variable between the main code and an ISR, if any operation being done on the variable is interrupted in the middle, the ISR could update the variable in the middle of that operation, and then the main would continue its update after the ISR completes. This will result in a partial update, and the variable's value will be corrupted.

Many operations take multiple machine instructions and many variables are multi-byte, so even a seemingly simple operation like an increment or assignment can require multiple instructions that, if not executed "atomically" (i.e. with interrupts inhibited), risk data corruption.

Even worse, it's timing related. So any given code XD could run for a long time and "seem" to be OK. Sooner or later though, as sure as the sun rises, Mr. Murphy will get in there and cause trouble. And when that happens, as CB said, it is exceedingly difficult to debug, and not just because it'll be very difficult to reproduce the symptom.

So this is quite a dogmatic thing. [u]Always[/u] declare variables shared with an ISR as volatile, and [u]always[/u] access them atomically. Period. No exceptions. It's The Right Thing To Do.

Yes, Jack, but reading one byte is an atomic operation. You don't need to put a critical section around that. If the interrupt occurs before the read you get the new value, if it happens after the read, well that would have happened anyway with a critical section. It can't happen during the read.

Just to be clear, this could be bad:

WDT_POOL_COUNT++;

That is because that both reads and writes.

And BTW @op - please don't name variables in all caps. Reserve that for constants.

Can’t disagree, but I might do it anyway just on the off chance that I’d come back some day and make it something other than a single byte data type.

As for the overhead involved in establishing an atomic block, I’d have to first be convinced that the overhead (which should be pretty small in most cases) was indeed likely to cause problems. And if the answer was yes, that’d indicate some pretty busy ISRs, and then I’d think the whole works would be on shaky ground as a result. But that’s just me. Normally I go for efficiency but this might be an exception, I’d have to have a reason not to.

[quote author=Jack Christensen link=topic=108108.msg812343#msg812343 date=1338510577] wanderson, if you haven't read the links provided above, you really owe it to yourself to do that. I'll add one more: http://www.nongnu.org/avr-libc/user-manual/group__util__atomic.html

But here's the deal.

When sharing a variable between the main code and an ISR, if any operation being done on the variable is interrupted in the middle, the ISR could update the variable in the middle of that operation, and then the main would continue its update after the ISR completes. This will result in a partial update, and the variable's value will be corrupted.

Many operations take multiple machine instructions and many variables are multi-byte, so even a seemingly simple operation like an increment or assignment can require multiple instructions that, if not executed "atomically" (i.e. with interrupts inhibited), risk data corruption.

Even worse, it's timing related. So any given code XD could run for a long time and "seem" to be OK. Sooner or later though, as sure as the sun rises, Mr. Murphy will get in there and cause trouble. And when that happens, as CB said, it is exceedingly difficult to debug, and not just because it'll be very difficult to reproduce the symptom.

So this is quite a dogmatic thing. [u]Always[/u] declare variables shared with an ISR as volatile, and [u]always[/u] access them atomically. Period. No exceptions. It's The Right Thing To Do. [/quote]

Thank you for the link, it was useful information.

I think I understand now WHY the data would normally need to be protected; however, I do not believe that this is a case where that is true--at least until I examine the assembly code produced by my code.

In my case there are four variables associated with a circular buffer that has data added to it by the ISR and then has data retrieved from it by an external class method. There are only two of the four variables that are accessed by both the class method and the ISR. The first of these is a counter that is implemented as a 8 bit unsigned integer, and hence inherently "atomic" if I understand the process correctly. Th other variable is a 32-bit unsigned integer. Now I understand that it impossible that an operation on a 32-bit integer can not be accomplished atomically on an 8-bit cpu; however, in my case this value is a true random sequence of four bytes. So if one or more of the bytes are replaced by a different set of four random bytes by the ISR in the middle of the retrieving the original value, the only effect will be that the class method will return a different random value than it would have originally (or if it had been called after the ISR updated the values).

Now the other two variables are also unsigned 8-bit integers, but that also doesn't matter in this case, since the class method only uses and changes the pointer to the start of the buffer, and the ISR only uses and changes the pointer to the end of the buffer. So as I was saying, with these specific conditions, I do not believe there is any possibility of any of the buggy behavior CB described. The only problem that I can see from the class perspective is that the mixing of the one 32-bit integer could possibly introduce a bias into the production of random numbers. And that is something that is best determined empirically. Now as I said, after examining the asm produced, I may see something that alters my current belief, or if I see some unexpected behavior I would then add the suggested protection routines and see if the unexpected behavior goes away.

Also if I were producing commercial code and the resources were not too tight, I would have taken CB's suggestion to protect the data just as a typicall "best practice"; however, for me this is part of learning exercise, and the real goal (since I really don't need this library) is to UNDERSTAND the process in detail.

BTW, This library is designed to provide a source of true hardware generated random numbers using the timer jitter associated with the Watch Dog Timer and Timer 1

[quote author=Nick Gammon link=topic=108108.msg812351#msg812351 date=1338511242] Just to be clear, this could be bad:

WDT_POOL_COUNT++;

That is because that both reads and writes.

And BTW @op - please don't name variables in all caps. Reserve that for constants. [/quote]

Thanks, but this variable is an uint8_t which I believe would be handled atomically, correct?

The caps is the style I always used when dealing with global variables/constants the last time I used C++ (since I used the const keyword on constants), what is the accepted practice for the arduino community to distinguish a global variable and a local variable?

wanderson: Thanks, but this variable is an uint8_t which I believe would be handled atomically, correct?

No.

Take this sketch:

volatile byte foo;

void setup ()
{
  foo++;
}
void loop () {}

Generated code for setup:

void setup ()
{
  foo++;
  a6:  80 91 00 01     lds r24, 0x0100
  aa:  8f 5f        subi    r24, 0xFF   ; 255
  ac:  80 93 00 01     sts 0x0100, r24
}
  b0:  08 95        ret

If, after the lds, but before the sts, the ISR changed foo, then the wrong value is incremented, and the change made by the ISR is discarded.

what is the accepted practice for the arduino community to distinguish a global variable and a local variable?

I don't know of one for the Arduino, but very generally all caps refers to constants, so it's confusing to use that.

If you like, put a "g" in front, eg.

volatile byte gWDTpoolCount;