Array data corruption after adding Timer Interrupt

Hi,

I have a program which runs reliably long term but has occasional lockups (possibly due to our sightly “iffy” mains supply. I have added a timer interrupt routine using TimerThree library and the program produces corruption of integer array data within minutes of running. I have looked for several days and introduced a debug routine after each function call to try and find the point at which the corruption occurs. This hasn’t helped me as the data at the end of the main loop() is ok and at the start of the next pass through loop() it is corrupted even though there aren’t calls to other functions.

I suspect my coding but the fact that the corruption appears to happen during the return to loop() start and the simple act of commenting the interrupt calls return the program to it’s normal state.

Attached is the top level code only and attached is some of the serial output I’ve been able to copy and paste from the output window (there must be a correct way!).

The code for the functions amounts to some 40-50kB and I thought it might be irrelevant at this stage. (Could save me some embarrassment :blush:

Thanks

SerialOutPut.txt (1022 Bytes)

SerialOutPut.txt (1022 Bytes)

MainControl-7.0.5-editing.ino (22.8 KB)

I thought it might be irrelevant at this stage.

But it is one of those functions that is corrupting some unnamed array...

Once more PaulS rides to the rescue :slight_smile:

Hi,

I didn't post because the forum posting system told me it was too big and In my ignorance I assumed that the functions couldn't be causing it because it happens only during the loop() flyback (always).

However, further ignorance stops me from effectively collecting the Serial.print output for displaying the debugging as I'm trying to use copy and paste.

Do I need to try and recode to a minimalist sketch?

Suggestions and constructive criticism welcomed.

Which board / processor? How much memory (SRAM) available for dynamic data after compile? I don't have all those libraries so can't compile to check. I suspect that you're either using too much memory or you have a buffer overflow.

You state that the problems started after adding the timer3 interrupt but don't post that routine? Please post all your code (zip it and attach as it seems quite big); the problem might very well be caused in a function that you did not post.

mega2560

Compiler says

Multiple libraries were found for “Ethernet.h”
Used: /home/david/Arduino/libraries/Ethernet
Not used: /opt/arduino-1.6.5/libraries/Ethernet
Not used: /home/david/Arduino/libraries/arduino_372926
Multiple libraries were found for “TimerThree.h”
Used: /home/david/Arduino/libraries/TimerThree
Not used: /home/david/Arduino/libraries/Timer3

Sketch uses 57,796 bytes (22%) of program storage space. Maximum is 253,952 bytes.
Global variables use 4,826 bytes (58%) of dynamic memory, leaving 3,366 bytes for local variables. Maximum is 8,192 bytes.

Watchdog ISR

void WatchdogReset()
{
  // Sink current to drain charge from C2
  pinMode(PULSE_PIN, OUTPUT);
  digitalWrite(PULSE_PIN, LOW);
  // Give enough time for C2 to discharge (should discharge in 50 ms)
  // delay(50);
  // Return to high impedance
  pinMode(PULSE_PIN, INPUT);
}

I will zip the code and try to attach later when I have checked any obvious stuff which can be removed.

Can someone explain why the problem shows up between end of loop and restart of loop? I’d love to know what happens as it goes around!

No zipfile :wink:

Any second now !!!

MainControl-7.0.5.zip (32 KB)

@Sterretje

I did attach the Zipped code but maybe I screwed up again so I’ve re-attached it.

MainControl-7.0.5.zip (32 KB)

It's a lot of files to go through :wink: Just started, but found one mistake in File2Status that can cause undefined behaviour.

If the opening of the file fails, you do not read the file (good) but you still close the file (bad). Move the close() to with the if block.

Digging a little more, Status2File does not even check if the file was opened successfully or not, another possible cause of undefined behaviour.

Suggestion: you might have been a little bit extreme in splitting the functionality over multiple files. Those above 2 functions actually belong together in a file. Call it e.g StatusFile and place both functions in there.

I might dig a bit more in the coming days and see if I can find obvious flaws.

If you decide to rewrite your code, try to start using cpp and h file instead of ino files. All your function prototypes and defines can go e.g. in a file called MainControl.h (you can actually now already do that).

void WatchdogReset()
{
  //unsigned long LastHeartbeat;
  //static long PreviousHeartbeat;

  // Sink current to drain charge from C2
  pinMode(PULSE_PIN, OUTPUT);
  digitalWrite(PULSE_PIN, LOW);
  // Give enough time for C2 to discharge (should discharge in 50 ms)
  delay(50);
  // Return to high impedance
  pinMode(PULSE_PIN, INPUT);
  /* 
  Serial.print("Resetting Watchdog ");
  Serial.print(LastHeartbeat = millis());
  Serial.print(" millis.  Interval = ");
  Serial.println(LastHeartbeat - PreviousHeartbeat);
  PreviousHeartbeat = LastHeartbeat;
  */
}

Don't do delays in an ISR.

http://www.gammon.com.au/interrupts

@NickGammon @sterettje Thank you both for constructive points. There is a lot to go through and I do appreciate your help

Unfortunately there would be no point in the routine because the delay is to discharge the capacitor and reset a hardware (555 timer) watchdog, because I have failed totally to get the system watchdog to work. If there is a way to get the inbuilt watchdog running, a reference to it would be nice because I am led to understand that a bug in the boot loader is the problem.

@sterettje

I will certainly make the changes you recommend and removing the logging would reduce the code significantly, especially as it was to give me something to check on whether the code was working. As regards cpp coding, I'm having a hard enough time in the IDE, can I code with the most basic ANSI C as I do now?

DaveJJ:
If there is a way to get the inbuilt watchdog running, a reference to it would be nice because I am led to understand that a bug in the boot loader is the problem.

The recent bootloaders should be alright. Have you proved you have a watchdog problem with the bootloader?

As regards cpp coding, I'm having a hard enough time in the IDE, can I code with the most basic ANSI C as I do now?

What do you mean? The .ino file is (effectively) C++. The other files in your project can be .c or .cpp (and .h of course).

I have failed totally to get the system watchdog to work.

Please post a simple example that demonstrates this, and describe what goes wrong.

I'll give it a try Nick, but basically when the interrupt fires I end up with a vicious circle of reboots and I can't upload the sketch again to recover the board. This is despite have a reset and/or a disable as the first statement in the setup(). I end up having to upload a blank sketch (setup() and loop() with just a one line Serial.print so I check it is running) and upload whilst holding reset (requires quick reflexes to release reset).

I'll try and concoct a sketch to demonstrate.

Thanks

 //WRITE *******************************
  Wire.beginTransmission(EEPROM_ADDRESS);

  Wire.write(0x00);      //First Word Address
  Wire.write(0x00);      //Second Word Address

  // Write Light Event times (hour, minute, hour, minute, etc)
  for (byte index = 0; index < 20; index++) {
    Wire.write(TimeData[index]);
  }

  delay(10);
  Wire.endTransmission();

There’s no point in the delay there. Nothing happens until the Wire.endTransmission(); - you aren’t delaying anything except putting something into an internal buffer.


 Wire.requestFrom(EEPROM_ADDRESS, 2);
  delay(5);

No point there either. Wire.requestFrom doesn’t return until all the data that is going to be returned is ready.

http://www.gammon.com.au/i2c

You are using a Mega, right? I think my bootloader on this page does not have a problem with the watchdog timer.

(It's not really mine, it's just a recent version).

You could try the following:

#include <stdint.h>
    #include <avr/wdt.h>

uint8_t mcusr_mirror __attribute__ ((section (".noinit")));

void get_mcusr(void) \
  __attribute__((naked)) \
  __attribute__((section(".init3")));
void get_mcusr(void)
{
  mcusr_mirror = MCUSR;
  MCUSR = 0;
  wdt_disable();
}

http://www.atmel.com/webdoc/AVRLibcReferenceManual/group__avr__watchdog.html

Whandall:
You could try the following:

#include <stdint.h>

#include <avr/wdt.h>

uint8_t mcusr_mirror attribute ((section (".noinit")));

void get_mcusr(void)
  attribute((naked))
  attribute((section(".init3")));
void get_mcusr(void)
{
  mcusr_mirror = MCUSR;
  MCUSR = 0;
  wdt_disable();
}




http://www.atmel.com/webdoc/AVRLibcReferenceManual/group__avr__watchdog.html

I don’t know what it does but it has to be worth a try. Perhaps all will be clear when I read your reference

Thanks

You’re right, I have a Mega2560 albeit Chinese with the CH340G chip.

Now all I have to do is learn how to use another Arduino board (I have another 2560) to burn the bootloader. So much to do and so little time!!

BTW How can I know which bootloader is actually loaded?

DaveJJ:
I don't know what it does but it has to be worth a try.

It disables the watchdog at a very early stage of the boot.