They run a relatively simple program: no self-modification, no flash writing.
When the devices power up, they flash rapidly ~6 times (commanded by the bootloader?),
then 12 seconds later my code starts by flashing twice to show it's ready.
On two occasions, seemingly at random, one device failed to work (no flashes after 12 seconds). The Arduino bootloader was fine: all I needed to do was reflash the code, and everything worked.
Does your code write to the EEPROM? Could it be that some bad stored data kills the sketch, and the chip erase when programming it fixes it by erasing the EEPROM?
You can read out the flash with avrdude and see whether the flash is being modified (I predict it's unchanged).
Hi, as I wrote, there is no writing in the program. Nothing fancy.
Yes, I would like to read it out. As there is no ISP connector, I hoped to read it out using the FTDI cable and this command:
avrdude -C /home/andre/Arduino/arduino-1.6.4/hardware/tools/avr/etc/avrdude.conf -v -v -v -v -p atmega8 -c stk500 -U flash:r:"./read8.bin":r -P /dev/ttyUSB1 -b57600
avrdude: Version 6.1, compiled on Oct 24 2014 at 10:33:03
Copyright (c) 2000-2005 Brian Dean, http://www.bdmicro.com/
Copyright (c) 2007-2014 Joerg Wunsch
System wide configuration file is "/home/andre/Arduino/arduino-1.6.4/hardware/tools/avr/etc/avrdude.conf"
User configuration file is "/home/andre/.avrduderc"
User configuration file does not exist or is not a regular file, skipping
Using Port : /dev/ttyUSB1
Using Programmer : stk500
Overriding Baud Rate : 57600
avrdude: Send: . [1b] . [01] . [00] . [01] . [0e] . [01] . [14]
avrdude: ser_recv(): programmer is not responding
avrdude: stk500v2_ReceiveMessage(): timeout
It should be possible to use the same commands that Arduino uses for verify to read it out and compare it with a good one.
AndreK:
This is interesting. If you guess the flash is unchanged, why would the application stop working, and then start again after a reflash?
Why don't you get the two flash dumps, from working and non-working board that formerly had identical code on them, and then we'll SEE if the flash is the same or not.
If it's not, dump EEPROM and fuses (over ISP - or do it all over ISP), compare those.
Computers are deterministic, so find out what's different between the flash/eeprom on the working and non-working boards.
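A minimal sketch of that comparison, assuming both boards have been dumped to raw binary files (for example with avrdude's `flash:r:<file>:r`). The file names `good.bin` and `bad.bin` and the helper `diff_ranges` are made up for illustration; the script only reports which address ranges differ.

```python
import sys


def diff_ranges(a: bytes, b: bytes):
    """Return a list of (start, end_exclusive) address ranges where a and b differ."""
    n = min(len(a), len(b))
    ranges = []
    start = None
    for i in range(n):
        if a[i] != b[i]:
            if start is None:
                start = i          # a new differing run begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, n))
    if len(a) != len(b):
        # One dump is longer: report the tail as a difference too.
        ranges.append((n, max(len(a), len(b))))
    return ranges


# Usage (hypothetical file names): python compare_dumps.py good.bin bad.bin
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], "rb") as f:
        good = f.read()
    with open(sys.argv[2], "rb") as f:
        bad = f.read()
    for start, end in diff_ranges(good, bad):
        print(f"0x{start:04X}-0x{end - 1:04X} differs ({end - start} bytes)")
```

This is only meaningful if both boards were programmed from the same build; dumps from different compiler versions will differ almost everywhere.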
I will.
I just downloaded a freshly programmed one for testing; I need to remove the defective one from the plane tomorrow or Monday, then compare. I must also check Git to verify that what I compiled now is the same as what is used in the plane, or regenerate the same revision first.
I hope the compiled code did not change with the Arduino version, but the old ones were compiled with Arduino 1.5-something.
AndreK:
I hope the compiled code did not change with the Arduino version, but the old ones were compiled with Arduino 1.5-something.
Will post results.
I think the compiler changed between the latest version and 1.5.x...
Did you say there were multiple boards in use that were running the same code? If you have a working and non-working one you programmed at the same time....
Yes, the freshly compiled firmware is not very similar to the readout of the bad device (compiler/version difference).
On Monday, I'll read out from another plane that uses the same version.
(When I wrote that I had 6 of those in use, that's 6 Cheapduinos, but with two slightly different firmwares depending on the payload they interact with. I'm not sure exactly which kind of device failed last.)
Anyway, what I can clearly see (readout via the FTDI cable):
from address 0x1C00 until the end, both have the same code (bootloader), but
the "defective" device contains 0xFF in all bytes from address 0x00 to 0x13F (320 bytes).
This is exactly 5 x 64 bytes; 64 bytes is the chunk size the Arduino bootloader reads/writes in.
What could have caused this? Even if I guess at a droplet of water in the FPC connector (the rest of the device is covered), a random write command seems very unlikely.
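The page arithmetic can be checked mechanically. Here is a rough sketch, assuming a raw binary dump and a 64-byte page size as described above; the helper name `erased_pages` and the synthetic test image are made up for illustration and are not part of any tool.

```python
PAGE_SIZE = 64  # assumed chunk/page size, per the observation above


def erased_pages(image: bytes, page_size: int = PAGE_SIZE):
    """Return the indices of pages that consist entirely of 0xFF bytes."""
    pages = []
    for offset in range(0, len(image), page_size):
        page = image[offset:offset + page_size]
        if page and all(b == 0xFF for b in page):
            pages.append(offset // page_size)
    return pages


# Synthetic example: 320 erased bytes, one page of "code", one erased page.
image = b"\xff" * 320 + bytes([0x12]) * 64 + b"\xff" * 64
for index in erased_pages(image):
    start = index * PAGE_SIZE
    print(f"page {index}: 0x{start:04X}-0x{start + PAGE_SIZE - 1:04X} is all 0xFF")
```

Erased pages between the end of the sketch and the bootloader are normal, since unused flash reads as 0xFF. Erased pages at address 0x0000, where the vector table and the start of the sketch live, are the suspicious ones.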
Noise on the reset-pin can sometimes do this. Solder a jumper from Reset to VCC and it can never happen again. Just remember to remove the jumper when uploading new firmware.
Considering the placement of the device (outer payload assembly) and the weather on the day it happened (heavy rain), it's possible that a small droplet got blown into the device's FPC connector, the only part that is not covered in conformal coating and rubber.
So I think we've found the reason.
Thank you all.