First of all, thank you all so much for such helpful and informative replies. I'm just an undergraduate student in ECE, and my knowledge and experience are very limited compared to any of yours, but with your help there is nothing that can't be solved!
The redundancy system itself is specified by the scope of this project. I'm already aware that not many people have integrated a redundancy system into their successful near-space balloons, which may mean that redundancy is not really necessary in an amateur balloon system (people have even launched and recovered iPhones and Android phones successfully lol). However, since redundancy is part of our project scope, I just treat it as one of the challenges we need to solve.
So far, I have two basic design concepts:
- Partially mimic the redundancy system used in aircraft autopilots, which runs 3 controllers simultaneously. The outputs from the 3 controllers are compared and the most different output is ignored. For example, when the ambient temperature outside the balloon is -60 °C and Arduino A reads -60, B reads -58, and C reads -10, the result from C is ignored. But there are two problems. First, I'm not sure whether the comparison can be done without a microcontroller (it might be at least as complicated as the Arduino itself). Second, how can I make sure that the program flow on the 3 Arduinos stays the same? For example, what if Arduino A wants to save temperature data to the SD card while B wants to save GPS coordinates?
- All three Arduinos are connected to the avionics identically, but we run one Arduino at a time, and the running Arduino is monitored externally by a dedicated watchdog circuit. When Arduino A can't respond to the WDT fast enough, the WDT resets it. If the reset still doesn't make it respond in time, this Arduino is turned off by switching off the MOSFET connected to its Vin pin, and then Arduino B is turned on, and so on. There are two problems in this scenario. First, even if an Arduino responds to the WDT fast enough, does that necessarily mean the Arduino is all good? Second, how do I do the "turn off & turn on" switching using the WDT circuit?
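To make the comparison in the first concept more concrete, here is a rough sketch of the "ignore the most different output" step. It is written in Python only to check the logic on a PC; it is not Arduino code. The function name `vote` is made up, and I am assuming that "most different" means the reading with the largest summed distance to the other two, and that the two remaining readings are simply averaged.

```python
def vote(readings):
    """Drop the reading that disagrees most with the others, average the rest.

    Hypothetical helper: "most different" is assumed to mean the largest
    summed absolute distance to the other two readings.
    Returns (averaged value, index of the ignored reading).
    """
    if len(readings) != 3:
        raise ValueError("expected exactly three readings")
    # Score each reading by how far it sits from the other two
    # (the distance of a reading to itself is 0, so it adds nothing).
    scores = [sum(abs(r - other) for other in readings) for r in readings]
    outlier = scores.index(max(scores))  # on a tie, the first one is dropped
    kept = [r for i, r in enumerate(readings) if i != outlier]
    return sum(kept) / len(kept), outlier


# The example from the post: A reads -60, B reads -58, C reads -10.
value, ignored = vote([-60, -58, -10])
print(value, ignored)
```

With the readings from the example, index 2 (Arduino C) is the one dropped. This only covers the comparison itself; it does nothing about keeping the three programs in step, so each controller would still have to report the same quantity at the same time for the comparison to mean anything.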
The following are some questions and ideas from reading the posts. My knowledge is merely at undergraduate level and I'm learning from you guys every day, so my questions may be stupid. Please bear with me.
To retrolefty:
The AVR chips do have such an internal WDT and may or may not be useful for what you are looking for. I kind of like external hardware WDT, where all the processors have to continously send a sanity pulse out and if external logic circuitry detects a processor not alive, just switches to the redundant processor.
I fully agree with you. Using the internal WDT is pointless if the controller itself is not responding. But how can I "switch" once a controller is determined to be dead?
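Just to pin down what I mean by "switch", here is the decision logic I have in mind for the external supervisor, again as a Python sketch to test the idea on a PC and not as real hardware or Arduino code. Everything in it is an assumption for illustration: the class name `Supervisor`, the timeout in seconds, the limit of one reset attempt, and the returned action strings. In the real circuit, "reset" would pulse the reset pin and "switch" would turn off one MOSFET and turn on the next.

```python
class Supervisor:
    """Hypothetical model of an external watchdog that supervises one active
    controller and fails over to the next one when it stops responding."""

    def __init__(self, n_controllers=3, timeout=1.0, max_resets=1):
        self.n = n_controllers        # how many controllers are available
        self.timeout = timeout        # max seconds allowed between pulses (assumed)
        self.max_resets = max_resets  # reset attempts before giving up (assumed)
        self.active = 0               # index of the controller currently powered
        self.resets = 0               # resets already tried on the active one
        self.last_beat = 0.0          # time of the last sanity pulse

    def heartbeat(self, now):
        """Called whenever the active controller sends its sanity pulse."""
        self.last_beat = now

    def tick(self, now):
        """Called periodically; returns the action the hardware should take."""
        if now - self.last_beat <= self.timeout:
            return "ok"
        # Pulse is late: restart the timing window, then reset or fail over.
        self.last_beat = now
        if self.resets < self.max_resets:
            self.resets += 1
            return "reset"        # pulse the reset pin of the active controller
        if self.active + 1 < self.n:
            self.active += 1
            self.resets = 0
            return "switch"       # power down the old controller, power up the next
        return "none_left"        # every controller has been tried


sup = Supervisor()
sup.heartbeat(0.5)
print(sup.tick(1.0), sup.tick(2.0), sup.tick(3.5), sup.active)
```

In this run the controller pulses once and then goes silent, so the supervisor first resets it and then moves on to the second controller. Like any watchdog, this only tells me the controller is still pulsing, not that its outputs are right.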
To mrmeval:
Standard microcontroller watchdog integrated circuits will read a pulse from the microcontroller and this resets the watchdog counter to zero. You can do that with a dedicated circuit or if you need some repeatability and precision I'd do it with a microcontroller
The point of having redundancy is that, in theory, more complicated microcontrollers have a higher failure rate than simple components such as logic gates (I think?). The bottom line is that the redundancy system itself cannot fail; otherwise the entire system will be messed up and the equipment will be gone forever.
You'd want an external watchdog on it to reboot it if needed. I'd be tempted to use an ATTINY as a watchdog rather than the available dedicated watchdog integrated circuits.
Isn't ATTINY for onboard debugging?
I've worked with the Dallas external watchdog integrated circuit. DS1232. I can't spit on it hard enough to kill the engineer that built this piece of *t. It has no consistency across temperature, it has no consistency between parts and if you get one batch to work the next batch you buy might not.
OMG man, you just saved my butt. I was looking right at this chip! Its description is so attractive. But based on what you've said, I'm not going with it. Now I'm thinking about another chip from Maxim (Mixed-signal and digital signal processing ICs | Analog Devices); what do you think?
To Richard Crowley:
My gut feel is that attempting to deploy redundant systems, plus implementing some kind of cross-check/voting mechanism will quite possibly make your system LESS reliable unless you are doing graduate-level research on those topics. In which case coming here for advice seems odd.
That's why the redundancy system needs to be as simple as possible. It is required by the scope of the project, which I can't change, although I personally agree that it does not seem necessary at all. I'm still an undergraduate student with the knowledge of a monkey when it comes to circuit design and interfacing, but with your help I'm growing fast.
To Coding Badly:
My suspicion is the battery is much more vulnerable than the AVR processor.
That's right. To counter this, I have put thick layers of insulation around the batteries. The batteries we chose are Energizer L91; these lithium primary cells are supposed to keep an acceptable state of charge at low temperature. Phase change materials may also be used to hold the temperature longer.
To Groove:
Why three identically-programmed devices?
The reason for using three identically programmed controllers is that when we switch from one controller to another (as in scenario 2), the system behavior stays the same.
To tkbyd:
After years of seeing discussions about fighting heat buildup in electronics, I was amused to read of people who are fighting a situation where there isn't enough heat! (I presume they are further frustrated by having heat buildup problems while the balloon is still near the ground?)
Hmm, that's a great point! We are in Winnipeg, Canada, a city that can get down to -40 C in the winter, and we will most likely launch our balloon in February, the coldest month of the year. Is overheating still an issue then? I'm trying my best to wrap the avionics so that they lose heat more slowly in near space. But according to what you said, do I need to "vent" the heat?
To AltairLabs:
Your reply is extremely informative. It is an honor to hear from somebody who has actually launched a balloon before! By "take control", I meant turning the malfunctioning controller off and the backup controller on. Because the code in each controller is the same and purely conditional, it shouldn't matter if we switch from controller A to B; the system behavior should remain the same.
I think I need to read more to absorb your words better. I will reply to you more thoroughly once I have done more research on your points.
Once again, thank you to everyone who shared ideas with me. You guys are truly amazing!