Power failures require reboot to access files again

I am tracking data with a Yun in a file on an SD card. It seems that after a power outage, or if I just unplug the Yun, the next time I power up it can't write to the files anymore. This doesn't happen every time.

If I try to delete the files and start over, I can't delete them because I don't have the correct privileges, even though I'm logged in as admin under root.

If I reboot, then everything is OK. This is frustrating because I really want the system to start back up again after a power failure without any intervention on my part.

Any solutions or suggestions?

I'm actually using the file print command from the Arduino sketch to write to the file. Maybe the script was killed by the power outage before it had a chance to close the file? Is there any way to close the files automatically on startup without doing a reboot? This is just a theory on what may be happening.

@ottsm
This depends on what you mean by "power failures". Yun OS (Linux) is not a fault-tolerant system.
Are you waiting for the full and complete reboot before you restart your system?

Jesse

The only issue I have is that when the power goes out (or I unplug the Yun power supply) and then power is restored, the files that were being written to can no longer be written to.

I'm writing to a CSV file from the Arduino sketch. If I look at the file date, it never changes. The file cannot be deleted manually either; I get privilege errors.

But if after all this I send a reboot command, everything is OK after the reboot and the files can be written to again. If I want to delete the file and start the data over, I can.

I'm not looking for something that is "fault tolerant", just something that can recover automatically after a power fault.

The only issue I have is that when the power goes out (or I unplug the Yun power supply) and then power is restored, the files that were being written to can no longer be written to.

(...)

Does this mean you are pulling the plug while it is running?
If yes, don't do that.

I'm not looking for something that is "fault tolerant", just something that can recover automatically after a power fault.

You have in essence created two (2) sentences that contradict each other.

A "power fault" is a fault condition of a "fault tolerant" system.

Now, I'm not trying to pick on you - I'm trying to get you to be specific about your needs.

NOTE: Many new BSD/Linux/Unix systems use a journaling file system to reduce the harm from fault conditions. There are other procedures, but this system does not have spinning platters - so you have to be proactive to get really close to "fault tolerant" without incurring excessive financial costs.

So please, be specific about your needs.

Jesse

I'm keeping track of the temperature of over 30 beehives. The Yun is mounted out in a field and the power is sometimes unreliable. It writes the temperature data to an RRD file and a CSV file. If the power comes and goes it will basically need to be checked from time to time just to validate that the system recovered after power was restored.

The need is that when the power comes back on, the program is able to restart. Currently it does not reliably run because the files that the program is trying to access have been locked out by the operating system. Even logging in as admin and trying to manually delete the files won't work. The only thing that I found that will work is to send a reboot command. The sketch running on the Arduino needs to have access again to the files that it was writing to before the power went off.

I know that data during the power outage is lost forever. But if I don't keep checking the system from time to time, I could lose data for a week even though the power may have been off for only a minute.

I suppose one way to solve the problem is to install a battery backup for the system. But I would have thought that on power-up everything would recover without sending another reboot to restart the system.

@ottsm,
you are trying to build a fault tolerant system. Your previous description and this one make me believe that you think that "fault proof" and "fault tolerant" are one and the same.

"fault proof" is not possible - as you already know.

"fault tolerant" has levels of tolerance.

Wikipedia

Fault-tolerant computer systems are systems designed around the concepts of fault tolerance. In essence, they must be able to continue working to a level of satisfaction in the presence of faults.

As for the file system, your best option is solar panels with a capacitive power source. You will need about 6-8 seconds to shut down the system. You will also need a watchdog to have the system shut down. And for good measure you will need a journaling file system.

The locked file you are encountering is from the "dirty" shutdown. The persistent media (SD) holds the file handle state. When you boot after a "dirty" shutdown, the system then reads the "stale" state. Then when you do a "clean" shutdown, the OS closes the file - even if you have left it open.

The journaling file system will help with this, but unless you reset the parameters of the file system, you could lose the entire dataset - and likely several days' work. There are many software workarounds - such as closing the file (or creating separate files) on a periodic basis (hourly, daily, weekly, etc.).

Okay. Beyond the loss of data, when the power is off, what other situations are tolerable?

Jesse

Thanks, now I see what you are getting at. I could see an option where the program could try and access the file and read any fault codes back and then send a reboot command if the file cannot be accessed. Something tells me this is probably a bad idea. Probably get caught in some endless loop.

Of course the entire file could end up corrupt and all the data would be lost without doing what you said and creating daily or hourly files (whatever time span a person could live with).

Another option is to improve the wireless link, make it more reliable, and just have the Yun push the data up to another server that's at the house.

ottsm:
(....)

Other option is to improve the wireless link and make it more reliable and just have the Yun push the data up to another server that's at the house.

@ottsm,
In which case, you're just shifting the issue to another system.

Are you familiar with a watchdog system?

Jesse

I am familiar with a watchdog timer in PLC programming, however this is done in the background. My Linux skills are somewhat limited but I understand the concept.

Is this what you are getting at?

#include <avr/wdt.h>

.
.
.

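void reboot() {
  wdt_disable();          // clear any earlier watchdog configuration
  wdt_enable(WDTO_15MS);  // arm the watchdog with a 15 ms timeout
  while (1) {}            // never reset the watchdog, so it expires and resets the AVR
}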

I ended up just detecting if the file was in error and sent a run shell command to reboot. Seems to work for now.
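One way to keep a detect-and-reboot approach from turning into the endless loop worried about earlier is to require several consecutive failures before rebooting and to cap the number of reboots. The following is a minimal sketch in plain C++ under stated assumptions: `RebootGuard` and its members are hypothetical names, and for the cap to mean anything the counters would have to be stored somewhere that survives the reboot, which this sketch does not do.

```cpp
#include <cassert>

// Hypothetical helper: decides when repeated file-write failures
// should trigger a reboot, with a cap on how many reboots to request.
struct RebootGuard {
    int failuresBeforeReboot;     // consecutive failures needed to request a reboot
    int maxReboots;               // stop requesting reboots after this many
    int consecutiveFailures = 0;
    int rebootsRequested = 0;

    RebootGuard(int failures, int reboots)
        : failuresBeforeReboot(failures), maxReboots(reboots) {
        assert(failures > 0 && reboots >= 0);
    }

    // Report the outcome of one write attempt.
    // Returns true if the caller should issue a reboot now.
    bool recordWrite(bool ok) {
        if (ok) {
            // A successful write clears both counters.
            consecutiveFailures = 0;
            rebootsRequested = 0;
            return false;
        }
        if (++consecutiveFailures < failuresBeforeReboot) {
            return false;  // not enough failures in a row yet
        }
        consecutiveFailures = 0;
        if (rebootsRequested >= maxReboots) {
            return false;  // cap reached: stop rebooting and wait for a person
        }
        ++rebootsRequested;
        return true;
    }
};
```

The caller would pass in the result of each file write and run its reboot command only when `recordWrite` returns true.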

ottsm:
I ended up just detecting if the file was in error and sent a run shell command to reboot. Seems to work for now.

@ottsm,
Sorry for the late reply. My laptop died over a week ago and the repair time has been extended.

On the watchdog system, it is already built into OpenWrt and, by extension, Yun OS.
However, it is NOT turned on.

Here are my notes on the system.

NOTES for Watchdog

Let me know if this helps.
Jesse