One of those days...

Today was supposed to be a good day - a day for the simple fix I had been meaning to apply, with time to relax afterward. Today, though, has turned into the “day of hell” for me.

Many of you here know that I often expound on the need for backing up a system. Many of you know that I employ an automated backup solution for the workstations in my home, all of which back up to an on-site fileserver.

I am about to tell you a tale of woe and bad luck.

You see, not too long ago I spent about two weeks going through my “archive o’ crap” (which is what I call my large collection of CDs and DVDs of data), copying it all to my fileserver. In the process, I began to think, “you know, if the drive this data is on takes a dump, I am going to be up the proverbial creek”. In addition to the data I loaded, the server also held a ton of other data, mainly MP3s and some videos.

I thought “no problem” - I’ll buy an external USB drive to act as a backup for the fileserver, set it up, automate the backup, and all will be golden for the day it’s needed.

My fileserver is a custom FreeNAS install; it had been a while since it was last updated, but overall it worked fine. I served everything over Samba (SMB), but was thinking about switching to NFS at some point, since I didn’t have any Windows boxes on the network that needed Samba. After reviewing some online documents, I figured that once I had my drive plugged in, it would be a simple matter of formatting it, mounting it, then setting up a “local rsync” to that drive, running once or twice a day.

Last night I finally got around to putting that drive together; I hooked it up to my Ubuntu workstation, and it seemed like it was recognized alright, so I went to bed anticipating what today would bring.

Today brought hell.

I plugged in the drive, and it was recognized by FreeNAS, but almost immediately I began to notice strange things. I won’t go into much detail, as this post is already way too long, so let’s just say this:

Hard drives suck, and they’ll take a dump on you -exactly- when you don’t want them to.

Long story short, I ended up losing about 250 gig of data on my “media” drive; it uses a UFS filesystem, and trying to mount it does something weird: rather than a mount point in /mnt (which looks and acts like a directory), I get a 210-byte file with something strange in it - and I don’t know why. Nothing I have done has let me see the data on that drive since.

The other drive, which held my backups, was throwing SMART errors, so it looked like it was on its way out. After many hours of trying this and that and coming up empty, I ended up buying a new drive (and a PCI SATA adaptor, because this machine is a wee bit old). Still, I was able to mount the old drive, and I am copying the data off of it onto the new drive right now, hoping to beat the reaper.

Here’s the thing - the crazy conundrum I am in, especially after preaching the gospel for so long: I needed a backup for my backup solution. I thought the USB drive would work great (I think it will, actually - maybe?) - but my system died just as I was implementing it. So, should I have implemented it when I built the system? And do I now need a backup drive for that backup drive? Ad infinitum? Turtles all the way down?

I can’t afford a tape backup solution for 1TB - who can? Why is it that the backup systems for the size of drives we have in our machines cost far more than what the machine and drives cost? What is a real solution?

Also - what do ordinary people do? Do they just lose all of their memories and data and such and “oh well”?

This is so frustrating; I know I am venting and ranting, and you shoulda seen me earlier (it wasn’t pretty!) - I just want to know what a real solution is. Who’s to say that had I actually implemented that backup for my backup earlier, it wouldn’t have died as well? I’d go with a RAIDed system, but I honestly don’t have the money to put into something like that, and it still wouldn’t fix the ultimate problem.

Something else I wonder about - are these drives supposed to be running as hot as they are? I don’t know the actual temperatures off-hand, but the old drives that were in my box were running fairly hot, and the USB drive (a 1TB 3.5-inch SATA drive in an enclosure) would also get quite warm. The new drive I just installed - let’s see - well, it’s not as hot. I am just wondering if manufacturers -expect- you to put fans on hard drives nowadays (and if so, why aren’t they included, damnit?).

So frustrated, so angry, so upset, so much time lost that I could use doing other things…

:’( :stuck_out_tongue:

I am about to tell you a tale of woe and bad luck.

oh boy :-?

Today brought hell.
Long story short, I ended up losing about 250 gig of data on my “media” drive

! that blows

I can’t afford a tape backup solution for 1TB - who can? Why is it that the backup systems for the size of drives we have in our machines cost far more than what the machine and drives cost? What is a real solution?

Blu-ray offers a solution, and it’s coming down in price - here is an LG BD burner for less than $150; this Christmas I bet you they will be $79, maybe lower

that nets you 50 GB per disc in dual layer (expensive-as-crap media) or 25 GB with a single-layer WORM disc (write once, read many); those go for around 50 bucks for 15, which gets you ~375 GB
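A quick sanity check of those numbers (prices are the estimates from this post, not quotes):

```shell
# 15 single-layer 25 GB BD-R discs per ~$50 spindle (figures from the post)
disks=15; gb_per_disk=25; pack_cost_usd=50
total_gb=$((disks * gb_per_disk))
echo "${total_gb} GB for \$${pack_cost_usd}"
```

That works out to roughly 13 cents per GB, before counting the burner itself.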

it’s not cost-effective yet, but this is its year for computer storage, IMO

Also - what do ordinary people do? Do they just lose all of their memories and data and such and “oh well”?

if my father-in-law is any proof, then yes

Something else I wonder about - are these drives supposed to be running as hot as they are?

NO!

drives are traditionally free-air; the idea behind the enclosures is that the enclosure acts as a heatsink, but it fails at that

the darn things run hot in your PC even with a fan blowing on them; putting them in an air-isolated pocket of metal is one of my pet peeves. The best success I had with an enclosure was when I filled it with thermal epoxy - too bad that’s a 40-gigger (be mindful of the breather hole, lol)

So frustrated, so angry, so upset, so much time lost that I could use doing other things…

It will get better :wink:

This is one of the reasons I don’t have a back-up solution.
I do make some backups: anything I can’t afford to lose gets put on a DVD, a USB stick / external USB HD, and on my webhost (off-site).
Mind you, I’ve learned to keep the amount of stuff I need to keep around to a minimum; Murphy’s law always applies…

When you can’t afford for something to go wrong, it will go wrong.
If it can be even worse to go wrong at a later time, it will go wrong at that later time.

The frustration with failing backup systems and such is something I’m avoiding =P (I’m the type that realises the backup system becomes more important than the workstation, except now you have a centralised point that can fail instead of one of the X workstations.)
Keeping it as simple as possible reduces the amount of things that can go wrong!

We had a problem at a company I used to work for with drives overheating. Some things I found that help the problem…

  1. Space the drives apart allowing for airflow.
  2. Add extra fans to the case.
  3. Make sure cables aren’t blocking airflow.
  4. Make sure you use all 4 screws to mount the drive (most people are too lazy, or just don’t think it is necessary). The most important thing this accomplishes is allowing the case to act as a heatsink. It also helps reduce vibrations that could cause damage.

In the end we just got 5 1/4" drive bay enclosures that had 2 built in fans. Something like this… http://www.newegg.com/Product/Product.aspx?Item=N82E16835119062

Do you also check for data integrity?

I started doing that for my private backups as well when I got into the ‘backup business’ in a former job.

Just recently I wanted to copy data from an old RAID-5 array to my USB disk and the thing failed on me. It had not been used for 2 years (the RAID box), and 2 disks failed… or at least the RAID controller didn’t like them anymore.

Now, I did make a copy of that data 2 years ago onto another USB disk and immediately ran an MD5 check on it, and the data was still fine. The big question is this:

“Was the data valid, when the MD5 sums were created?”
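The checksum-then-verify routine itself is short. A minimal sketch with GNU `md5sum` (on FreeBSD/FreeNAS the equivalent tools are `md5` or `mtree`; the file names here are just demo stand-ins):

```shell
# Create checksums alongside the data at backup time...
mkdir -p backup
echo "important data" > backup/notes.txt
( cd backup && find . -type f ! -name MD5SUMS -exec md5sum {} + > MD5SUMS )
# ...then, later and ideally on the copy, re-check every file:
( cd backup && md5sum -c MD5SUMS )
```

`md5sum -c` prints one `OK`/`FAILED` line per file and exits non-zero on any mismatch, so it drops straight into a cron job or backup script.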

My current backup strategy:

  • internet server runs daily backups using dirvish + md5sums
  • an old backup is stored at my parent’s - off-site backup :wink:
  • a copy of that one is on a 2nd usb disk at my place
  • at the office I use GIT for everything important
  • my office data is synced with my internet server and my home pc

So at the end of a working day 3 disks in 3 different machines in 3 different locations would have to fail to give me a serious headache. GIT effectively saves me from propagating stupid mistakes into my backups.
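That Git safety net is only a handful of commands. A minimal sketch (directory name and identity are placeholders):

```shell
# Put a directory under Git so mistakes can be undone from history
mkdir -p project
git -C project init -q
git -C project config user.email "you@example.com"   # placeholder identity
git -C project config user.name "You"
echo "draft" > project/report.txt
git -C project add report.txt
git -C project commit -q -m "first snapshot"
echo "oops" > project/report.txt        # a stupid mistake...
git -C project checkout -- report.txt   # ...recovered from history, not from a backup
```

The point is exactly what’s described above: a bad edit gets rolled back from the repository before it ever propagates into the nightly backups.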

You can’t be paranoid enough with digital data. I’m thinking about putting one disk into a bank vault - on second thought, make that two disks of different brands with different USB adapters.

Maybe I’ve posted this one before, but ‘The Tao of Backup’ is always a good read.

The joy of the impermanence of digital data in the real world. ;D

Tape backup? Seriously, even when buying the most expensive equipment and tapes, I always had it fail when it was most needed and ended up using a secondary backup system to do restores.

DVDs are untrustworthy unless you do a less than maximum speed write to allow the laser time to burn the bits and then do a data verify after the disk is written.

Which leaves you with some sort of USB hard disk method for backups, preferably two separate ones so you don’t get bitten by hardware failure of either device (once again bloating the expense and time required).

And have yet another system where you can do periodic data restoration and comparison to make sure your backup method A) works and B) is capable of restoring the data to a new system. Nothing stunk so much as having the need to restore an old tape and finding that in the duration, head wear and tolerance meant the tracks were displaced so the tape was unreadable, despite the fact you could read one done last week. One particularly painful restore required about 5 retensions to get one good read off the tape, after which it failed completely to be readable. DAT cured that problem, only to bring on a new set of failure modes.

I’ve come to believe in data spew, throw it in as many locations as possible and rebuild the core storage from the distribution when the system failure occurs. If something happens and takes the whole building out, you’re not much going to care anyway.

And also learn to recognize garbage data needs to just be bitbucketed. If it isn’t there, it ceases to be a concern.

If it isn’t there, it ceases to be a concern.

Except for the mental pain… every lost bit is like a hot needle poked into a finger. It is about time we get ZFS-like data integrity checks built right into file systems. I’m really surprised I can still read and view some of the data I created just 10 years ago - only to find out that, although it is valid, it’s utter garbage.

I have already experienced the same thing - a pair of hard drives dying on the same day - but my backup scheme was just an ethernet cable and drag-and-drop to the other computer.

So, is someone willing to teach or show me how to use Git (I’ve never used it) or Google Code to save some important work in a smarter way than going to the Google site and uploading? :-[

Also - what do ordinary people do? Do they just lose all of their memories and data and such and “oh well”?

I am using Mozy for my important stuff; I can always redownload the music and movies from the online stores later, so I’m skipping them :slight_smile:

But losing a drive always sucks… I have lost almost 1TB of stuff before. Big drives are just like a keychain: a device that allows you to lose all your keys at the same time…

I turned up to work one day to find all our computers and equipment missing - everything except for one very lonely mouse. Needless to say, everything is now backed up off-site. It was a very painful lesson to learn.

@kierin

Yep, offsite backup storage is very important. A bank safety deposit box for long term, home lockbox for the transient day-to-day with exchange to the safety deposit box the next day kind of takes care of that. As well as internet storage of your choice if you trust them…

Epilogue: I managed to recover all of my data!

:slight_smile: :slight_smile: :slight_smile: :slight_smile:

I’m also rethinking my backup strategy; I think I am going to go for a distributed backup - each workstation/server backing up to its own external hard drive - to eliminate the single point of failure. I might also continue keeping the backup to the server as well, if I can rig the workstations to do the dual-backup scheme.

Where did my data go? Well, I think I actually caused the worst of it myself, on the drive that held all my media. At some point I ran an fsck from the web interface; it was taking a while and I didn’t know if anything was happening - in my haste, worry, fear, confusion, doubt, anger, etc., I probably reset the system, leaving the file system on that drive in a very weird state. So, for S&G, I decided to run a full fsck (on both drives), be patient, and wait. At the console, a combo of ps and top proved that fsck was doing work, so I just let it sit. When it came back, all my files were sitting in lost+found. The other drive, where the backups were stored, didn’t have any issues (fsck found no problems).
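One tip for anyone else digging through lost+found: fsck names recovered files after their inode numbers (e.g. #12345), with no hint of what they are, so `file` is handy for triaging them before renaming. A quick sketch (the demo directory and file below are stand-ins):

```shell
# Stand-in for a lost+found directory full of inode-named recovered files
mkdir -p lf_demo
printf 'hello\n' > 'lf_demo/#12345'
# Identify each recovered file's contents before deciding what to rename it to
for f in lf_demo/*; do
  printf '%s: %s\n' "$f" "$(file -b "$f")"
done
```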

Since the drive my backups were going to had a SMART failure on boot, I decided the other drive (where the media is stored) couldn’t be too far behind. I had purchased a second set of 400 GB drives with the intention of building a second server, but I never got around to it (which turned out to be a good thing). I copied all the data to the second pair of drives, installed them, and all was good.

In addition, I added heat-sink/fan combos to each drive, plus an extra case ventilation fan (a largish PCI-slot fan). While testing fan output and drive temperature, I noticed that the PSU fan was not turning fast; I ended up pulling it out and re-lubing it - I’ll probably replace it in the near future (maybe the entire PSU). After that, the drive temperatures seemed better than before: they’re running now at about 37C, where the old drives were around 45C.

I was pretty worried there; ultimately a night’s sleep and some food made the next day clearer, and let me see some mistakes I had made earlier. One of the interesting ones was setting the jumpers on the new pair of drives - I had bought them fairly cheap from computergeeks.com ($40.00 for a 400 GB PATA drive), but they kept showing up as 32 GB drives when I installed them. It turns out there were two different jumper settings, one for “full drive capacity” and a separate one for a “32 GB cutoff” (presumably for older BIOSes that can’t handle large drives); in my haste and worry, I had selected the wrong one (and thought I had been ripped off!). Once I had some sleep and looked it over in the morning, I saw my mistake. I ended up taking back the Serial ATA interface and drive I had bought (saving me $100.00).

I am soooo glad that I won’t have to re-upload all of that data back onto the media drive. In the meantime it was an interesting lesson all the way around, in a lot of areas I hadn’t thought about. The biggest two would have to be: run fsck more often, and keep an eye on the SMART status messages - neither of which I was doing as well as I should have.

Finally, I still recommend FreeNAS as a great way to build a NAS box on the cheap.

:slight_smile:

I’ve lost important (well, sentimentally important) data far too often.

A few years ago I was running the family business (a small copy shop, also selling stationery and doing light PC repair) and had all my personal data on a 60Gb drive. One morning it failed - you can always tell by the sound - so I dug around and found another 60Gb drive to rescue it onto.
The PC spent the rest of the day chugging away at the failed drive and got almost all of my data back.

The next day the drive I’d rescued the data onto failed. I dug around and I had another 60Gb drive, so I did the same again.

On the third day, the third drive failed. I yanked it out of the PC and threw it in a pile with the other two - only then did I notice they were identical IBM drives. On closer inspection, they had consecutive serial numbers.

I don’t really back up any more. I try to keep my data on my always-on NAS box (I’ve learned my lesson there: if it doesn’t power down, don’t turn it off at the wall), and if it’s important I’ll leave it on my PC/laptop too.

To answer your question, cr0sh, cloud storage is the way to go. It should be backed up at least daily, and if there’s any hardware failure you shouldn’t ever notice.
Microsoft provides 25Gb attached to a Hotmail account (called SkyDrive) - though I believe it’s limited to a 50MB max filesize - for free, and there are commercial solutions.

Edit: USB drives, in my experience, have a higher failure rate than internal drives. I think it’s something about cheap caddies not powering them down properly.

FreeNAS is good but you have to remember the extra ongoing cost of running a whole machine vs. the initial cost of buying a low power ethernet drive caddy.

I have a QNAP NAS box and it’s amazing. I have a 1TB drive in it, and it does bittorrent, newsgroups, webserver, and FTP server, all in a box not much bigger than a hard drive and with a power consumption of about 15W. My only trouble with it is that 1TB doesn’t take long to fill when you’ve got a 50Mbps connection :smiley:

Hi there

Is a Drobo too expensive for you?

$399.00 for the standard version (4-bay, 2nd gen), as long as you provide the HDs.

Internal redundancy of all data. You need to trash a minimum of 2 HDs at the same time to lose data.

-Fletcher

You need to trash a minimum of 2 HDs at the same time to lose data

Easily done. Drive failure is one thing, fire, flood, etc is quite another.

Also, I suspect cr0sh’s FreeNAS is as capable as that Drobo.