lossless data archival, optical disk, RAID, what is best these days?

i am just curious, as i am thinking of setting up a backup server. mainly for rarely used MP3s, Arduino code, and if i actually manage to take some truly amazing astro, or nature/landscape pictures. but before i went into it, i was wondering what people use these days for lossless archival. in the past, i used recordable CDs, and just burned 2 or 3 extra copies for redundancy.

i don't like "cloud" storage, as i don't like somebody else controlling my data. perhaps i am just paranoid :o

ok, so i have like 8.78GB of music, and i only listen to perhaps 50 songs. but i don't want to delete the ones i don't listen to anymore.
I guess GitHub will work for my Arduino code. or i can eventually set up my own GitLab server at some point. but my music, i would really like to archive that, lossless, with redundancy.

so, the backup server i have in mind, has no hard-disks yet, but it has space for up to 6 SATA 3.5" drives. one for the OS, and 5 for archival.

my server board can only handle RAID 0, RAID 1, and RAID 10 (combo of 1 and 0, not ten). or would i be better off to get a RAID card with a different mode?
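
To compare the modes, a rough sketch of the usable-capacity arithmetic may help. It assumes identical drives and textbook formulas. The 4 TB drive size is a made-up figure, RAID 5 and RAID 6 are included only as examples of what a separate RAID card might offer, and `usable_capacity` is a hypothetical helper, not part of any RAID tool.

```python
def usable_capacity(level, drives, size_tb):
    """Usable capacity in TB for `drives` identical disks of `size_tb` each.

    Textbook formulas only; a real controller may impose extra limits.
    """
    minimum = {"RAID0": 2, "RAID1": 2, "RAID10": 4, "RAID5": 3, "RAID6": 4}
    if level not in minimum:
        raise ValueError(f"unknown RAID level: {level}")
    if drives < minimum[level]:
        raise ValueError(f"{level} needs at least {minimum[level]} drives")
    if level == "RAID0":
        return drives * size_tb          # striping, no redundancy
    if level == "RAID1":
        return size_tb                   # every drive holds the same data
    if level == "RAID10":
        if drives % 2:
            raise ValueError("RAID10 needs an even number of drives")
        return drives // 2 * size_tb     # striped mirror pairs
    if level == "RAID5":
        return (drives - 1) * size_tb    # one drive's worth of parity
    return (drives - 2) * size_tb        # RAID6: two drives' worth of parity


if __name__ == "__main__":
    SIZE_TB = 4.0  # hypothetical drive size
    for level, drives in [("RAID0", 5), ("RAID1", 2), ("RAID10", 4),
                          ("RAID5", 5), ("RAID6", 5)]:
        print(level, drives, "drives ->",
              usable_capacity(level, drives, SIZE_TB), "TB usable")
```

With five archival bays, RAID 10 can only use four of them, since it needs an even number of drives.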

the really old stuff, i could just burn to a CD or DVD, and file it away. so the backup server is mainly for random-access archival.

or would i be better to just burn stuff to a DVD and catalog it so i know where to find it?
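
For the burn-and-catalog route, the catalog could be a plain CSV file that is updated before each burn. A minimal sketch, assuming the files for one disc are staged in a folder first. The disc label, the CSV layout, and all function names here are made up for illustration.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def catalog_disc(staging_dir, disc_label):
    """Return one row per file: (disc label, relative path, size, SHA-256)."""
    root = Path(staging_dir)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rows.append((disc_label,
                         path.relative_to(root).as_posix(),
                         path.stat().st_size,
                         sha256_of(path)))
    return rows


def append_to_catalog(rows, catalog_csv):
    """Append the rows for one disc to the running catalog file."""
    with open(catalog_csv, "a", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows(rows)


def find_in_catalog(catalog_csv, needle):
    """Return catalog rows whose path contains `needle` (case-insensitive)."""
    with open(catalog_csv, newline="", encoding="utf-8") as fh:
        return [row for row in csv.reader(fh)
                if needle.lower() in row[1].lower()]
```

The stored checksum also gives a way to check later whether a file read back from a disc is still intact.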

~Travis

I've found that recordable disks don't hold up very well over the long term. I think offline magnetic hard drives have more longevity. They are also much easier to deal with, and faster, when you have more data to back up than will fit on a single disk.

RAID really isn't a backup strategy, it's more of a solution for avoiding downtime caused by hard drive crashes. If you want to preserve data you need to have it offline and off-site. Otherwise a fire, natural disaster, virus, or power surge can take out all your drives at once regardless of how many online backup copies you have.

travis_farmer:
i don't like "cloud" storage, as i don't like somebody else controlling my data. perhaps i am just paranoid :o

Then you are not alone in being paranoid.

Now that the subject has been raised, how do USB sticks compare with off-line hard drives for memory longevity?

Of course, another factor for long term storage is the availability of the required playback mechanism. Do you still have a VHS player? I don't. What about a player for the little video tapes from video cameras?

...R

none of the data you are storing are sensitive, so why not the cloud?

pert:
I've found that recordable disks don't hold up very well over the long term.

That is exactly the opposite of my experience. Verbatim even has a no-time-limit warranty on their optical media. I do know, from testing, that ultraviolet exposure damages both written and unwritten discs. Have you been sunbathing with your backups?

Verbatim even has a no-time-limit warranty on their optical media.

Sweet, so when your data is lost you can recover the $0.50 you paid for the disk it was on!

pert:
Sweet, so when your data is lost you can recover the $0.50 you paid for the disk it was on!

That's one perspective.

Another is, if the discs had even a modest failure rate Verbatim would be out of the optical disc business. They would be returning the retail price which is significantly higher than the wholesale price.

In any case, we've burned hundreds of Verbatim discs and have had zero failures (other than the ones intentionally damaged from testing).

Humans are lazy. We like the warm fuzzy feeling we get when buying a product with a lifetime guarantee but very few ever actually go through the trouble to hold the companies to it. I have found that if you actually do contact a company with an issue they're very good about dealing with it but I've also slacked on quite a few things I could have returned. Actually that reminds me I have some bad refurbished printer cartridges I'm supposed to send back for a refund. I even go to the post office daily with packages and have all the shipping supplies laid out and I still keep putting it off.

Coding Badly, surely you are not that naive? There are many products that offer a full warranty that are complete pos.

AvE

travis_farmer:
i am just curious...

What we do for things that change frequently (like source code)...

• Copy the contents (e.g. git repositories) to a TrueCrypt volume

• The TrueCrypt volume is sized to precisely fit a single layer DVD

• Burn the TrueCrypt volume to DVD

• Upload the TrueCrypt volume to a website

• When convenient, the DVDs are moved to a safe deposit box

There is one "live" copy (e.g. git repository) stored on a harddrive. There is one encrypted copy stored on another harddrive. There is one encrypted copy on optical media which is eventually stored offsite. There is one encrypted copy stored in the cloud. The process is automated with Python (except the trip to the bank).
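
The Python automation itself is not shown in this thread, so the following is only a guess at what one step (the copy to the second hard drive) might look like, not the actual script. The file and folder names are hypothetical, and the checksum comparison is an added assumption.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a DVD-sized volume need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(volume_path, backup_dir):
    """Copy the volume file into backup_dir and confirm the copy matches."""
    source = Path(volume_path)
    target = Path(backup_dir) / source.name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        target.unlink()  # do not leave a bad copy lying around
        raise IOError(f"copy of {source.name} does not match the original")
    return target
```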

What we do for things that rarely change (like images)...

• Copy the contents to a TrueCrypt volume

• The TrueCrypt volume is sized to precisely fit optical media (single layer DVD; dual layer DVD; Blu-ray disc)

• Burn the TrueCrypt volume to optical media

• Upload the TrueCrypt volume to a website

• When convenient, the optical media is moved to a safe deposit box
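
When a collection is larger than one disc, the files have to be grouped into disc-sized sets before the volumes are built. A minimal sketch of one way to do that (first-fit decreasing), which is an assumption and not necessarily how the process above does it. The capacity constant is the nominal 4.7 GB figure; the space actually usable inside an encrypted volume will be somewhat smaller, so leave headroom.

```python
# Nominal single layer DVD capacity; an approximation, not an exact figure.
DVD_SINGLE_LAYER_BYTES = 4_700_000_000


def pack_into_volumes(file_sizes, capacity=DVD_SINGLE_LAYER_BYTES):
    """Group files into as few volumes as a simple greedy pass manages.

    file_sizes maps file name -> size in bytes. Largest files are placed
    first, each into the first volume that still has room for it.
    """
    volumes = []  # each entry: {"free": bytes left, "files": [names]}
    by_size = sorted(file_sizes.items(), key=lambda item: item[1], reverse=True)
    for name, size in by_size:
        if size > capacity:
            raise ValueError(f"{name} does not fit on a single volume")
        for volume in volumes:
            if volume["free"] >= size:
                volume["files"].append(name)
                volume["free"] -= size
                break
        else:
            # No existing volume had room, so start a new one.
            volumes.append({"free": capacity - size, "files": [name]})
    return [volume["files"] for volume in volumes]
```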

Retrieving content is trivial. Get the disc. Load the disc. Mount the TrueCrypt volume. Have at it.

If a disc fails or the bank is closed, use the TrueCrypt volume stored on a harddrive.

If that fails or the harddrive copy has been deleted, download the TrueCrypt volume from the cloud.
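
That fallback order can be sketched as a small helper that tries each location in turn and accepts the first copy whose checksum matches. This is an illustration under assumptions: the paths and labels are hypothetical, the expected checksum is assumed to have been recorded when the volume was created, and the cloud download is assumed to have been saved to a local file already.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a large volume need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_good_copy(candidates, expected_sha256):
    """Return (label, path) of the first copy that exists and is intact.

    candidates is an ordered list of (label, path) pairs, for example
    disc first, then hard drive, then the file downloaded from the cloud.
    """
    for label, path in candidates:
        path = Path(path)
        if not path.is_file():
            continue  # disc not loaded, copy deleted, and so on
        if sha256_of(path) == expected_sha256:
            return label, path
    raise FileNotFoundError("no intact copy of the volume was found")
```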

a loss is no big deal to them

Except they are in the business of making people feel comfortable and secure in uploading data that actually matters, not simply easily replaceable MP3 files and pictures of obscure places in Maine. I'm pretty sure if AWS had issues with data loss no one would use it. Most cloud backup services can recover your data for you after you delete it off their server.....

Off-line the best way to back up your stuff long term is to create several different back ups on different forms of media and then put them in safes at two different locations.

Qdeathstar:
Most cloud backup services can recover your data for you after you delete it off their server.....

I wonder why I find that very worrying ...

...R

^mix of paranoia and narcissism. :) It's normal though.

Another factor that is worth considering is the difference between corporate data and personal data. I can't imagine anyone being concerned if all my data is lost after I am dead. But a corporation (or even a club) will probably want its data to be available far beyond the "membership" period of any single employee or member.

I suspect very few small businesses or clubs take sufficient cognizance of long term retention and accessibility of data.

And something that I find very frustrating is the inability of a computer program (such as a backup program) to {a} tell which info is useless and does not need to be retained and {b} recognize that I have just moved a file to a different folder and there is no need to make another backup copy of it - without me having to take specific steps to let it know. Yes of course I could organize the useless info into a separate folder and exclude that from the backup process - but there are many situations where it is useful to have all the bits in one folder. [/rant]
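
Point {b} is the more tractable of the two: if files are identified by a hash of their content rather than by their path, a moved file shows up as content that is already backed up. A minimal sketch of that idea, with hypothetical function names; it says nothing about point {a}, and an exact duplicate of an existing file is reported the same way as a move.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_tree(root):
    """Map relative path -> SHA-256 for every file under root."""
    root = Path(root)
    return {path.relative_to(root).as_posix(): sha256_of(path)
            for path in root.rglob("*") if path.is_file()}


def plan_backup(previous, current):
    """Compare two indexes and return (to_copy, moved, unchanged).

    previous is the index saved at the last backup, current is the index
    of the folder now. Only the paths in to_copy need new backup data.
    """
    known = {}
    for path, digest in previous.items():
        known.setdefault(digest, path)
    to_copy, moved, unchanged = [], [], []
    for path, digest in sorted(current.items()):
        if previous.get(path) == digest:
            unchanged.append(path)
        elif digest in known:
            moved.append((known[digest], path))  # same content, new location
        else:
            to_copy.append(path)
    return to_copy, moved, unchanged
```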

...R

GitHub is an excellent host for Git repositories but I don't rely on them to preserve my data, that's not the reason I use their service. I have a backup strategy for that and my repositories are backed up along with all my other files even if they are on GitHub.

I do think it's probably unlikely GitHub will lose any data but it certainly has happened to others in the past. In fact, speaking of GitLab, they had a colossal screwup and actually did permanently lose some user data:

If you read the postmortem you can see that reasonable precautions to prevent this sort of thing were not taken. I do think it's admirable how transparent they were about the whole thing.

@Travis, don't be insulted. My point is that these companies' sole reason for existing is to prevent data loss. That's why they exist. So obviously data loss would be a very big deal to them, since it would undercut their reason for existing. Also, it's not like you are storing state secrets.

You install/build cabinets. They store data. Say you installed cabinets shoddily and they wound up on the floor. Say they didn't properly maintain their data server and you lost your data. Do you think you would care about your customers' cabinets or not?

Qdeathstar:
My point is that these companies' sole reason for existing is to prevent data loss. That's why they exist.

I think they actually exist to make a profit and have figured that offering data storage is a way to do it. How long do you think they would care about your data if they start losing money? One day, an apparently flourishing company. The next day the receivers are called in. And where is the company registered? How would you make a claim for loss against the receivers? What legal system would apply?

Vapourware seems like the best description. :)

And, to link back to my comment in Reply #13, if I can't delete my own data permanently then who really controls the data?

And how can we know whether some mafia types really control the data storage business just so they can browse your data at their convenience?

By all means use cloud storage for data that you want to share publicly.

...R

Their desire to make a profit is what will ensure your data isn't lost.

Qdeathstar:
Their desire to make a profit is what will ensure your data isn't lost.

True. But only up to the point where they stop making a profit. Then all bets are off.

Maybe you should pay them more just to stave off their bankruptcy. But how do you know that the partners are not squandering your money on huge sail boats?

I can't immediately think of any older (like 50 or more years in existence) activity that is equivalent to trusting one's data to someone invisible. You don't know who the directors of the company are. You don't know where the company is registered. You don't know where the data servers are. Back in the day these were referred to as snake-oil salesmen.

Can you buy an insurance policy from any insurance company regulated in the UK, the EU or the USA which would promise you a payment equal to the value to you of your data if the data is lost by the cloud company within the next 25 years? (or even within the next 2 years?)

...R

It's your responsibility to use a reputable company and keep track of how the company is doing... Amazon AWS isn't going anywhere for at least the next five years.... Microsoft Cloud is also safe...