I ran into an interesting problem last night after I came home from work. I booted my computer and walked away as usual. When I came back, Vista was happily screwing up files on my 1TB data drive.
Of course, Windows is the last thing you want verifying the integrity of your data on a hard drive. By this point the damage was already done. Windows had decided that a number of files no longer existed and promptly did something to make them disappear.
I checked the RAID array -- it was fine. I ran a verify/rebuild on the data and it came back clean. The answer seemed to be Windows screwing something up in the file allocation table.
After some wrangling with a backup set of data and vice-versa (a file compare program), I think we're OK. I think... though you can't exactly be sure. One nice thing about vice-versa is that it will calculate CRC data on your source and target files. This seems a reasonable way to determine if the files are the same.
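To show the kind of check a CRC comparison performs, here is a minimal Python sketch. It is not how vice-versa works internally (I don't know that); the function names `crc32_of_file` and `compare_trees` are made up for illustration, and it assumes the backup mirrors the source's folder layout.

```python
import zlib
from pathlib import Path


def crc32_of_file(path, chunk_size=1 << 20):
    """Return the CRC-32 of a file, reading in chunks to limit memory use."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def compare_trees(source_dir, target_dir):
    """Compare each file under source_dir with its counterpart in target_dir.

    Returns (missing, different): relative paths absent from the target,
    and relative paths whose CRC-32 values do not match.
    Assumption: the target mirrors the source's directory layout.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    missing, different = [], []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = target_dir / rel
        if not dst.is_file():
            missing.append(rel.as_posix())
        elif crc32_of_file(src) != crc32_of_file(dst):
            different.append(rel.as_posix())
    return missing, different
```

Keep in mind what this does and doesn't tell you: matching CRCs mean the two copies agree with each other, not that either one is the file you originally saved.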
This problem, of course, got me thinking about how to deal with long term data backup. I don't think I'm completely alone in this. For example, how do you make sure your 5 years' worth of digital pictures, of which you have only digital copies, are really OK? How do you know that you haven't just backed up corrupted data? It makes me wonder how much data the average person will lose over the years to computer problems. Dingy Dan and silly Sally will be quite disappointed when they realize all their family vacation pictures are just a bunch of fucked up files.
Personally, I have a box of floppy disks I haven't looked at for maybe 5 years. Besides the physical problems of long term magnetic storage, there's a more practical problem -- I don't have a floppy drive anymore. I could get one, though. But if I wait 20 more years I might be out of luck. First of all, even if I could find a floppy drive, the interface to connect it to my computer might no longer exist. Second, and more insidious, the physical medium might not last that long.
I have about 300G of data. This may be a lot for some, and not much for others. Most of it is music (from my CD collection, entered with quite a lot of work on my part), pictures, and some video. I do generate backups of my data -- but they go on an external hard drive in a rotating set. By rotating set, I mean that since my backup drive isn't infinitely large, old backup sets eventually get deleted to make room for new ones.
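The rotation itself is simple enough to sketch in a few lines of Python. This is only an illustration, not what my backup software actually does: the function name `sets_to_delete`, the set names, and the sizes (in gigabytes) are all made up.

```python
def sets_to_delete(existing_sets, new_set_size, capacity):
    """Pick the oldest backup sets to drop so a new set fits on the drive.

    existing_sets: list of (name, size) tuples, ordered oldest first.
    new_set_size and capacity use the same unit as the sizes (e.g. GB).
    Returns the names of the sets that would have to be deleted.
    """
    if new_set_size > capacity:
        raise ValueError("new backup set is larger than the drive")
    used = sum(size for _, size in existing_sets)
    doomed = []
    for name, size in existing_sets:  # oldest first
        if used + new_set_size <= capacity:
            break
        doomed.append(name)
        used -= size
    return doomed


# Hypothetical example: three 300G sets on a 1000G drive, new 300G set coming.
print(sets_to_delete([("jan", 300), ("feb", 300), ("mar", 300)], 300, 1000))
```

With those numbers only the oldest set has to go. Nothing in that decision looks at whether the set being thrown away was the last good copy of anything.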
I don't have any idea whether I'm backing up corrupted data. Maybe some pictures/files/writings/projects are already lost. How would I know? It's not practical to verify the integrity of 250G worth of data by hand.
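One way to take the "by hand" out of it is to record a checksum for every file once, then re-check against that record before each backup. Here is a rough Python sketch of the idea. The function names and the manifest format are my own invention, and it assumes the manifest file is kept somewhere outside the folder being scanned.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {p.relative_to(root).as_posix(): sha256_of_file(p)
            for p in sorted(root.rglob("*")) if p.is_file()}


def save_manifest(manifest, manifest_path):
    """Write the manifest as JSON (store it outside the scanned folder)."""
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))


def load_manifest(manifest_path):
    """Read a manifest previously written by save_manifest."""
    return json.loads(Path(manifest_path).read_text())


def check_against_manifest(root, manifest):
    """Return (changed, missing, new) relative paths versus an old manifest."""
    current = build_manifest(root)
    changed = sorted(k for k in manifest
                     if k in current and current[k] != manifest[k])
    missing = sorted(k for k in manifest if k not in current)
    new = sorted(k for k in current if k not in manifest)
    return changed, missing, new
```

A changed checksum doesn't prove corruption, since it also fires when you edit a file on purpose. But pictures and music rarely change once they're saved, so anything that shows up as changed or missing there deserves a look before it overwrites a good backup.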
250G is about 50 DVDs' worth of data, so it's not likely that I'll burn all that data to a set of DVDs. You can buy a 500G hard drive for 60 dollars, though. Mirror your data, and put the drive somewhere. Somewhere meaning probably in a safety deposit box at a bank.
If you do that a few times a year, maybe, maybe, maybe you'll get some decent backups in there that might survive the next 5-10 years. That is, if you can still plug the hard drive into some ancient SATA interface card you bought on eBay for 5 bucks.
My high school writing assignments, all of which were backed up onto 5 1/4 inch Apple IIe floppies, are probably goners... even if I could still find the disks.