Archives

Date

The dark ages of lost data are over

For much of history, we’ve used removable media for backup. We’ve used tapes of various types, floppy disks, disk cartridges, and burnable optical disks. We take the removable media and keep a copy offsite if we’re good, but otherwise they sit for a few decades until they can’t be read, either because they degraded or we can’t find a reader for the medium any more.

But I now declare this era over. Disk drives are so cheap — 25 cents/gb and falling, that it no longer makes sense to do backups to anything but hard disks. We may use external USB drives that are removable, but at this point our backups are not offline, they are online. Thanks to the internet, I even do offsite backup to live storage. I sync up over the internet at night, and if I get too many changes (like after an OS install, or a new crop of photos) I write the changes to a removable hard disk and carry it over to the offsite hard disk.

Of course, these hard drives will fail, perhaps even faster than CD-roms or floppies. But the key factor is that the storage is online rather than offline, and each new disk is 2 to 3 times larger than the one it replaced. What this means is that as we change out our disks, we just copy our old online archives to our new online disk. By constantly moving the data to newer and newer media — and storing it redundantly with online, offsite backup, the data are protected from the death that removable media eventually suffer. So long as disks keep getting bigger and cheaper, we won’t lose anything, except by beng lazy. And soon, our systems will get more automated at this, so it’s hard to set up a computer that isn’t backed up online and remotely. We may still lose things because we lose encryption keys, but it won’t be for media.

Thus, oddly, the period of the latter part of the 20th century will be a sort of “dark ages” to future data archaeologists. Those disks will be lost. The media may be around, but you will have to do a lot of work to recover them — manual work. However, data from the early 21st onward will be there unless it was actively deleted or encrypted.

Of course this has good and bad consequences. Good for historians. Perhaps not so good for privacy.