Brad Ideas: Crazy ideas, inventions, essays and links from Brad Templeton
I've been fortunate enough not to have had any HD failures yet (knock on wood), so I don't know this: What's the typical failure mode for the newer, inexpensive drives? Is it really bit failures, randomly strewn across the device? Block failures? Complete failures?
Do we really need a RAID solution to address it? (Granted, RAID does provide an idealised solution.) Or would a continual, incremental backup running in the background (fed, for example, from a journalling or log-structured file system), with "instant, on-demand restore", fit the bill? The latter, of course, comes with a time lag on backup, opportunities for data compression, and likely a momentary delay on restore.
Because such a system would buffer file churn, I/O load should in theory be levelled out over a longer interval as well, making the approach more feasible over limited-bandwidth pipes. A bit fault could be corrected fairly quickly; a complete HD failure would take some time to recover from, but would be recoverable.
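To make the "buffer file churn" point a bit more concrete, here is a rough sketch of what I have in mind. Everything in it is hypothetical (the `ChurnBuffer` class, the block ids, the per-flush `budget`); it is not how any particular backup product works. Repeated writes to the same block are coalesced locally, so only the latest version ever goes over the wire, and each flush sends at most a fixed number of blocks.

```python
class ChurnBuffer:
    """Hypothetical write buffer that coalesces churn before upload."""

    def __init__(self):
        self.pending = {}      # block_id -> latest contents not yet uploaded
        self.writes_seen = 0   # total writes observed, for comparison

    def record_write(self, block_id, data):
        # A later write to the same block simply replaces the earlier one,
        # so the intermediate versions never need to be transferred.
        self.pending[block_id] = data
        self.writes_seen += 1

    def flush(self, budget):
        # Hand back at most `budget` blocks for upload and forget them.
        # Dicts keep insertion order, so the blocks that have been dirty
        # the longest go first.
        batch = list(self.pending.items())[:budget]
        for block_id, _ in batch:
            del self.pending[block_id]
        return batch


buf = ChurnBuffer()
buf.record_write(1, b"v1")
buf.record_write(2, b"a")
buf.record_write(1, b"v2")
buf.record_write(1, b"v3")
first_batch = buf.flush(budget=1)   # one block per flush on a slow link
second_batch = buf.flush(budget=1)
```

Four writes turn into two uploads here. The price is exactly the backup time lag mentioned above: anything still sitting in `pending` when the drive dies is lost.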
For that matter, are there any popular error-correcting codes that could supplement this? E.g. storing the ECC data offsite while keeping the source data local, and retrieving only the redundancy data to correct the local data when a local error is detected?
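As a toy illustration of that idea, the sketch below keeps the data blocks local and treats a small redundancy record (per-block CRC32 checksums plus one XOR parity block) as the part that would live offsite. This is my own simplification, not a real ECC: a single XOR parity block is the RAID-5 style trick and can rebuild only one damaged block per group, whereas something like Reed-Solomon would tolerate more. The function names and the tiny block size are made up for the example.

```python
import zlib

BLOCK_SIZE = 4  # unrealistically small, just to keep the demo readable


def split_blocks(data, block_size=BLOCK_SIZE):
    # Zero-pad so that every block has the same length.
    padded = data + b"\x00" * (-len(data) % block_size)
    return [padded[i:i + block_size] for i in range(0, len(padded), block_size)]


def xor_blocks(blocks):
    # Byte-wise XOR of equally sized blocks.
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def make_redundancy(blocks):
    # This record is what would be stored remotely.
    return {
        "checksums": [zlib.crc32(b) for b in blocks],
        "parity": xor_blocks(blocks),
    }


def find_bad_blocks(blocks, checksums):
    return [i for i, (b, c) in enumerate(zip(blocks, checksums))
            if zlib.crc32(b) != c]


def repair(blocks, redundancy):
    bad = find_bad_blocks(blocks, redundancy["checksums"])
    if not bad:
        return list(blocks)
    if len(bad) > 1:
        raise ValueError("one parity block can rebuild only one bad block")
    # XOR of the parity with all surviving blocks yields the missing one.
    survivors = [b for i, b in enumerate(blocks) if i != bad[0]]
    repaired = list(blocks)
    repaired[bad[0]] = xor_blocks(survivors + [redundancy["parity"]])
    return repaired


original = split_blocks(b"hello world!")
remote = make_redundancy(original)

damaged = list(original)
damaged[1] = bytes([damaged[1][0] ^ 0x01]) + damaged[1][1:]  # flip one bit

restored = repair(damaged, remote)
```

The redundancy record is much smaller than the data it protects, which is the attraction over a limited-bandwidth pipe. In practice the checksums could just as well be kept locally for detection, with only the parity fetched when a mismatch turns up.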
Of course, if the nature of the typical failure (and thus the nature of the demand for data restoration) demands more immediate results, then a RAID solution would be the better approach.