Sunday, October 10, 2010

Automatic Consistency for Disk Storage

I was going through my grad school work and stumbled upon a piece of research that I did early on and that never saw the light of day.  I think it's fair to say that this is one of the most complex prototypes I built in grad school, but ironically I never got to publish it, for lack of a compelling motivation for the idea.  It is part of my dissertation, though it's buried somewhere deep in the 125 pages.  I thought it fit to give it at least some credit by putting up a blog post about it.

The work makes a case for ensuring semantic consistency of data at the disk level, using the additional knowledge of block pointers as proposed by Type-Safe Disks (OSDI '06).  Preserving data consistency in the event of unexpected system crashes is known to be one of the key challenges in disk storage.  In many cases, on-disk data becomes completely unusable unless it conforms to certain software-specific invariants that define its consistency.  For example, an on-disk B-Tree with dangling pointers in some of its nodes cannot be used to locate data items.

Today's consistency mechanisms operate at the software level, leaving disks totally oblivious to whether the data they store is in a consistent state.  Knowledge of consistency at the disk level enables interesting functionality that traditional disks cannot provide.  We built a new disk system that we call an ACE-Disk (Automatic Consistency Enforcing Disk), a disk that preserves the semantic consistency of stored data.  In our approach, the disk system takes responsibility for consistency management, and is thus empowered to provide consistency-aware functionality such as data snapshotting.  Applications simply inform the disk about the relationships between blocks that they already know about.
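
To make the idea a bit more concrete, here is a toy in-memory sketch in Python of a disk that is told about block pointers and only snapshots consistent states.  This is not the ACE-Disk interface: the class name PointerAwareDisk and the methods write, add_pointer, dangling_pointers, is_consistent and snapshot are all made up for illustration, and "consistent" is simplified here to mean "no declared pointer refers to a block that was never written".

```python
class PointerAwareDisk:
    """Toy model of a disk that knows which blocks point to which.

    Hypothetical illustration only; not the ACE-Disk prototype's API.
    """

    def __init__(self):
        self.blocks = {}       # block number -> data written to that block
        self.pointers = set()  # (source block, destination block) pairs

    def write(self, block, data):
        """Store data in a block, as an ordinary disk would."""
        self.blocks[block] = data

    def add_pointer(self, src, dst):
        """Application tells the disk that block src refers to block dst."""
        self.pointers.add((src, dst))

    def dangling_pointers(self):
        """Return declared pointers whose destination was never written."""
        return sorted((s, d) for s, d in self.pointers if d not in self.blocks)

    def is_consistent(self):
        """Simplified invariant: no pointer may dangle."""
        return not self.dangling_pointers()

    def snapshot(self):
        """Return a copy of the blocks, or None if the state is inconsistent."""
        return dict(self.blocks) if self.is_consistent() else None


if __name__ == "__main__":
    disk = PointerAwareDisk()
    disk.write(1, b"root node")
    disk.add_pointer(1, 2)            # root refers to a child not yet on disk
    print(disk.dangling_pointers())   # the (1, 2) pointer dangles here
    disk.write(2, b"leaf node")
    print(disk.is_consistent())       # child is now on disk
```

The point of the sketch is only that once the disk sees the pointer relationships, it can judge for itself whether a given state is safe to snapshot, without understanding the B-Tree or file system that produced it.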

Here is a link to the full manuscript of the work for those interested.