Hands-On Testing and Analysis

Talking Data Protection – Here and at vBrownBag

Now that we’ve finished talking about RAID, it seems like a good time to both recap and announce that our friends at vBrownBag have asked me to turn this little series of blog posts into a month of vBrownBag webinars this January.

Why Talk RAID in 2017?

I was inspired, OK obsessed, to write this series returning to first principles after two events this summer. First, I had a little disagreement with a vendor fanboi who was convinced that the unicorn tears spilled over every one of his employer’s appliances exempted them from the basic math governing single parity.

I then had a conversation with Datrium founder Hugo Patterson about erasure coding and the various causes of data loss in a storage system. Hugo and a couple of his fellow PhDs at Datrium wrote a paper, N+1: Myth and Fact, that reviewed much of the literature on the subject and came to the same general conclusion I have maintained for years: the risk of an error of any kind when rebuilding after a disk failure is just too high with a single parity system for tier-1 workloads.

Craig Nunes, who runs marketing at Datrium, sent me the paper and asked if I could explain the concepts and back up the math for people who, unlike Hugo and his co-authors, don’t have advanced degrees. I promised to write a Technology Brief (I’m just about there, Craig). To make sure I got all the math(s) right, I brought in our favorite mathematician, Dr. Rachel Traylor. Together we did our own literature review and built a data loss probability calculator that I should have online in a week or two.

I’m busily working on the Technology Brief, but at the same time I was obsessed with getting all the way back to basic principles.

Our Series So Far

To date there have been eight installments in the All About Data Protection Series:

In Chapter One – Where Did RAID Come From we looked at the technological drivers that led to the development of RAID and introduced the Patterson, Gibson, Katz paper that established the common RAID terminology.

The second installment, How RAID Works Stripes and Mirrors, introduced the most basic RAID concepts: the striping and mirroring of levels zero and one.

For Part 2¾ – A Few Words On Parity we stopped describing RAID levels to spend a little time explaining XOR and the math behind what we casually refer to as parity.

In Chapter Three – Parity RAID we discussed single parity RAID, levels 4 and 5, and admittedly glossed over RAID levels 2 and 3 since they’re pretty much obsolete with today’s drives that have LBA and large caches built in.

Chapter 4 – RAID6 – Double The Parity, Double The Fun introduces RAID6, a term we’re happy to apply to any double parity scheme that operates at the disk level.

For Part 5 – From Patterson to Products I reviewed the state of RAID in the early to mid-’90s and, more importantly, introduced the LUN as a term of art for a logical disk.

In Part 6 – Nested and Combination RAID we looked at how storage architects combine RAID levels to get greater control over the capacity, performance, and resilience of their systems.

We return to the math of single and double parity systems for Part 6 ½ – Thinking About Parity and Read-Modify-Write, where we determine, and Dr. Rachel proves, that a single parity system can be optimized to 4 I/Os for a small write. Double-parity systems can perform the task in as few as 6 I/Os.
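For readers who want a quick refresher on that 4 I/O figure, here is a minimal Python sketch of a single parity read-modify-write. It is an illustration, not code from the series: the CountingStripe class, the xor_blocks helper and the two-byte blocks are all made up for this example, and the "disks" are just byte strings in memory with a counter standing in for real I/O.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR any number of equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class CountingStripe:
    """Hypothetical single parity stripe that counts simulated disk I/Os."""

    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = xor_blocks(*self.data)  # full-stripe parity
        self.ios = 0

    def small_write(self, index: int, new_block: bytes) -> None:
        old_data = self.data[index]   # I/O 1: read the old data block
        self.ios += 1
        old_parity = self.parity      # I/O 2: read the old parity block
        self.ios += 1
        # XOR-ing the old data out of the parity and the new data in means
        # the other data blocks in the stripe never have to be read.
        new_parity = xor_blocks(old_parity, old_data, new_block)
        self.data[index] = new_block  # I/O 3: write the new data block
        self.ios += 1
        self.parity = new_parity      # I/O 4: write the new parity block
        self.ios += 1


stripe = CountingStripe([b"\x01\x02", b"\x10\x20", b"\xaa\xbb"])
stripe.small_write(1, b"\xff\x00")
print(stripe.ios)  # number of simulated I/Os for one small write

# Rebuild the block on "disk" 1 from the parity and the surviving blocks.
rebuilt = xor_blocks(stripe.parity, stripe.data[0], stripe.data[2])
print(rebuilt == stripe.data[1])
```

The same counting explains the double parity figure: a second parity block adds one more read and one more write, for 6 I/Os. The sketch leaves out the second parity calculation itself, which needs more than plain XOR.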

Data Protection Month at vBrownBag

My friends at vBrownBag run a terrific webinar series, and they’ve asked me to turn this series of blog posts into four webinars in their weekly slot, Wednesdays at 7:30 PM (19:30 CST, 01:30 UTC), on January 3, 10, 17 and 31. Like the blog posts, the webinars will follow my professional path through storage history and dive deep into the technologies and the math(s) behind them.

vBrownBag is a volunteer operation, members of the community giving back to the community. They run regular webinars from the US, Australia, Europe and Latin America en Español covering topics across enterprise IT with a virtualization focus. For details see



Disclosure: Datrium is a client of DeepStorage for the calculator and technology brief mentioned above and a member of the Better Benchmark Council, but not a sponsor of this blog post.