How likely is a RAID array to actually lose data? Four numbers decide it — AFR, MTBF, URE and rebuild time — combined into MTTDL. Here's what each means and how RAID level changes the odds. See the URE part live in the RAID calculator.
Drive failure: AFR and MTBF
MTBF (mean time between failures) is the headline reliability figure on a drive datasheet — often 1.2–2.5 million hours. It sounds enormous, but it's a population statistic, not a promise for your drive. The more useful number is AFR (annualised failure rate): roughly the percentage of drives that fail per year. A 1.5M-hour MTBF works out to an AFR of about 0.58% (8,760 hours ÷ MTBF), though real-world AFR from large fleets is often higher, especially for ageing drives.
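The MTBF-to-AFR conversion above can be sketched in a few lines of Python. The function names are illustrative, not taken from any tool mentioned here. The second function assumes a constant (exponential) failure rate, which is a modelling assumption rather than something a datasheet guarantees.

```python
import math

HOURS_PER_YEAR = 8760

def afr_from_mtbf(mtbf_hours: float) -> float:
    """Simple approximation used in the text: hours per year divided by MTBF."""
    return HOURS_PER_YEAR / mtbf_hours

def afr_exponential(mtbf_hours: float) -> float:
    """AFR under an assumed constant failure rate (exponential lifetimes)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)

if __name__ == "__main__":
    for mtbf in (1_200_000, 1_500_000, 2_500_000):
        print(f"MTBF {mtbf:>9,} h  ->  AFR ~ {afr_from_mtbf(mtbf):.2%}")
```

For MTBF values in the millions of hours the two functions agree to within a few thousandths of a percentage point, which is why the simple division is usually good enough.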
The practical point: in an array of many drives, the chance that *some* drive fails this year is much higher than for one drive — which is the whole reason RAID exists.
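To put a number on that, here is a minimal sketch that assumes every drive fails independently with the same AFR. Real fleets often violate this (shared batches, shared enclosures), so treat it as a lower-bound intuition. The function name is hypothetical.

```python
def p_any_drive_fails(afr: float, n_drives: int) -> float:
    """Probability that at least one of n_drives fails within a year.

    Assumes independent failures with an identical annual failure rate.
    """
    return 1.0 - (1.0 - afr) ** n_drives

if __name__ == "__main__":
    afr = 0.0058  # about 0.58%, the 1.5M-hour MTBF example
    for n in (1, 4, 12, 24):
        print(f"{n:>2} drives: {p_any_drive_fails(afr, n):.2%}")
```

With a 0.58% AFR, a 12-drive array comes out at about 6.7% per year under these assumptions, more than ten times the single-drive figure.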
Read errors: URE
The second failure mode is the unrecoverable read error (URE): a sector the drive can't read back. Datasheets rate it at about 1 error per 10¹⁴ bits read for consumer HDDs, improving to 1 per 10¹⁶–10¹⁷ bits for SSDs. UREs matter most during a rebuild, when a single-parity array (RAID 5 / RAIDZ1) has no redundancy left, so a URE means data loss. Our calculator computes the URE-during-rebuild probability for your config, and "Is RAID 5 dead?" covers it in depth.
On large modern drives, URE-driven rebuild failure often dominates real-world data-loss risk — more than the idealised double-failure maths below.
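A rough sketch of the URE-during-rebuild estimate follows. It is not necessarily the formula the calculator uses. It assumes the rebuild reads every surviving drive in full, that bit errors are independent, and that the datasheet rate applies exactly. The function and parameter names are hypothetical.

```python
import math

def p_ure_during_rebuild(surviving_drives: int,
                         drive_tb: float,
                         bits_per_ure: float = 1e14) -> float:
    """Probability of at least one URE while reading all surviving drives.

    drive_tb is decimal terabytes (1 TB = 1e12 bytes).
    bits_per_ure is the rated interval, e.g. 1e14 for a consumer HDD.
    """
    bits_read = surviving_drives * drive_tb * 1e12 * 8
    # log1p/expm1 keep precision when the per-bit error probability is tiny.
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_ure))

if __name__ == "__main__":
    # Hypothetical 4-drive RAID 5 of 10 TB drives: 3 drives are read in full.
    for rate in (1e14, 1e15, 1e16):
        p = p_ure_during_rebuild(3, 10, rate)
        print(f"1 per {rate:.0e} bits: {p:.1%}")
```

For that hypothetical array the model gives roughly 91% at 1 per 10¹⁴ bits and roughly 21% at 1 per 10¹⁵. Datasheet rates are usually stated as upper bounds, so the real-world figure is likely lower, but the trend with drive size is the point.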
Putting it together: MTTDL
MTTDL (mean time to data loss) combines failure rate and rebuild time to estimate how long, on average, until an array loses data. The classic models capture the intuition: single-parity MTTDL scales roughly as MTBF² ÷ (N × (N−1) × MTTR), where N is the number of drives and MTTR is the rebuild (repair) time. Dual parity multiplies that by roughly another MTBF ÷ ((N−2) × MTTR). Because MTBF is vastly larger than MTTR, dual parity (RAID 6) is orders of magnitude more durable than single parity, and a shorter rebuild time directly improves durability.
Treat MTTDL as a comparative model, not a guarantee: it assumes independent, exponentially-distributed failures and usually ignores UREs, correlated batch failures and operator error — all of which matter in practice. Use it to rank levels (6 ≫ 5, triple ≫ dual), not to promise a date.
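The classic formulas can be sketched as follows, for comparison only. The dual-parity form is the usual textbook extension, MTBF³ ÷ (N × (N−1) × (N−2) × MTTR²), which is an assumption here since the text only describes it as an extra factor. Function names and the example inputs are illustrative.

```python
def mttdl_single_parity(mtbf_hours: float, n_drives: int, mttr_hours: float) -> float:
    """Classic single-parity model: MTBF^2 / (N * (N-1) * MTTR)."""
    return mtbf_hours ** 2 / (n_drives * (n_drives - 1) * mttr_hours)

def mttdl_dual_parity(mtbf_hours: float, n_drives: int, mttr_hours: float) -> float:
    """Textbook dual-parity model: MTBF^3 / (N * (N-1) * (N-2) * MTTR^2)."""
    return mtbf_hours ** 3 / (
        n_drives * (n_drives - 1) * (n_drives - 2) * mttr_hours ** 2
    )

if __name__ == "__main__":
    mtbf, n, mttr = 1_500_000, 8, 24  # hypothetical 8-drive array, 24 h rebuild
    single = mttdl_single_parity(mtbf, n, mttr)
    dual = mttdl_dual_parity(mtbf, n, mttr)
    print(f"single parity: {single:.3e} h")
    print(f"dual parity:   {dual:.3e} h")
    print(f"ratio:         {dual / single:,.0f}x")
```

With these inputs the dual-parity figure is about 10,000 times the single-parity one, and halving MTTR doubles the single-parity result. The absolute hours are not meaningful for the reasons given above; only the ranking is.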
What actually improves reliability
Three levers move the needle: more parity (RAID 6/RAIDZ2 over RAID 5/RAIDZ1, which survives a URE or a second drive failure mid-rebuild), shorter rebuilds (distributed RAID, smaller groups and hot spares all shrink the exposure window), and better drives (lower AFR, lower URE rate, enterprise over consumer). And none of it replaces a backup: MTTDL covers hardware failure, not deletion, corruption or ransomware.
Use the calculator to see fault tolerance, rebuild time and URE risk for any layout, then choose the level whose reliability matches the data's value.