A rebuild (or ZFS resilver) is when the array reconstructs the contents of a failed drive, and it's the riskiest moment in an array's life. Here's how long it takes, why it's risky, and how to shrink both the time and the risk. Estimate it in the RAID calculator.
What happens during a rebuild
When a drive fails, the array reconstructs its contents onto a replacement (or hot spare): mirrors copy from the surviving copy, while parity arrays read all surviving members and recompute the missing data. Until that finishes, the array runs degraded, and single-parity levels run with no redundancy left.
Rebuild time depends mainly on drive capacity and the rebuild rate, which is throttled by ongoing workload. As a rough guide, a large nearline HDD rebuilds at tens of MB/s under load, so a multi-TB drive can take many hours to days. The calculator gives an indicative figure (clearly labelled an estimate).
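The arithmetic behind that rough guide is capacity divided by sustained rebuild rate. The sketch below is illustrative only: the function name `rebuild_hours` and the sample rates are assumptions chosen for the example, not the calculator's actual model, and it assumes the whole drive is rewritten at a constant rate.

```python
def rebuild_hours(capacity_tb: float, rate_mb_s: float) -> float:
    """Hours to rewrite one whole drive at a constant rebuild rate.

    Hypothetical helper: ignores rate changes as the workload varies.
    """
    if capacity_tb <= 0 or rate_mb_s <= 0:
        raise ValueError("capacity and rate must be positive")
    capacity_mb = capacity_tb * 1_000_000  # decimal TB -> MB, as drive vendors count
    return capacity_mb / rate_mb_s / 3600  # seconds -> hours


# Sample rates are assumptions: heavily throttled, moderate, and near-idle.
for rate in (30, 100, 200):
    print(f"16 TB at {rate} MB/s: {rebuild_hours(16, rate):.1f} h")
# 16 TB at 30 MB/s: 148.1 h
# 16 TB at 100 MB/s: 44.4 h
# 16 TB at 200 MB/s: 22.2 h
```

At a throttled 30 MB/s the same drive takes about six days rather than one, which is why the workload during the rebuild matters as much as the capacity.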
Why rebuilds are risky
Two things can go wrong during the window. First, an unrecoverable read error (URE) on a surviving drive: in a degraded single-parity array (RAID 5 / RAIDZ1) there's no redundancy left to reconstruct the unreadable sector, so that's data loss (see "Is RAID 5 dead?"). Second, a second drive failure: the rebuild stresses every drive, and a same-batch sibling may fail too. Dual parity (RAID 6 / RAIDZ2) survives either a URE or one extra drive failure during a single-drive rebuild, though once a second drive has failed it is as exposed to UREs as degraded single parity.
The bigger the drives, the longer the window and the more bits are read, so both risks rise with capacity. This is why dual parity is the default on large-capacity arrays.
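To see how the URE risk scales with bits read, here is a back-of-envelope sketch. It assumes every bit fails independently at the drive's quoted URE rate; the default of one error per 10^14 bits is an assumed, commonly quoted datasheet figure rather than a value from this article, and real drives often do better than their spec. The function name `p_ure_during_rebuild` is hypothetical.

```python
import math


def p_ure_during_rebuild(surviving_drives: int, capacity_tb: float,
                         ure_per_bit: float = 1e-14) -> float:
    """Chance of at least one URE while reading every surviving drive in full.

    Assumption: independent bit errors at the quoted worst-case rate.
    """
    bits_read = surviving_drives * capacity_tb * 1e12 * 8  # decimal TB -> bits
    # 1 - (1 - p)^n, computed via log1p/expm1 to stay accurate for tiny p
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


# Illustrative arrays (assumed sizes): 4-drive and 8-drive single-parity sets.
for drives, tb in ((3, 4), (7, 16)):
    p = p_ure_during_rebuild(drives, tb)
    print(f"{drives} surviving x {tb} TB: {p:.1%} chance of a URE")
```

Under these pessimistic assumptions the probability climbs quickly with total capacity read, which is the reasoning behind choosing dual parity for large drives.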
Shrinking rebuild time and risk
Distributed RAID (Dell ADAPT, HPE distributed RAID) and ZFS resilver shrink the job itself: depending on the implementation, they rebuild only the used data, spread the work across many drives, or both, finishing far faster than a classic rebuild that rewrites an entire drive onto a single replacement. Nesting (RAID 50/60) splits a big pool into smaller, faster-rebuilding groups. A hot spare removes the wait for a human.
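A rough way to see why these techniques help is to scale the simple capacity-over-rate estimate by how full the drive is and by how many drives share the rebuild writes. This is a deliberately crude sketch: `reduced_rebuild_hours`, `used_fraction`, and `parallel_targets` are hypothetical names, and linear scaling with the number of drives is an assumption, not how any particular vendor's implementation behaves.

```python
def reduced_rebuild_hours(capacity_tb: float, rate_mb_s: float,
                          used_fraction: float = 1.0,
                          parallel_targets: int = 1) -> float:
    """Rebuild-time estimate when only used data is rebuilt and the
    writes are spread over several drives.

    Assumptions: constant per-drive rate and perfectly linear scaling.
    """
    if not 0 < used_fraction <= 1:
        raise ValueError("used_fraction must be in (0, 1]")
    if capacity_tb <= 0 or rate_mb_s <= 0 or parallel_targets < 1:
        raise ValueError("capacity, rate and target count must be positive")
    data_mb = capacity_tb * 1_000_000 * used_fraction  # only allocated data
    return data_mb / (rate_mb_s * parallel_targets) / 3600


classic = reduced_rebuild_hours(16, 100)                       # whole drive, one target
used_only = reduced_rebuild_hours(16, 100, used_fraction=0.5)  # half-full pool
spread = reduced_rebuild_hours(16, 100, 0.5, parallel_targets=8)
print(f"classic {classic:.1f} h, used-only {used_only:.1f} h, spread {spread:.1f} h")
```

Even with generous error bars on the scaling assumption, a shorter window means less time exposed to the second-failure and URE risks described earlier.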
The strongest combination on big drives: dual (or triple) parity + distributed rebuild + a hot spare — fast reconstruction with redundancy still in reserve.