Storage Infrastructure · Architecture

Designing a one-petabyte storage server in 2026: a UK reference architecture

Servnet Editorial · Server Infrastructure Practice · 12 min read

A petabyte used to be a data-centre's worth of storage; in 2026 it is a design exercise that fits in a few rack units if you get the architecture right. But assembling a petabyte is not simply a matter of buying enough disks, because the way you reach that capacity, the media you choose, the redundancy scheme and the path from drives to network all decide whether the result is fast, resilient and affordable or merely large. This is a reference architecture for building a one-petabyte storage server in the UK: how to get to a petabyte, the media economics, and the engineering that keeps it serviceable.

[Figure: Reaching a petabyte: head node plus JBOD. A dense head node (Apollo plus first drives) connects over SAS to JBOD shelf 1 and JBOD shelf 2 (SAS expansion) and serves the network over a 25/100GbE front-end.]

Two ways to reach a petabyte

There are broadly two routes to a petabyte. The first is density within a chassis: a storage-dense server that packs as many large drives as possible into a standard-depth box, reaching enormous capacity in a small footprint. The second is expansion: a head node attached to one or more JBOD shelves that add drive bays without adding servers, daisy-chained over SAS so the capacity grows beyond what a single chassis holds.

The two combine in practice. A dense head node, such as a member of the HPE Apollo family, provides the compute and a large first tranche of drives, and JBOD expansion shelves extend it to a petabyte and beyond. The choice of how much to put in the head node versus the shelves is an early architectural decision that affects cost, serviceability and how the capacity scales, so it is worth making deliberately rather than by accident.
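As a rough illustration of that split, the sketch below works out how many drives a usable petabyte needs and how many expansion shelves that implies. The drive size, usable fraction and bay counts are hypothetical placeholders, not figures for any particular chassis, and 1 PB is taken as 1,000 TB (decimal units).

```python
import math


def drives_needed(target_usable_tb: float, drive_tb: float, usable_fraction: float) -> int:
    """Drives required to reach a usable capacity target.

    usable_fraction is the share of raw capacity left after redundancy
    and filesystem overhead (an assumed input, not a measured value).
    """
    raw_needed_tb = target_usable_tb / usable_fraction
    return math.ceil(raw_needed_tb / drive_tb)


def shelves_needed(total_drives: int, head_bays: int, shelf_bays: int) -> int:
    """JBOD shelves required once the head node's own bays are full."""
    remaining = max(0, total_drives - head_bays)
    return math.ceil(remaining / shelf_bays)


# Hypothetical inputs: 20 TB drives, 80% usable after redundancy,
# 24 bays in the head node and 24 bays per JBOD shelf.
drives = drives_needed(target_usable_tb=1000, drive_tb=20, usable_fraction=0.8)
shelves = shelves_needed(drives, head_bays=24, shelf_bays=24)
print(f"{drives} drives: head node plus {shelves} JBOD shelf/shelves")
```

Changing the bay counts shows how quickly the head-node-versus-shelf decision alters the shape of the build, even when the drive count stays the same.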

Media economics: HDD, hybrid or all-QLC

The single biggest lever on the cost and character of a petabyte is the media. High-capacity hard drives remain the cheapest path to bulk capacity and are the right answer for archival, backup and capacity-led data where throughput matters more than latency. An all-flash petabyte, typically built on dense QLC NAND, is far faster and denser but more expensive per terabyte, and is justified where the workload genuinely needs the performance across the whole dataset.

Most petabyte designs land in between, as a hybrid: a flash tier for the hot, latency-sensitive data and the metadata, with the bulk of the capacity on hard drives. This captures most of the performance benefit where it is felt while keeping the cost per usable terabyte close to the HDD floor for the cold majority. The right media mix is a workload decision, and getting it right is where the economics of a petabyte are won or lost.

  • HDD: cheapest bulk capacity, best for archive, backup and capacity-led data
  • All-QLC flash: fastest and densest, justified when the whole dataset needs speed
  • Hybrid: flash for hot data and metadata, HDD for the cold bulk - usually the sweet spot
  • Match the media mix to how the dataset is actually accessed
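One way to reason about the mix is a simple blended cost. The sketch below assumes cost scales linearly with the share of capacity on each medium, which is a simplification, and the relative unit costs are illustrative placeholders rather than quotes.

```python
def blended_relative_cost(flash_fraction: float, flash_cost: float, hdd_cost: float) -> float:
    """Relative cost per usable PB for a hybrid pool.

    flash_fraction is the share of usable capacity held on flash (0.0 to 1.0).
    flash_cost and hdd_cost are relative costs per usable PB for an
    all-flash and an all-HDD build respectively (assumed inputs).
    """
    if not 0.0 <= flash_fraction <= 1.0:
        raise ValueError("flash_fraction must be between 0 and 1")
    return flash_fraction * flash_cost + (1.0 - flash_fraction) * hdd_cost


# Illustrative relative units only: all-HDD = 10, all-flash = 34.
for share in (0.0, 0.1, 0.25, 0.5, 1.0):
    cost = blended_relative_cost(share, flash_cost=34, hdd_cost=10)
    print(f"{share:.0%} flash -> relative cost {cost:.1f}")
```

The useful question is how small the flash share can be while still holding the hot data and metadata, since every point of flash share moves the blended cost away from the HDD floor.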

Redundancy at petabyte scale

Redundancy choices that are fine on a small pool become critical at a petabyte, because the drives are large and rebuilds are correspondingly slow and exposed. Simple single-parity schemes are risky across multi-terabyte drives, since a second failure during a long rebuild can lose data, so petabyte designs lean on double-parity or erasure coding that can tolerate more than one concurrent failure across the set.

Erasure coding, in particular, is the workhorse of large-capacity systems because it delivers resilience with far less capacity overhead than full mirroring, which matters enormously when you are protecting a petabyte. The trade-off is more compute to encode and decode and longer rebuilds, so the redundancy scheme, the drive size and the recovery behaviour have to be designed together. This is the same rebuild-risk discipline that governs drive-size choices throughout dense storage.
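The capacity argument is easy to check with arithmetic. The sketch below compares the usable fraction of a two-way mirror with an erasure-coded layout and estimates a best-case rebuild time. The 8+2 layout, the 20 TB drive and the 200 MB/s sustained rebuild rate are assumptions chosen for illustration, and the rebuild figure is a lower bound that ignores production load.

```python
def usable_fraction(data_shards: int, parity_shards: int) -> float:
    """Share of raw capacity that holds user data in a k+m layout."""
    return data_shards / (data_shards + parity_shards)


def raw_tb_for_usable(usable_tb: float, data_shards: int, parity_shards: int) -> float:
    """Raw capacity needed to deliver a usable target under a k+m layout."""
    return usable_tb / usable_fraction(data_shards, parity_shards)


def min_rebuild_hours(drive_tb: float, rebuild_mb_per_s: float) -> float:
    """Best-case time to rewrite one whole drive at a sustained rate."""
    drive_bytes = drive_tb * 1e12
    seconds = drive_bytes / (rebuild_mb_per_s * 1e6)
    return seconds / 3600


# A two-way mirror behaves like 1 data + 1 redundant copy.
print("mirror raw TB:", raw_tb_for_usable(1000, 1, 1))
# Hypothetical 8 data + 2 parity erasure-coded layout.
print("8+2 raw TB:   ", raw_tb_for_usable(1000, 8, 2))
# Hypothetical 20 TB drive rebuilt at a sustained 200 MB/s.
print(f"rebuild: at least {min_rebuild_hours(20, 200):.1f} hours")
```

Under these assumptions mirroring needs twice the raw capacity of the usable target, while the 8+2 layout needs a quarter more, and a single large drive takes more than a day to rebuild even in the best case.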

The data path: drives to network

A petabyte of drives is useless if the path to them is a bottleneck, so the engineering between disk and network matters as much as the capacity. Drives connect through SAS expanders and host bus adapters in pass-through mode to the storage software, which manages the pool and presents it; the HBA and expander layout has to be sized so it is not the choke point for the aggregate throughput of all those drives.

At the front, the network has to carry the bandwidth the workload demands, which for a petabyte often means multiple high-speed links, 25GbE, 100GbE or faster, sized to the use case rather than the capacity. A large backup target needs steady throughput; a media or analytics tier needs far more. Get the internal path and the external network in proportion to the drives, or the petabyte will feel slow regardless of how much flash you bought.
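A quick way to check the proportions is to put each stage of the path in the same unit and find the smallest. The sketch below does that at nominal line rates, ignoring protocol and encoding overhead, so real figures will be lower. The drive count, per-drive throughput, lane count and link count are hypothetical inputs.

```python
def data_path_stages(
    drive_count: int,
    per_drive_mb_s: float,
    sas_lanes: int,
    sas_lane_gbit_s: float,
    nic_links: int,
    nic_gbit_s: float,
) -> dict[str, float]:
    """Nominal throughput of each stage in GB/s (decimal units)."""
    return {
        "drives": drive_count * per_drive_mb_s / 1000,
        "sas": sas_lanes * sas_lane_gbit_s / 8,
        "network": nic_links * nic_gbit_s / 8,
    }


def bottleneck(stages: dict[str, float]) -> str:
    """Name of the stage with the lowest nominal throughput."""
    return min(stages, key=stages.get)


# Hypothetical build: 63 HDDs at 200 MB/s sequential each,
# 16 SAS lanes at 12 Gbit/s, and two 25GbE front-end links.
stages = data_path_stages(63, 200, 16, 12, 2, 25)
for name, gb_s in stages.items():
    print(f"{name:8s} {gb_s:6.2f} GB/s")
print("bottleneck:", bottleneck(stages))
```

With these inputs the front-end network is the narrowest stage, which is acceptable for a backup target that never asks for more and a problem for an analytics tier that does.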

[Figure: Indicative cost per usable PB by media (relative cost/PB): All-HDD 10, Hybrid 16, All-QLC 34.]

Serviceability and the cost of downtime

At a petabyte, serviceability stops being a nicety and becomes part of the design, because the more drives you have, the more often one will fail, and the architecture has to make that a routine, non-disruptive event. Hot-swap drive bays, redundant power and cooling, and a pool that tolerates and recovers from drive loss without taking the system offline are what keep a petabyte running rather than lurching from outage to outage.

The boot device should sit on its own mirrored pair, separate from the data, so the operating environment is independent of the pool it serves. And because rebuilds at this scale take time, the design has to assume drives will fail during other drives' rebuilds and survive it. A petabyte built without this discipline is not a storage system; it is a large and fragile pile of disks. We engineer the resilience and serviceability together; see our Dell storage options for shared-array alternatives.
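To put rough numbers on "the more drives you have, the more often one will fail", the sketch below estimates expected failures per year and the chance that another drive fails during a rebuild window. It assumes independent failures at a constant rate, which is optimistic because real failures tend to cluster, and the 2% annualised failure rate, drive count and 28-hour rebuild are hypothetical inputs.

```python
HOURS_PER_YEAR = 8760


def expected_failures_per_year(drive_count: int, afr: float) -> float:
    """Expected drive failures per year for a given annualised failure rate."""
    return drive_count * afr


def p_failure_during_rebuild(surviving_drives: int, afr: float, rebuild_hours: float) -> float:
    """Probability that at least one surviving drive fails during a rebuild.

    Assumes independent failures at a constant rate (a simplification).
    """
    p_one = afr * rebuild_hours / HOURS_PER_YEAR
    return 1 - (1 - p_one) ** surviving_drives


# Hypothetical pool: 63 drives, 2% AFR, 28-hour rebuild.
print(f"expected failures/year: {expected_failures_per_year(63, 0.02):.2f}")
print(f"P(another failure during rebuild): {p_failure_during_rebuild(62, 0.02, 28):.3%}")
```

Even under these optimistic assumptions a pool of this size should expect a drive replacement roughly once a year, which is why the failure has to be a routine event rather than an outage.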

Putting it together

A one-petabyte server is a balanced design, not a disk count: reach the capacity through a dense head node plus JBOD expansion, choose a media mix, usually hybrid, that matches how the data is accessed, protect it with double-parity or erasure coding sized against rebuild risk, and proportion the HBA path and network to the aggregate throughput. Build in hot-swap serviceability and a separate mirrored boot so failures are routine. We design petabyte architectures around the HPE Apollo family and the right SSD and NVMe tiers.

Key takeaways
  • Reach a petabyte through a dense head node plus JBOD expansion, deciding the split deliberately.
  • Media is the biggest cost lever: HDD for bulk, all-QLC for speed, hybrid as the usual sweet spot.
  • At petabyte scale use double-parity or erasure coding; single-parity rebuilds across large drives are risky.
  • Proportion the HBA and expander path and the front-end network to the aggregate throughput of all the drives.
  • Design for serviceability - hot-swap bays, redundant power, a separate mirrored boot, and rebuild-tolerant pools.

FAQs — Designing a one-petabyte storage server in 2026

Reaching a petabyte

How do you build a one-petabyte storage server?

Through a combination of a storage-dense head node and JBOD expansion shelves that add drive bays over SAS, then a media mix matched to the workload. The split between head node and shelves is an early architectural choice. We design petabyte builds around the HPE Apollo family and the right drive tiers.

Should a petabyte server use HDDs or flash?

Usually a hybrid: high-capacity HDDs for the cold bulk, a flash tier for hot data and metadata. All-HDD is cheapest for archive and backup; all-QLC flash is fastest where the whole dataset needs performance. The mix is a workload decision we make with our SSD and NVMe guidance.

Resilience and performance

What redundancy should a petabyte use?

Double-parity or erasure coding, not single parity, because large drives make rebuilds slow and a second failure during a long rebuild can lose data. Erasure coding gives resilience with far less overhead than mirroring, at the cost of more compute. We design the scheme, drive size and recovery together.

What stops a petabyte server from being slow?

Proportioning the internal path and the network to the drives: HBAs and SAS expanders sized so they are not the choke point, and multiple high-speed front-end links matched to the workload. A petabyte feels slow if the data path is a bottleneck regardless of capacity. For shared alternatives see our Dell storage options.
