NVMe-oF in the data centre: disaggregating flash from compute (UK 2026)

Servnet Editorial · Server Infrastructure Practice · 12 min read

For years the fastest storage in a server lived inside that server, attached to its PCIe bus and stranded whenever the box was idle or already saturated. NVMe over Fabrics (NVMe-oF) breaks that link. It lets a pool of flash sit in its own enclosure and present drives to many hosts across a network at something very close to local latency, so capacity and performance can be allocated where the work actually is rather than where the drives happen to be installed. For UK teams building dense virtualisation, analytics or AI estates, disaggregating flash is one of the more consequential architecture shifts of the decade, and it is worth understanding before you buy your next tray of NVMe.

Figure: Disaggregated flash, target serving initiators. Host A and Host B (initiators) connect over the NVMe fabric (lossless fabric, RoCE / TCP) to the NVMe target (flash enclosure).

What NVMe-oF actually does

NVMe-oF carries the NVMe command set over a network instead of over the local PCIe bus, so a remote drive behaves to the operating system almost exactly like a local one. The protocol was designed to preserve the parallel, multi-queue, low-overhead character of NVMe, which is why a well-built fabric adds only single-digit microseconds of latency over a direct attachment (NVMe/TCP typically adds somewhat more), rather than the much larger overhead a traditional SCSI-based SAN block protocol would impose.

The practical effect is disaggregation: flash lives in a target enclosure, hosts become initiators, and the network in between is fast and lossless enough that the distance stops mattering. You stop sizing every server for its peak storage need and instead size a shared flash pool once, then carve it up. That is the same logic that made shared storage attractive in the first place, except now it is fast enough for workloads that previously demanded local NVMe.
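To make the initiator and target roles concrete, here is a minimal sketch of how a Linux host might be pointed at a target using the nvme-cli `connect` subcommand. The helper only builds the command line and does not run it. The address `192.0.2.10` and the subsystem NQN are hypothetical placeholders, port 4420 is the conventional NVMe-oF service port, and the helper name `nvme_connect_argv` is ours, not part of any tool.

```python
from typing import List


def nvme_connect_argv(transport: str, target_addr: str, subsystem_nqn: str,
                      service_id: str = "4420") -> List[str]:
    """Build (but do not execute) an nvme-cli 'connect' command line.

    Only the IP-based transports are handled here; Fibre Channel uses a
    different addressing scheme and is left out of this sketch.
    """
    if transport not in {"tcp", "rdma"}:
        raise ValueError(f"unsupported transport in this sketch: {transport}")
    return [
        "nvme", "connect",
        "--transport", transport,     # tcp for NVMe/TCP, rdma for RoCE
        "--traddr", target_addr,      # IP address of the target port
        "--trsvcid", service_id,      # service port on the target
        "--nqn", subsystem_nqn,       # name of the subsystem to attach
    ]


# Hypothetical target address and subsystem name, for illustration only.
example = nvme_connect_argv("tcp", "192.0.2.10",
                            "nqn.2026-01.com.example:flash-pool-01")
print(" ".join(example))
```

Once a command like this succeeds on a real host, the remote namespaces appear as ordinary NVMe block devices, which is what "behaves like a local drive" means in practice.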

Three transports: RoCE, TCP and Fibre Channel

There are three mainstream ways to carry NVMe-oF, and the choice shapes cost, latency and operational skill. RoCE (RDMA over Converged Ethernet) gives the lowest latency by letting the NIC move data directly into host memory, but it needs a carefully configured lossless Ethernet fabric with priority flow control, which is real network engineering. It is the choice when latency is the whole point.

NVMe/TCP runs over ordinary Ethernet with no special fabric tuning, which makes it by far the easiest to deploy on hardware you already own. It costs a little more latency than RoCE but is good enough for a large share of workloads, and it has become the pragmatic default for teams who want disaggregation without rebuilding their network. NVMe over Fibre Channel suits shops with an existing FC SAN investment and the discipline that goes with it, reusing the fabric they already trust.

  • RoCE: lowest latency, needs a lossless tuned Ethernet fabric and the skills to run it
  • NVMe/TCP: runs on standard Ethernet, easiest to adopt, slightly higher latency
  • NVMe/FC: reuses an existing Fibre Channel SAN and its operational discipline
  • Pick the transport from your latency target and the network skills you actually have
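The selection logic in the list above can be written down as a small decision helper. This is a sketch of one reasonable ordering of the questions, not a rule: the three yes/no inputs and the function name `pick_transport` are our own simplification, and a real decision would also weigh cost and existing hardware.

```python
def pick_transport(latency_critical: bool,
                   can_run_lossless_ethernet: bool,
                   has_fc_san: bool) -> str:
    """Return a suggested NVMe-oF transport from three yes/no questions.

    Assumed ordering (illustrative only):
    1. RoCE when latency is the whole point AND the team can run a
       lossless Ethernet fabric.
    2. NVMe/FC when a Fibre Channel SAN is already in place.
    3. NVMe/TCP otherwise, as the pragmatic default.
    """
    if latency_critical and can_run_lossless_ethernet:
        return "RoCE"
    if has_fc_san:
        return "NVMe/FC"
    return "NVMe/TCP"


print(pick_transport(latency_critical=True,
                     can_run_lossless_ethernet=True,
                     has_fc_san=False))
print(pick_transport(latency_critical=False,
                     can_run_lossless_ethernet=False,
                     has_fc_san=True))
```

Note that a latency-critical workload without the skills to run a lossless fabric falls through to NVMe/TCP here, which reflects the point that a badly tuned RoCE fabric is worse than a simple one.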

The hardware behind a target

A disaggregated flash target is, at heart, a dense NVMe enclosure with fast network ports and enough controller capability to serve the drives without becoming the bottleneck. That means high-lane-count PCIe to fan out to many drives, dual high-speed NICs sized to the aggregate drive bandwidth, and a controller or HBA path chosen for pass-through rather than legacy RAID, since the resilience usually lives in the storage software above. Our host bus adapters range covers the controller side of that build.
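Sizing NICs to the aggregate drive bandwidth is simple arithmetic, sketched below. The drive count, per-drive throughput, port speed and the 90% usable fraction are illustrative assumptions, not figures for any particular product; substitute measured numbers for your own media and workload.

```python
import math


def nic_ports_needed(drive_count: int,
                     per_drive_gbytes_per_s: float,
                     port_gbits_per_s: float,
                     usable_fraction: float = 0.9) -> int:
    """Ports required so the network is not the bottleneck at full drive rate.

    usable_fraction is an assumed allowance for protocol overhead.
    """
    # Drives are rated in gigabytes per second, ports in gigabits per second.
    aggregate_gbits = drive_count * per_drive_gbytes_per_s * 8
    usable_per_port = port_gbits_per_s * usable_fraction
    return math.ceil(aggregate_gbits / usable_per_port)


# Illustrative case: 24 drives at 3 GB/s each, served over 100 Gbit/s ports.
# 24 * 3 * 8 = 576 Gbit/s aggregate; 576 / 90 = 6.4, rounded up to 7 ports.
print(nic_ports_needed(24, 3.0, 100.0))
```

In practice few pools run every drive flat out at once, so the result is an upper bound; the gap between it and the ports you actually fit is a deliberate oversubscription choice.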

Drive selection matters as much as the enclosure. A target that serves many initiators sees a blended, often write-heavy profile, so endurance class and consistent latency under load matter more than headline sequential numbers. We size enclosures, fabrics and media together rather than in isolation; the flash itself comes from our SSD and NVMe range, matched to the read/write mix the pool will actually carry.
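Endurance class can be checked with a quick drive-writes-per-day (DWPD) estimate. The sketch below assumes writes spread evenly across the drives and folds any extra writes from data protection or garbage collection into a single `write_amplification` factor; all input values are illustrative.

```python
def required_dwpd(host_writes_tb_per_day: float,
                  drive_count: int,
                  drive_capacity_tb: float,
                  write_amplification: float = 1.0) -> float:
    """Drive writes per day each drive must sustain for the given write load.

    Assumes writes are spread evenly across all drives in the pool.
    """
    pool_capacity_tb = drive_count * drive_capacity_tb
    physical_writes_tb = host_writes_tb_per_day * write_amplification
    return physical_writes_tb / pool_capacity_tb


# Illustrative case: 100 TB written per day to 24 drives of 7.68 TB each.
dwpd = required_dwpd(100.0, 24, 7.68)
print(f"required endurance: {dwpd:.2f} DWPD")
```

Compare the result with the DWPD rating of the drive class you are considering, and rerun it with a higher `write_amplification` if the storage software above the drives mirrors or erasure-codes its writes.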

Where disaggregation pays and where it does not

Disaggregation pays when storage need and compute need grow at different rates, when you want to drive utilisation up by sharing an expensive flash pool across many hosts, or when diskless or thin-provisioned hosts simplify your fleet. Analytics clusters, large virtualisation estates and AI pipelines that must keep accelerators fed are natural fits, and the pattern sits comfortably alongside dense storage platforms such as those we build on HPE Apollo.
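The utilisation argument can be shown with a toy calculation: sizing every host for its own peak versus sizing one pool for the peak of the combined demand. The host names and demand figures below are invented for illustration.

```python
from typing import Dict, List


def per_host_sizing(demand: Dict[str, List[float]]) -> float:
    """Total flash needed if every host is sized for its own peak."""
    return sum(max(series) for series in demand.values())


def pooled_sizing(demand: Dict[str, List[float]]) -> float:
    """Flash needed if one shared pool is sized for the combined peak."""
    return max(sum(step) for step in zip(*demand.values()))


# Hypothetical demand in TB for three hosts over three time periods.
demand_tb = {
    "host-a": [2, 8, 3],
    "host-b": [7, 2, 2],
    "host-c": [1, 3, 9],
}

local = per_host_sizing(demand_tb)   # 8 + 7 + 9 = 24
pooled = pooled_sizing(demand_tb)    # combined demand is 10, 13, 14, so 14
print(f"per-host: {local} TB, pooled: {pooled} TB, saved: {local - pooled} TB")
```

The saving comes entirely from peaks that do not coincide. If every host peaks at the same moment the two numbers are equal, which is one way to recognise a workload where disaggregation will not pay.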

It does not pay when a workload is small, self-contained and happy with the local NVMe it already has; adding a fabric there is complexity for no gain. It also does not pay if you cannot commit to running the network properly, because a poorly tuned lossless fabric will undo the latency advantage that justified the whole exercise. Disaggregation is a deliberate architecture choice, not a default upgrade.

NVMe-oF transports compared
Criterion        RoCE               NVMe/TCP            NVMe/FC
Latency          Lowest             Low                 Low
Fabric needed    Lossless tuned     Standard Ethernet   Fibre Channel
Ease of deploy   Network skill      Easiest             Reuse SAN
Best for         Latency-critical   Most workloads      Existing FC shops

Resilience and the network as a dependency

When flash leaves the server, the network becomes a storage dependency, which changes how you think about resilience. Dual fabrics, multipath from initiator to target, and redundant NICs and ports stop being nice-to-haves and become the difference between a degraded path and an outage. The target enclosure itself needs the usual dual power and redundant components, but the bigger shift is treating the fabric with the same seriousness you would treat a SAN.

Right-sized, that dependency is a feature rather than a risk: multipath and a healthy fabric let you lose a NIC, a switch or a cable and keep serving. The design work is in sizing the fabric for the aggregate bandwidth of the pool plus headroom, and in making sure congestion control is configured so one noisy host cannot starve the rest. We design both the enclosure and the fabric in our server configuration service.
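One way to test a design on paper is to list each initiator-to-target path as the set of components it depends on, then check that no single component failure removes every path. The sketch below does that; the component names are hypothetical and the model ignores shared power, firmware and configuration faults.

```python
from typing import Dict, FrozenSet, List, Set


def surviving_paths(paths: Dict[str, FrozenSet[str]],
                    failed: Set[str]) -> List[str]:
    """Names of paths that use none of the failed components."""
    return [name for name, parts in paths.items() if not (parts & failed)]


def single_points_of_failure(paths: Dict[str, FrozenSet[str]]) -> List[str]:
    """Components whose failure alone leaves no working path."""
    components = set().union(*paths.values())
    return sorted(c for c in components if not surviving_paths(paths, {c}))


# Dual fabric: the two paths share nothing between host and target.
dual_fabric = {
    "path-a": frozenset({"host-nic-1", "switch-a", "target-port-1"}),
    "path-b": frozenset({"host-nic-2", "switch-b", "target-port-2"}),
}

# Same switches, but both paths leave the host through one NIC.
shared_nic = {
    "path-a": frozenset({"host-nic-1", "switch-a", "target-port-1"}),
    "path-b": frozenset({"host-nic-1", "switch-b", "target-port-2"}),
}

print(single_points_of_failure(dual_fabric))  # no shared component
print(single_points_of_failure(shared_nic))   # the shared NIC is reported
```

The second case is the common mistake: two switches look redundant on a diagram, but a path list makes the shared NIC obvious.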

Putting it together

If you are weighing disaggregated flash, start from the workload and the network, not the enclosure. Decide whether your latency target needs RoCE or whether NVMe/TCP on the Ethernet you already run is enough, size the target for aggregate bandwidth and a realistic write mix, and build the fabric for redundancy from the outset. For the upstream architecture question of whether to grow by adding nodes or scaling a single system, read scale-out vs scale-up storage, and for the media-economics side of filling these enclosures see HDD vs QLC vs TLC tiering.

Key takeaways
  • NVMe-oF carries the NVMe command set over a network, giving remote flash near-local latency.
  • RoCE is lowest-latency but needs a lossless tuned fabric; NVMe/TCP is the easiest to adopt.
  • A good target is a dense NVMe enclosure with fast NICs, pass-through controllers and the right endurance class.
  • Disaggregation pays when storage and compute grow apart or you need to raise flash utilisation.
  • When flash leaves the server the network becomes a storage dependency: design dual fabrics and multipath.

FAQs — NVMe-oF in the data centre

Transports

Should I use RoCE or NVMe/TCP?

RoCE gives the lowest latency but needs a lossless Ethernet fabric with priority flow control and the skills to run it. NVMe/TCP works on ordinary Ethernet with no special tuning and is good enough for most workloads, which is why it is the pragmatic default. We match the transport to your latency target in server configuration.

Does NVMe-oF add much latency over local NVMe?

A well-built fabric adds only single-digit microseconds over a direct PCIe attachment, far less than a traditional block SAN. That is the whole point of NVMe-oF: it preserves the low-overhead character of NVMe across a network, so a remote drive behaves almost like a local one to the host.

When to use it

When is disaggregated flash worth the complexity?

When storage and compute need grow at different rates, when sharing an expensive flash pool raises utilisation, or when diskless hosts simplify the fleet. It is not worth it for small self-contained workloads happy with local NVMe, or if you cannot commit to running the fabric properly. We design the enclosure and fabric on dense storage hardware.
