Servnet Research · Data Study

AI Servers: The 2026 Data Study

What GPU/AI servers are, how they are built, what they cost, and how to buy them — a cited, UK-focused reference.

Last updated 5 July 2026 · every figure is sourced and dated · specifications verified against vendor datasheets.

~10.2 kW

Power drawn by one air-cooled 8× H100 server (NVIDIA DGX H100, 8U)^[src]

~120 kW

Power per GB200 NVL72 rack — 100% liquid-cooled, beyond any air-cooled rack^[src]

1.8 TB/s

Per-GPU NVLink bandwidth on Blackwell (5th-gen) — 2× Hopper’s 900 GB/s^[src]

~$6.7tn

Global data-centre capex needed by 2030 to meet compute demand (McKinsey)^[src]

Architecture

What an AI server actually is

Almost every high-end AI server is built around NVIDIA’s HGX 8-GPU SXM baseboard: eight GPUs on one carrier, wired all-to-all by NVLink and NVSwitch rather than PCIe, so they behave like one large accelerator.^[src] SXM is what enables that mesh; cheaper PCIe add-in cards are used for inference and smaller deployments but can only bridge GPUs in pairs.^[src] Around the GPUs sits a dual-socket CPU host with 1–2 TB of RAM,^[src] and a high-speed east-west fabric — InfiniBand (deterministic latency, in-network reductions) or 400/800GbE Ethernet (broader ecosystem) — to scale training across many nodes.

NVLink bandwidth is the defining metric of the platform, and it has doubled with Blackwell, from 900 GB/s per GPU on Hopper to 1.8 TB/s.

Source: NVIDIA NVLink & NVLink Switch.^[src] Blackwell reaches 1.8 TB/s per GPU via 18 links × 100 GB/s.
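The per-GPU figure is simply links multiplied by per-link bandwidth. The sketch below reproduces that arithmetic. The Blackwell breakdown (18 links × 100 GB/s) comes from the caption above; the Hopper breakdown (18 links × 50 GB/s) is an assumption added here for comparison, and the dictionary and function names are illustrative only.

```python
# Per-GPU NVLink bandwidth = number of links x bandwidth per link.
# Blackwell's 18 x 100 GB/s is taken from the text; the Hopper split
# (18 x 50 GB/s) is an assumption added for comparison.
NVLINK_GENERATIONS = {
    "Hopper (4th-gen)": {"links": 18, "gb_per_s_per_link": 50},
    "Blackwell (5th-gen)": {"links": 18, "gb_per_s_per_link": 100},
}


def per_gpu_bandwidth_gb_s(links: int, gb_per_s_per_link: int) -> int:
    """Return total per-GPU NVLink bandwidth in GB/s."""
    return links * gb_per_s_per_link


for name, spec in NVLINK_GENERATIONS.items():
    total = per_gpu_bandwidth_gb_s(**spec)
    print(f"{name}: {total} GB/s ({total / 1000:.1f} TB/s)")
```

Both generations use the same link count in this sketch, so the doubling comes entirely from the per-link rate.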

Workloads

Training vs inference — one box can’t do both well

Training

Big, synchronised GPU clusters advancing in lockstep. Wants maximum HBM capacity, top FP8/FP4 compute, the full 8-GPU NVLink mesh and 400/800G scale-out for all-reduce. Runs hot and near-continuously — the case for liquid cooling and owned/colo hardware.

Inference / serving

Latency-bound and often memory-bandwidth-bound: token throughput tracks HBM bandwidth, not raw FLOPS. Can run on lower-power PCIe GPUs, smaller nodes and commodity Ethernet, distributed closer to users. Inference accounts for the highest cumulative volume of AI compute.

Workload/infrastructure split per industry analysis of training vs inference server design.
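To see why token throughput tracks HBM bandwidth, a rough ceiling can be worked out by assuming that generating each token requires reading every model weight from HBM once. The sketch below applies that simplification to the bandwidth figures quoted in this study. The 70-billion-parameter model at one byte per parameter (FP8) is a hypothetical example, and the estimate ignores batching, KV-cache traffic and compute limits, so real throughput will differ.

```python
# Bandwidth-bound ceiling for single-stream decoding:
#   tokens/s <= HBM bandwidth / bytes of weights read per token.
# Assumption: each generated token reads all weights once (batch size 1).

HBM_BANDWIDTH_TB_S = {  # per-GPU figures from this study's spec table
    "H100 SXM5": 3.35,
    "H200 SXM": 4.8,
    "B200 SXM": 8.0,
}


def decode_ceiling_tokens_per_s(
    hbm_tb_per_s: float, params_billion: float, bytes_per_param: float
) -> float:
    """Upper bound on tokens/s when decoding is limited by weight reads."""
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return hbm_tb_per_s * 1e12 / weight_bytes


# Hypothetical model: 70B parameters stored at 1 byte each (FP8).
for gpu, bandwidth in HBM_BANDWIDTH_TB_S.items():
    ceiling = decode_ceiling_tokens_per_s(bandwidth, params_billion=70, bytes_per_param=1)
    print(f"{gpu}: at most ~{ceiling:.0f} tokens/s per stream")
```

Under these assumptions the ceiling scales linearly with bandwidth, which is why a higher-bandwidth part helps memory-bound serving even when its peak FLOPS are similar.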

The GPUs

Data-centre GPU spec comparison

Every figure is from the manufacturer’s datasheet. These are static hardware specs (no pricing), so they don’t go stale.

GPU	Architecture	Memory	Memory bandwidth	TDP	Interconnect
NVIDIA H100 SXM5^[src]	Hopper	80 GB HBM3	3.35 TB/s	up to 700 W	NVLink 4th-gen 900 GB/s
NVIDIA H100 PCIe^[src]	Hopper	80 GB HBM2e	2.0 TB/s	350 W	NVLink bridge 900 GB/s (2-GPU)
NVIDIA H200 SXM^[src]	Hopper	141 GB HBM3e	4.8 TB/s	up to 700 W	NVLink 4th-gen 900 GB/s
NVIDIA B200 SXM^[src]	Blackwell	180 GB HBM3e (DGX/HGX)	8 TB/s	~1,000 W (config. max)	NVLink 5th-gen 1.8 TB/s
NVIDIA GB200 (superchip)^[src]	Grace + 2× Blackwell	372 GB HBM3e + 480 GB LPDDR5X	16 TB/s (HBM)	superchip	NVLink 5th-gen 3.6 TB/s per superchip
NVIDIA L40S^[src]	Ada Lovelace	48 GB GDDR6 (ECC)	864 GB/s	350 W	No NVLink; PCIe Gen4 x16
NVIDIA A100 80 GB SXM^[src]	Ampere	80 GB HBM2e	2.04 TB/s	400 W	NVLink 3rd-gen 600 GB/s
AMD Instinct MI300X^[src]	CDNA 3	192 GB HBM3	5.3 TB/s	750 W (TBP)	Infinity Fabric; PCIe Gen5 x16
AMD Instinct MI325X^[src]	CDNA 3	256 GB HBM3e	6.0 TB/s	1,000 W (TBP)	Infinity Fabric; PCIe Gen5

Notes:
H100 PCIe: the NVLink bridge is 900 GB/s (reuses A100 bridges); the 600 GB/s figure belongs to the dual-GPU H100 NVL card.
B200 SXM: 192 GB is the raw dual-die capacity; DGX/HGX B200 configures 180 GB/GPU. TDP is a configurable maximum.
L40S: no NVLink; an inference/graphics part, not a training GPU.
MI300X: the leading non-NVIDIA alternative.

The landscape

The AI-server platforms

Model	Vendor	GPUs	Form factor	Cooling
PowerEdge XE9680^[src] (Servnet supplies)	Dell	8× HGX H100/H200 SXM (or MI300X / Gaudi3 OAM)	6U	Air
PowerEdge XE9640^[src] (Servnet supplies)	Dell	4× H100 SXM (NVLink) or Intel Max OAM	2U	Direct liquid (facility water)
PowerEdge XE8640^[src] (Servnet supplies)	Dell	4× HGX H100 SXM5	4U	Closed-loop liquid + fans
ProLiant DL380a Gen12^[src] (Servnet supplies)	HPE	up to 8× double-wide PCIe (RTX PRO 6000, H200 NVL, H100 NVL, L40S)	4U	Air or direct liquid
Cray XD670^[src] (Servnet supplies)	HPE	8× HGX H100/H200 SXM5	5U	Air, with liquid option
ThinkSystem SR675 V3^[src] (Servnet supplies)	Lenovo	up to 8× PCIe, or 4× HGX H200 SXM (NVLink)	3U	Neptune hybrid (HGX variant)
ThinkSystem SR780a V3^[src] (Servnet supplies)	Lenovo	8× HGX H100/H200/B200 (NVLink 900 GB/s)	5U	Neptune direct liquid + air
SYS-821GE-TNHR^[src]	Supermicro	8× HGX H100/H200 (NVLink + NVSwitch)	8U	Air (liquid variants exist)
DGX H100 / H200^[src]	NVIDIA	8× H100 (640 GB) / H200 (1,128 GB) SXM	8U	Air
DGX B200^[src]	NVIDIA	8× Blackwell (1,440 GB HBM3e)	10U	Air
GB200 NVL72^[src]	NVIDIA	72× Blackwell + 36 Grace (rack-scale)	Rack	100% liquid
G593-SD0^[src]	Gigabyte	8× HGX H100 SXM5	5U	Air
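The aggregate memory figures quoted for the DGX systems follow directly from the per-GPU capacities in the GPU spec table. The sketch below is a quick consistency check: it multiplies per-GPU HBM by GPU count, and derives a rack-level total for GB200 NVL72 from the 372 GB per-superchip figure. The function names are illustrative, and the NVL72 total is a derived value under the assumption that all 36 superchips carry the full 372 GB.

```python
# Aggregate HBM per system = per-GPU (or per-superchip) capacity x count.
# Capacities are the figures quoted in this study's tables.

PER_GPU_HBM_GB = {
    "DGX H100": 80,   # H100 SXM5
    "DGX H200": 141,  # H200 SXM
    "DGX B200": 180,  # B200 as configured in DGX/HGX
}
GPUS_PER_NODE = 8

GB200_SUPERCHIP_HBM_GB = 372  # one Grace + 2x Blackwell superchip
NVL72_SUPERCHIPS = 36         # 72 Blackwell GPUs / 2 per superchip


def node_hbm_gb(per_gpu_gb: int, gpus: int = GPUS_PER_NODE) -> int:
    """Total HBM in one multi-GPU node, in GB."""
    return per_gpu_gb * gpus


def nvl72_hbm_gb() -> int:
    """Derived total HBM across a GB200 NVL72 rack, in GB."""
    return GB200_SUPERCHIP_HBM_GB * NVL72_SUPERCHIPS


for system, per_gpu in PER_GPU_HBM_GB.items():
    print(f"{system}: {node_hbm_gb(per_gpu):,} GB")
print(f"GB200 NVL72 (derived): {nvl72_hbm_gb():,} GB")
```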

Servnet supplies and configures the Dell, HPE and Lenovo platforms above — including the 8-GPU Lenovo ThinkSystem SR780a V3. Build one to your spec in the Lenovo configurator, or explore the ThinkSystem range.

Facilities

Power & cooling: why AI breaks the rack

The single biggest practical shock for buyers is power density. A traditional enterprise rack is provisioned for perhaps 6–15 kW. A single 8-GPU AI node can consume that on its own, and a rack-scale Blackwell system is an order of magnitude beyond it — which is why NVIDIA’s Blackwell generation is designed around direct-liquid cooling.^[src]

Sources: NVIDIA DGX H100 (~10.2 kW/8U node)^[src]; NVIDIA GB200 NVL72 (~120 kW rack, 100% liquid-cooled)^[src]; B200 defined at 1,000 W air / 1,200 W liquid per GPU.^[src]

Practical read: many Hopper (H100/H200) servers still run air-cooled and fit high-density enterprise or colo racks; Blackwell largely does not. Plan power, cooling and rack space before the hardware.
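A quick way to plan is to check how many nodes a rack can hold once both power and space are considered. The sketch below uses the ~10.2 kW and 8U figures for DGX H100 and the ~120 kW figure for GB200 NVL72 from this study. The 42U rack height and the list of rack power budgets are assumptions for illustration, and the space check ignores switches, PDUs and cooling headroom.

```python
import math

DGX_H100_NODE_KW = 10.2      # ~10.2 kW per DGX H100 node (from the text)
DGX_H100_NODE_U = 8          # 8U chassis (from the text)
GB200_NVL72_RACK_KW = 120.0  # ~120 kW per GB200 NVL72 rack (from the text)
RACK_HEIGHT_U = 42           # assumption: a standard 42U rack


def nodes_by_power(rack_budget_kw: float, node_kw: float = DGX_H100_NODE_KW) -> int:
    """Whole nodes that fit inside a rack's power budget."""
    return math.floor(rack_budget_kw / node_kw)


def nodes_by_space(rack_u: int = RACK_HEIGHT_U, node_u: int = DGX_H100_NODE_U) -> int:
    """Whole nodes that fit in the rack by height alone."""
    return rack_u // node_u


# Hypothetical rack power budgets in kW, spanning the ranges quoted above.
for budget_kw in (6, 15, 25, 40):
    fit = min(nodes_by_power(budget_kw), nodes_by_space())
    print(f"{budget_kw} kW rack: {fit} x 8-GPU node(s)")

# How far a GB200 NVL72 rack sits above a traditional 6-15 kW rack.
for budget_kw in (6, 15):
    print(f"NVL72 vs {budget_kw} kW rack: {GB200_NVL72_RACK_KW / budget_kw:.0f}x")
```

In this sketch power, not rack height, is the binding constraint at every budget shown: a 42U rack has room for five 8U nodes, but that would need roughly 51 kW.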

Economics

What it costs: own vs cloud

Acquisition prices for AI servers and data-centre GPUs are almost entirely quote-gated and volatile, so we don’t publish a headline “street price” — anyone who does is guessing. What is knowable is the shape of the decision and the current cloud rental rates.

Indicative cloud GPU rental — on-demand, mid-2025

~$4.1 /GPU-hr

AWS (p5.48xlarge, H100)

~$3.00 /GPU-hr

Google Cloud (A3-high, H100)

~$6.98 /GPU-hr

Microsoft Azure (NC H100 v5)

AWS cut on-demand NVIDIA GPU EC2 pricing by up to ~45% (44% off P5/H100), effective 1 June 2025.^[src] On-demand list rates move frequently and vary by region — treat as indicative and verify live before deciding.

The durable rule: rent for short, bursty or uncertain workloads; own (or colocate) when utilisation is high and sustained over years. At continuous use, on-demand rental of an H100 adds up to roughly its purchase price within a year, so the more steadily you run GPUs, the more owning pays. The exact break-even depends on your utilisation, GPU choice and current cloud pricing. Model your own case in the AI GPU calculator (it carries its own cited cloud-vs-own comparison) and, if you buy, finance the capex over 2–5 years with the IT finance calculator.
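The shape of the rent-versus-own decision can be sketched with simple arithmetic. The example below annualises the indicative on-demand rates listed above and estimates how many months of rental would equal a purchase price at a given utilisation. The purchase price used is a placeholder assumption, not a quoted or sourced figure, and the model deliberately ignores power, cooling, colocation, staffing and finance costs, all of which push the real break-even later.

```python
HOURS_PER_YEAR = 8760

# Indicative on-demand rates from this study (USD per GPU-hour, mid-2025).
CLOUD_RATE_USD_PER_GPU_HR = {
    "AWS p5.48xlarge": 4.10,
    "Google Cloud A3-high": 3.00,
    "Azure NC H100 v5": 6.98,
}

# Placeholder only: this study publishes no hardware prices.
HYPOTHETICAL_GPU_PRICE_USD = 30_000


def annual_rental_usd(rate_per_hr: float, utilisation: float = 1.0) -> float:
    """Yearly rental cost for one GPU at a given utilisation (0-1)."""
    return rate_per_hr * HOURS_PER_YEAR * utilisation


def break_even_months(purchase_usd: float, rate_per_hr: float, utilisation: float) -> float:
    """Months of rental needed to equal the purchase price (hardware only)."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    monthly_rental = annual_rental_usd(rate_per_hr, utilisation) / 12
    return purchase_usd / monthly_rental


for provider, rate in CLOUD_RATE_USD_PER_GPU_HR.items():
    yearly = annual_rental_usd(rate)
    months_full = break_even_months(HYPOTHETICAL_GPU_PRICE_USD, rate, 1.0)
    months_quarter = break_even_months(HYPOTHETICAL_GPU_PRICE_USD, rate, 0.25)
    print(
        f"{provider}: ${yearly:,.0f}/yr at 100% use; "
        f"break-even ~{months_full:.1f} months (100%) vs ~{months_quarter:.1f} months (25%)"
    )
```

Halving utilisation doubles the break-even time in this model, which is the arithmetic behind renting for bursty work and owning for steady work.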

Context

The market backdrop

~$6.7 trillion

Global data-centre capex needed by 2030 to meet compute demand (~$5.2tn of it AI-capable)^[src]

~219 GW

Projected global data-centre capacity by 2030 (nearly 3×), ~70% of new demand from AI^[src]

61%

Western-European CIOs who say geopolitics will raise reliance on local/regional cloud providers^[src]

>75%

Enterprises outside the US expected to have a digital-sovereignty strategy by 2030 (Gartner)^[src]

Figures from named analysts (McKinsey, Gartner). Enterprise “AI repatriation” to on-prem/colo is a real trend but is largely evidenced by vendor-sponsored surveys, so we cite only the named-analyst anchors here and treat the sponsored numbers as directional.

FAQ

AI servers — common questions

What is an AI server?

An AI server is a GPU-dense server built to train or run AI models. Most are built on NVIDIA’s HGX 8-GPU SXM baseboard, where eight GPUs are linked all-to-all by NVLink and NVSwitch (far faster than PCIe). A CPU host, large system memory and high-speed east-west networking (InfiniBand or 400/800GbE Ethernet) surround the GPUs. Examples include the Dell PowerEdge XE9680, HPE Cray XD670, Lenovo ThinkSystem SR780a V3 and NVIDIA DGX.

How much power does an AI server use?

A lot. One air-cooled 8× H100 server (NVIDIA DGX H100, 8U) draws about 10.2 kW, so a single node needs more power than many UK enterprise racks are provisioned for. NVIDIA’s Blackwell B200 is specified at 1,000 W air-cooled or 1,200 W liquid-cooled per GPU, and a rack-scale GB200 NVL72 draws roughly 120 kW and must be liquid-cooled.

Which GPU is best for AI — H100, H200 or B200?

It depends on the workload. H100 (80 GB HBM3) is the established training GPU; H200 raises that to 141 GB HBM3e and 4.8 TB/s of bandwidth, which helps memory-bound inference; B200 (Blackwell, 180 GB HBM3e in DGX/HGX, 8 TB/s) is the current top training part. For lighter inference, PCIe parts like the L40S (48 GB, no NVLink) are cheaper and lower-power. AMD’s Instinct MI300X (192 GB) and MI325X (256 GB) are the main non-NVIDIA alternatives.

Is it cheaper to buy an AI server or rent GPUs in the cloud?

Renting is cheaper for short, bursty or uncertain workloads; owning wins at sustained high utilisation over multiple years. The tipping point depends on your utilisation, GPU choice and cloud rates (which move — AWS cut on-demand GPU pricing by up to ~45% in June 2025). The durable rule is: the higher and steadier your GPU utilisation, the more owning or colocating pays. Model your own case with our AI GPU calculator.

Why do AI servers need liquid cooling?

Because air can only remove so much heat from a rack, typically up to ~8–25 kW. A rack filled with Blackwell nodes, or a GB200 NVL72 rack (~120 kW), is far past that ceiling, so NVIDIA’s Blackwell generation is designed around direct-liquid cooling (DLC). Many Hopper (H100/H200) servers can still run air-cooled, which is why they remain practical for enterprises without liquid-ready facilities.

Do I need special data-centre facilities for AI servers?

Usually yes. Rack power density is the first constraint — one 8-GPU node can exceed a whole traditional rack’s power budget, and Blackwell adds liquid-cooling plumbing. Options are upgrading your own facility, using a colocation provider with high-density/liquid-ready halls, or renting cloud GPUs. Lead times and GPU availability are also real planning factors.

Transparency

Methodology, sources & how to cite

Hardware specifications are taken directly from manufacturer datasheets (NVIDIA, AMD) and OEM product pages (Dell, HPE, Lenovo, Supermicro, NVIDIA), and were cross-checked in an adversarial fact-check pass — during which one figure (the H100 PCIe NVLink bridge) was corrected to 900 GB/s. Market figures are attributed to named analysts (McKinsey, Gartner). Cloud prices are dated and labelled indicative because they change frequently; we publish no hardware “street prices” because those are quote-gated and volatile. Last updated 5 July 2026.

Cite this study

Servnet (2026) “AI Servers: The 2026 Data Study.” servnetuk.com/research/ai-servers, accessed {date}.

Journalists & researchers: this study is free to cite with a link to servnetuk.com/research/ai-servers. Structured data: download JSON.

Full source list (24)

NVIDIA — HGX Platform · primary
NVIDIA — NVLink & NVLink Switch · primary
NVIDIA — H100 Tensor Core GPU · primary
NVIDIA — H100 PCIe Product Brief (PB-11133-001) · primary
NVIDIA — H200 Tensor Core GPU · primary
NVIDIA — DGX B200 · primary
NVIDIA — GB200 NVL72 · primary
NVIDIA — L40S · primary
NVIDIA — A100 datasheet · primary
NVIDIA — DGX H100/H200 User Guide · primary
AMD — Instinct MI300X data sheet · primary
AMD — Instinct MI325X data sheet · primary
Dell — PowerEdge XE9680 · vendor
Dell — PowerEdge XE9640 · vendor
Dell — PowerEdge XE8640 spec sheet · vendor
HPE — ProLiant Compute DL380a Gen12 · vendor
HPE — Cray XD670 QuickSpecs · vendor
Lenovo Press — ThinkSystem SR675 V3 · vendor
Lenovo Press — ThinkSystem SR780a V3 · vendor
Supermicro — SYS-821GE-TNHR · vendor
Gigabyte — G593-SD0 · vendor
McKinsey — “The cost of compute” (2025) · analyst
Gartner — geopolitics & digital sovereignty survey (Nov 2025) · analyst
AWS — up to 45% GPU EC2 price reduction (Jun 2025) · primary

Planning an AI server or GPU cluster?

Servnet specs, supplies and supports Dell, HPE and Lenovo AI servers for UK organisations — with the power, cooling, networking and finance worked out. Tell us the workload; we’ll design the build.