AI Servers: The 2026 Data Study
What GPU/AI servers are, how they are built, what they cost, and how to buy them — a cited, UK-focused reference.
Last updated 5 July 2026 · every figure is sourced and dated · specifications verified against vendor datasheets.
What an AI server actually is
Almost every high-end AI server is built around NVIDIA’s HGX 8-GPU SXM baseboard: eight GPUs on one carrier, wired all-to-all by NVLink and NVSwitch rather than PCIe, so they behave like one large accelerator.[src] SXM is what enables that mesh; cheaper PCIe add-in cards are used for inference and smaller deployments but can only bridge GPUs in pairs.[src] Around the GPUs sits a dual-socket CPU host with 1–2 TB of RAM,[src] and a high-speed east-west fabric — InfiniBand (deterministic latency, in-network reductions) or 400/800GbE Ethernet (broader ecosystem) — to scale training across many nodes.
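NVLink bandwidth is the defining metric of the platform, and it has doubled with Blackwell: from 900 GB/s per GPU on Hopper (NVLink 4th-gen) to 1.8 TB/s per GPU on Blackwell (NVLink 5th-gen).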
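To make the effect of that doubling concrete, here is a rough sketch of how intra-node all-reduce time scales with link bandwidth. It uses the textbook ring all-reduce cost model, in which each GPU moves 2(N-1)/N of the payload. That model is an assumption for illustration only: NVSwitch systems need not use a ring, and the quoted NVLink figures are aggregate peaks, so real timings will be slower. The 10 GB payload is a hypothetical value, not a figure from this study.

```python
def ring_allreduce_seconds(payload_gb: float, n_gpus: int, link_gb_per_s: float) -> float:
    """Idealised ring all-reduce time (bandwidth term only, no latency).

    Each GPU sends and receives 2 * (N - 1) / N of the payload.
    """
    if n_gpus < 2 or link_gb_per_s <= 0:
        raise ValueError("need at least 2 GPUs and a positive bandwidth")
    traffic_gb = 2 * (n_gpus - 1) / n_gpus * payload_gb
    return traffic_gb / link_gb_per_s


PAYLOAD_GB = 10.0  # hypothetical gradient payload, not a measured value

# Per-GPU NVLink bandwidth from the spec table: 900 GB/s (Hopper), 1.8 TB/s (Blackwell).
hopper_s = ring_allreduce_seconds(PAYLOAD_GB, 8, 900.0)
blackwell_s = ring_allreduce_seconds(PAYLOAD_GB, 8, 1800.0)

print(f"Hopper 8-GPU node:    {hopper_s * 1000:.1f} ms")
print(f"Blackwell 8-GPU node: {blackwell_s * 1000:.1f} ms")
```

Under this model the time is inversely proportional to bandwidth, so doubling NVLink halves the bandwidth-bound part of every gradient synchronisation.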
Training vs inference — one box can’t do both well
Training: big, synchronised GPU clusters advancing in lockstep. Wants maximum HBM capacity, top FP8/FP4 compute, the full 8-GPU NVLink mesh and 400/800G scale-out for all-reduce. Runs hot and near-continuously, which is the case for liquid cooling and owned/colo hardware.
Inference: latency-bound and often memory-bandwidth-bound; token throughput tracks HBM bandwidth, not raw FLOPS. Can run on lower-power PCIe GPUs, smaller nodes and commodity Ethernet, distributed closer to users. It accounts for the highest cumulative volume of AI compute.
Workload/infrastructure split per industry analysis of training vs inference server design.
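The claim that token throughput tracks HBM bandwidth can be sketched with a simple upper bound. When generating one token at a time for a single request, the GPU must read every model weight once per token, so tokens per second cannot exceed memory bandwidth divided by the size of the weights. The sketch below assumes batch size 1, ignores KV-cache reads and kernel overheads, and uses a hypothetical 70 GB model (for example 70 billion parameters at one byte each); only the bandwidth figures come from the spec table in this study.

```python
# HBM bandwidth in TB/s, taken from the spec comparison table.
HBM_BANDWIDTH_TB_S = {
    "H100 SXM5": 3.35,
    "H200 SXM": 4.8,
    "B200 SXM": 8.0,
}

WEIGHTS_GB = 70.0  # hypothetical model size, not a figure from this study


def decode_tokens_per_second_ceiling(bandwidth_tb_s: float, weights_gb: float) -> float:
    """Upper bound on single-stream decode speed: one full weight read per token."""
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_tb_s * 1000.0 / weights_gb


ceilings = {
    gpu: decode_tokens_per_second_ceiling(bw, WEIGHTS_GB)
    for gpu, bw in HBM_BANDWIDTH_TB_S.items()
}

for gpu, tps in ceilings.items():
    print(f"{gpu}: at most {tps:.0f} tokens/s per stream")
```

Real deployments batch many requests to amortise each weight read, so aggregate throughput is far higher, but the ordering between GPUs follows the bandwidth column rather than the FLOPS figures.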
Data-centre GPU spec comparison
Every figure is from the manufacturer’s datasheet. These are static hardware specs (no pricing), so they don’t go stale.
| GPU | Arch | VRAM | Memory bandwidth | TDP | Interconnect | Notes |
|---|---|---|---|---|---|---|
| NVIDIA H100 SXM5[src] | Hopper | 80 GB HBM3 | 3.35 TB/s | up to 700 W | NVLink 4th-gen 900 GB/s | |
| NVIDIA H100 PCIe[src] | Hopper | 80 GB HBM2e | 2.0 TB/s | 350 W | NVLink bridge 900 GB/s (2-GPU) | NVLink bridge is 900 GB/s (reuses A100 bridges); the 600 GB/s figure belongs to the dual-GPU H100 NVL card. |
| NVIDIA H200 SXM[src] | Hopper | 141 GB HBM3e | 4.8 TB/s | up to 700 W | NVLink 4th-gen 900 GB/s | |
| NVIDIA B200 SXM[src] | Blackwell | 180 GB HBM3e (DGX/HGX) | 8 TB/s | ~1,000 W (config. max) | NVLink 5th-gen 1.8 TB/s | 192 GB is the raw dual-die capacity; DGX/HGX B200 configures 180 GB/GPU. TDP is a configurable maximum. |
| NVIDIA GB200 (superchip)[src] | Grace + 2× Blackwell | 372 GB HBM3e + 480 GB LPDDR5X | 16 TB/s (HBM) | not stated (superchip) | NVLink 5th-gen 3.6 TB/s per superchip (2 × 1.8 TB/s) | |
| NVIDIA L40S[src] | Ada Lovelace | 48 GB GDDR6 (ECC) | 864 GB/s | 350 W | No NVLink; PCIe Gen4 x16 | An inference/graphics part, not a training GPU. |
| NVIDIA A100 80 GB SXM[src] | Ampere | 80 GB HBM2e | 2.04 TB/s | 400 W | NVLink 3rd-gen 600 GB/s | |
| AMD Instinct MI300X[src] | CDNA 3 | 192 GB HBM3 | 5.3 TB/s | 750 W (TBP) | Infinity Fabric; PCIe Gen5 x16 | The leading non-NVIDIA alternative. |
| AMD Instinct MI325X[src] | CDNA 3 | 256 GB HBM3e | 6.0 TB/s | 1,000 W (TBP) | Infinity Fabric; PCIe Gen5 | |
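Per-GPU figures are easier to compare once they are rolled up to a node. The following minimal sketch holds three rows of the table in a small data structure and derives 8-GPU totals. The class and function names are illustrative, and the power total covers GPU TDP only, so it excludes CPUs, memory, NICs and fans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuSpec:
    name: str
    hbm_gb: int            # memory capacity per GPU
    bandwidth_tb_s: float  # memory bandwidth per GPU
    tdp_w: int             # maximum TDP per GPU


# Values copied from the spec table above.
SPECS = [
    GpuSpec("H100 SXM5", 80, 3.35, 700),
    GpuSpec("H200 SXM", 141, 4.8, 700),
    GpuSpec("B200 SXM", 180, 8.0, 1000),
]


def node_totals(spec: GpuSpec, gpus: int = 8) -> dict:
    """Aggregate memory and GPU-only power for a node with `gpus` identical GPUs."""
    return {
        "hbm_gb": spec.hbm_gb * gpus,
        "gpu_power_kw": spec.tdp_w * gpus / 1000,
    }


totals = {spec.name: node_totals(spec) for spec in SPECS}

for name, t in totals.items():
    print(f"8x {name}: {t['hbm_gb']} GB HBM, {t['gpu_power_kw']:.1f} kW of GPU TDP")
```

The memory totals (640 GB, 1,128 GB and 1,440 GB) match the DGX H100, DGX H200 and DGX B200 capacities listed in the platform table.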
The AI-server platforms
| Model | Vendor | GPUs | Form | Cooling |
|---|---|---|---|---|
| PowerEdge XE9680[src] (Servnet supplies) | Dell | 8× HGX H100/H200 SXM (or MI300X / Gaudi3 OAM) | 6U | Air |
| PowerEdge XE9640[src] (Servnet supplies) | Dell | 4× H100 SXM (NVLink) or Intel Max OAM | 2U | Direct liquid (facility water) |
| PowerEdge XE8640[src] (Servnet supplies) | Dell | 4× HGX H100 SXM5 | 4U | Closed-loop liquid + fans |
| ProLiant DL380a Gen12[src] (Servnet supplies) | HPE | up to 8× double-wide PCIe (RTX PRO 6000, H200 NVL, H100 NVL, L40S) | 4U | Air or direct liquid |
| Cray XD670[src] (Servnet supplies) | HPE | 8× HGX H100/H200 SXM5 | 5U | Air, with liquid option |
| ThinkSystem SR675 V3[src] (Servnet supplies) | Lenovo | up to 8× PCIe, or 4× HGX H200 SXM (NVLink) | 3U | Neptune hybrid (HGX variant) |
| ThinkSystem SR780a V3[src] (Servnet supplies) | Lenovo | 8× HGX H100/H200/B200 (NVLink 900 GB/s) | 5U | Neptune direct liquid + air |
| SYS-821GE-TNHR[src] | Supermicro | 8× HGX H100/H200 (NVLink + NVSwitch) | 8U | Air (liquid variants exist) |
| DGX H100 / H200[src] | NVIDIA | 8× H100 (640 GB) / H200 (1,128 GB) SXM | 8U | Air |
| DGX B200[src] | NVIDIA | 8× Blackwell (1,440 GB HBM3e) | 10U | Air |
| GB200 NVL72[src] | NVIDIA | 72× Blackwell + 36 Grace (rack-scale) | Rack | 100% liquid |
| G593-SD0[src] | Gigabyte | 8× HGX H100 SXM5 | 5U | Air |
Servnet supplies and configures the Dell, HPE and Lenovo platforms above — including the 8-GPU Lenovo ThinkSystem SR780a V3. Build one to your spec in the Lenovo configurator, or explore the ThinkSystem range.
Power & cooling: why AI breaks the rack
The single biggest practical shock for buyers is power density. A traditional enterprise rack is provisioned for perhaps 6–15 kW. A single 8-GPU AI node can consume that on its own, and a rack-scale Blackwell system is an order of magnitude beyond it — which is why NVIDIA’s Blackwell generation is designed around direct-liquid cooling.[src]
Practical read: many Hopper (H100/H200) servers still run air-cooled and fit high-density enterprise or colo racks; Blackwell largely does not. Plan power, cooling and rack space before the hardware.
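A quick way to size the problem is to divide a rack's power budget by the draw of one node. The sketch below uses the roughly 10.2 kW figure this study gives for an air-cooled NVIDIA DGX H100 and the 6 to 15 kW range quoted above for a traditional enterprise rack. The 40 kW "high-density" budget is a hypothetical example, and the calculation ignores switches, PDUs and any safety headroom.

```python
import math

DGX_H100_KW = 10.2  # approximate draw of one air-cooled 8x H100 node, per this study


def nodes_per_rack(rack_budget_kw: float, node_kw: float) -> int:
    """Whole nodes that fit within a rack's power budget (power only, no headroom)."""
    if node_kw <= 0:
        raise ValueError("node_kw must be positive")
    return math.floor(rack_budget_kw / node_kw)


# 6 and 15 kW bracket the traditional rack range; 40 kW is a hypothetical colo hall.
for budget_kw in (6.0, 15.0, 40.0):
    fit = nodes_per_rack(budget_kw, DGX_H100_KW)
    print(f"{budget_kw:>5.1f} kW rack: {fit} node(s)")
```

At the low end of the traditional range not even one node fits, and at the high end a full rack's budget buys a single 8U server, which is why power planning has to come before the hardware order.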
What it costs: own vs cloud
Acquisition prices for AI servers and data-centre GPUs are almost entirely quote-gated and volatile, so we don’t publish a headline “street price” — anyone who does is guessing. What is knowable is the shape of the decision and the current cloud rental rates.
AWS cut on-demand NVIDIA GPU EC2 pricing by up to ~45% (44% off P5/H100), effective 1 June 2025.[src] On-demand list rates move frequently and vary by region — treat as indicative and verify live before deciding.
The durable rule: rent for short, bursty or uncertain workloads; own (or colocate) when utilisation is high and sustained over years. At continuous use, about a year of renting an H100 costs roughly as much as buying one, so the more steadily you run GPUs, the more owning pays. The exact break-even depends on your utilisation, GPU choice and current cloud pricing. Model your own case in the AI GPU calculator (it carries its own cited cloud-vs-own comparison) and, if you buy, finance the capex over 2–5 years with the IT finance calculator.
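The shape of that break-even can be sketched in a few lines. Every number below is a placeholder, not a quote or a published price: the purchase price and hourly rate are chosen only so that full utilisation breaks even at twelve months, mirroring the rule of thumb above. The function name and the optional running-cost parameter (power, colocation, support) are illustrative assumptions.

```python
from typing import Optional

HOURS_PER_MONTH = 730  # average hours in a month


def break_even_months(
    purchase_price: float,
    hourly_rate: float,
    utilisation: float,
    monthly_running_cost: float = 0.0,
) -> Optional[float]:
    """Months until cumulative rental spend equals the purchase price.

    `utilisation` is the fraction of hours the GPU would be rented (0 to 1].
    Returns None when owning never catches up with renting.
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    monthly_rental = hourly_rate * HOURS_PER_MONTH * utilisation
    monthly_saving = monthly_rental - monthly_running_cost
    if monthly_saving <= 0:
        return None
    return purchase_price / monthly_saving


# Placeholder inputs in arbitrary currency units; replace with real quotes.
PRICE = 21_900.0
RATE = 2.5

for u in (1.0, 0.5, 0.25):
    print(f"utilisation {u:.0%}: break-even in {break_even_months(PRICE, RATE, u):.0f} months")
```

With these placeholders, halving utilisation doubles the time to break even, and at 25% the payback stretches to four years, which is the arithmetic behind "rent for bursty workloads, own for sustained ones".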
The market backdrop
Figures from named analysts (McKinsey, Gartner). Enterprise “AI repatriation” to on-prem/colo is a real trend but is largely evidenced by vendor-sponsored surveys, so we cite only the named-analyst anchors here and treat the sponsored numbers as directional.
AI servers — common questions
What is an AI server?
An AI server is a GPU-dense server built to train or run AI models. Most are built on NVIDIA’s HGX 8-GPU SXM baseboard, where eight GPUs are linked all-to-all by NVLink and NVSwitch (far faster than PCIe). A CPU host, large system memory and high-speed east-west networking (InfiniBand or 400/800GbE Ethernet) surround the GPUs. Examples include the Dell PowerEdge XE9680, HPE Cray XD670, Lenovo ThinkSystem SR780a V3 and NVIDIA DGX.
How much power does an AI server use?
A lot. One air-cooled 8× H100 server (NVIDIA DGX H100, 8U) draws about 10.2 kW, so a single node needs more power than many UK enterprise racks are provisioned for. NVIDIA’s Blackwell B200 is specified at 1,000 W air-cooled or 1,200 W liquid-cooled per GPU, and a rack-scale GB200 NVL72 draws roughly 120 kW and must be liquid-cooled.
Which GPU is best for AI — H100, H200 or B200?
It depends on the workload. H100 (80 GB HBM3) is the established training GPU; H200 adds 141 GB HBM3e and 4.8 TB/s bandwidth, which helps memory-bound inference; B200 (Blackwell, 180 GB HBM3e in DGX/HGX, 8 TB/s) is the current top training part. For lighter inference, PCIe parts like the L40S (48 GB, no NVLink) are cheaper and lower-power. AMD’s Instinct MI300X (192 GB) and MI325X (256 GB) are the main non-NVIDIA alternatives.
Is it cheaper to buy an AI server or rent GPUs in the cloud?
Renting is cheaper for short, bursty or uncertain workloads; owning wins at sustained high utilisation over multiple years. The tipping point depends on your utilisation, GPU choice and cloud rates (which move — AWS cut on-demand GPU pricing by up to ~45% in June 2025). The durable rule is: the higher and steadier your GPU utilisation, the more owning or colocating pays. Model your own case with our AI GPU calculator.
Why do AI servers need liquid cooling?
Because air can only remove so much heat from a rack — typically up to ~8–25 kW. A single Blackwell node or a GB200 NVL72 rack (~120 kW) is far past that ceiling, so NVIDIA’s Blackwell generation is designed around direct-liquid cooling (DLC). Many Hopper (H100/H200) servers can still run air-cooled, which is why they remain practical for enterprises without liquid-ready facilities.
Do I need special data-centre facilities for AI servers?
Usually yes. Rack power density is the first constraint — one 8-GPU node can exceed a whole traditional rack’s power budget, and Blackwell adds liquid-cooling plumbing. Options are upgrading your own facility, using a colocation provider with high-density/liquid-ready halls, or renting cloud GPUs. Lead times and GPU availability are also real planning factors.
Methodology, sources & how to cite
Hardware specifications are taken directly from manufacturer datasheets (NVIDIA, AMD) and OEM product pages (Dell, HPE, Lenovo, Supermicro, NVIDIA), and were cross-checked in an adversarial fact-check pass — during which one figure (the H100 PCIe NVLink bridge) was corrected to 900 GB/s. Market figures are attributed to named analysts (McKinsey, Gartner). Cloud prices are dated and labelled indicative because they change frequently; we publish no hardware “street prices” because those are quote-gated and volatile. Last updated 5 July 2026.
Servnet (2026) “AI Servers: The 2026 Data Study.” servnetuk.com/research/ai-servers, accessed {date}.
Full source list (24)
- NVIDIA — HGX Platform · primary
- NVIDIA — NVLink & NVLink Switch · primary
- NVIDIA — H100 Tensor Core GPU · primary
- NVIDIA — H100 PCIe Product Brief (PB-11133-001) · primary
- NVIDIA — H200 Tensor Core GPU · primary
- NVIDIA — DGX B200 · primary
- NVIDIA — GB200 NVL72 · primary
- NVIDIA — L40S · primary
- NVIDIA — A100 datasheet · primary
- NVIDIA — DGX H100/H200 User Guide · primary
- AMD — Instinct MI300X data sheet · primary
- AMD — Instinct MI325X data sheet · primary
- Dell — PowerEdge XE9680 · vendor
- Dell — PowerEdge XE9640 · vendor
- Dell — PowerEdge XE8640 spec sheet · vendor
- HPE — ProLiant Compute DL380a Gen12 · vendor
- HPE — Cray XD670 QuickSpecs · vendor
- Lenovo Press — ThinkSystem SR675 V3 · vendor
- Lenovo Press — ThinkSystem SR780a V3 · vendor
- Supermicro — SYS-821GE-TNHR · vendor
- Gigabyte — G593-SD0 · vendor
- McKinsey — “The cost of compute” (2025) · analyst
- Gartner — geopolitics & digital sovereignty survey (Nov 2025) · analyst
- AWS — up to 45% GPU EC2 price reduction (Jun 2025) · primary
Planning an AI server or GPU cluster?
Servnet specs, supplies and supports Dell, HPE and Lenovo AI servers for UK organisations — with the power, cooling, networking and finance worked out. Tell us the workload; we’ll design the build.