UK’s trusted IT infrastructure partner since 2003
sales@servnetuk.com
0800 987 4111
Servnet
NVIDIA Hopper Architecture · GH100

NVIDIA H200 NVL PCIe
141 GB HBM3e · 4.8 TB/s

The highest-capacity PCIe GPU in NVIDIA's Hopper range. The NVIDIA H200 NVL PCIe pairs 141GB of HBM3e with 4.8 TB/s of memory bandwidth, enough to hold a 70B-parameter model on a single card without model partitioning. Its FP8 Transformer Engine delivers up to 3,958 TFLOPS of AI compute (with sparsity).

All specifications are taken from the official NVIDIA datasheet. No pricing is displayed; contact Servnet for current UK availability and a quote.

NVIDIA Data Centre GPU: H200 NVL PCIe
Hopper GH100 · TSMC 4N · 80B transistors

VRAM: 141 GB HBM3e
Memory bandwidth: 4.8 TB/s
FP8 compute: 3,958 TFLOPS (with sparsity)
CUDA cores: 16,896
MIG: 7× instances at 14GB
Host interface: PCIe 5.0 x16
Technical Specifications

NVIDIA H200 NVL PCIe — Full Datasheet Specifications

All figures from the official NVIDIA H200 NVL PCIe product datasheet. Performance values with sparsity assume 2:4 structured sparse format. TFLOPS figures are peak theoretical rates based on GPU Boost clock.
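
The sparsity note matters when comparing these figures with dense benchmarks. 2:4 structured sparsity keeps two of every four weights, so a peak quoted "with sparsity" is twice the dense peak. A minimal sketch of that conversion, using the figures quoted on this page (the dictionary and function names are illustrative, not part of any NVIDIA tool):

```python
# Peak rates quoted on this page "with sparsity" (TFLOPS; TOPS for INT8).
SPARSE_PEAK = {
    "TF32": 989,
    "FP16": 1979,
    "BF16": 1979,
    "FP8": 3958,
    "INT8": 3958,
}


def dense_peak(sparse_peak: float) -> float:
    """Return the dense peak implied by a 2:4 structured-sparsity peak.

    With 2:4 sparsity half of the multiply-accumulates are skipped, so the
    sparse peak is 2x the dense peak for the same data type.
    """
    return sparse_peak / 2


if __name__ == "__main__":
    for fmt, sparse in SPARSE_PEAK.items():
        print(f"{fmt:>5}: {sparse:>5} with sparsity -> {dense_peak(sparse):.1f} dense")
```

Models that have not been pruned to the 2:4 pattern should be sized against the dense figures.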

Architecture
GPU Architecture: NVIDIA Hopper (GH100)
CUDA Cores: 16,896
Tensor Cores: 528 (4th-generation)
RT Cores: None (Hopper data-centre GPUs do not include RT cores)
Process Node: TSMC 4N
Transistors: 80 billion
Memory
VRAM: 141 GB HBM3e
Memory Bandwidth: 4.8 TB/s
Memory Bus Width: 5,120-bit
ECC: Yes (full memory ECC)
Compute Performance
FP64 Tensor Core: 67 TFLOPS
FP32 (CUDA): 67 TFLOPS
TF32 Tensor Core: 989 TFLOPS (with sparsity)
FP16 Tensor Core: 1,979 TFLOPS (with sparsity)
BF16 Tensor Core: 1,979 TFLOPS (with sparsity)
FP8 Tensor Core: 3,958 TFLOPS (with sparsity)
INT8 Tensor Core: 3,958 TOPS (with sparsity)
Connectivity
Form Factor: PCIe, dual-slot, FHFL (full-height, full-length)
Host Interface: PCIe 5.0 x16, 128 GB/s bidirectional
NVLink: 600 GB/s via NVLink bridge for a 2× H200 NVL pair
Display Outputs: None (data-centre headless)
Platform
Multi-Instance GPU: Yes, up to 7 MIG instances at 14GB HBM3e each
NVIDIA AI Enterprise: Supported (CUDA, TensorRT, NeMo, Triton)
TDP: 350W (configurable 300W–350W)
Cooling: Passive (requires system-level airflow)
Thermal Interface: Vapour chamber

Source: NVIDIA H200 NVL PCIe Datasheet (DS-10581-001_v01). All trademarks are the property of NVIDIA Corporation.
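
The 128 GB/s host-interface figure is the rounded raw link rate in both directions. A rough sketch of where it comes from, assuming the standard PCIe 5.0 parameters (32 GT/s per lane, 128b/130b encoding); the function name is illustrative and protocol overhead above the encoding layer is ignored:

```python
def pcie_gb_per_s(gt_per_s: float, lanes: int,
                  payload_bits: int = 128, frame_bits: int = 130) -> float:
    """One-direction PCIe bandwidth in GB/s after line encoding.

    gt_per_s is the per-lane transfer rate (32 for PCIe 5.0, 16 for 4.0).
    128b/130b encoding carries 128 payload bits in every 130-bit frame.
    """
    return gt_per_s * lanes * payload_bits / frame_bits / 8


if __name__ == "__main__":
    one_way = pcie_gb_per_s(32, 16)
    print(f"PCIe 5.0 x16: {one_way:.1f} GB/s per direction, "
          f"{2 * one_way:.1f} GB/s bidirectional")
    # Lower bound on the time to fill all 141 GB of HBM3e from host memory.
    print(f"Loading 141 GB takes at least {141 / one_way:.1f} s")
```

The per-direction result is about 63 GB/s, so filling the card's memory from the host takes a little over two seconds at best. This is one reason model weights are normally loaded once and kept resident.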

PCIe GPU Comparison

H200 vs H100 vs A100 PCIe

GPU Model                   VRAM          Bandwidth   FP8 Compute    PCIe Gen   TDP
H200 NVL PCIe (this page)   141GB HBM3e   4.8 TB/s    3,958 TFLOPS   5.0        350W
H100 PCIe                   80GB HBM2e    2.0 TB/s    3,341 TFLOPS   5.0        350W
A100 PCIe                   80GB HBM2e    1.9 TB/s    N/A            4.0        300W

Source: NVIDIA datasheets. H200 NVL PCIe (DS-10581-001_v01), H100 PCIe (DS-10167-001), A100 PCIe 80GB (DS-10010-001).
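
To put the comparison in relative terms, the sketch below computes the memory and bandwidth ratios between the three cards. The values are copied from the table above; the `Gpu` class and `ratio` helper are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    vram_gb: float
    bandwidth_tb_s: float
    tdp_w: float


# Figures as listed in the comparison table on this page.
H200 = Gpu("H200 NVL PCIe", 141, 4.8, 350)
H100 = Gpu("H100 PCIe", 80, 2.0, 350)
A100 = Gpu("A100 PCIe", 80, 1.9, 300)


def ratio(a: Gpu, b: Gpu, field: str) -> float:
    """Return a.<field> / b.<field> for a numeric spec field."""
    return getattr(a, field) / getattr(b, field)


if __name__ == "__main__":
    for other in (H100, A100):
        print(f"H200 vs {other.name}: "
              f"{ratio(H200, other, 'vram_gb'):.2f}x VRAM, "
              f"{ratio(H200, other, 'bandwidth_tb_s'):.2f}x bandwidth")
```

On these figures the H200 NVL PCIe offers about 1.76x the memory and 2.4x the bandwidth of the H100 PCIe at the same listed TDP.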

Applications

When to Choose the H200 NVL PCIe

The H200 PCIe is best justified when memory capacity and bandwidth are the primary bottleneck — not raw compute. If model size exceeds 80GB or memory bandwidth limits token throughput, the H200 NVL is the PCIe-format answer.
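
A quick way to see when bandwidth limits token throughput: during single-stream decoding, each generated token requires reading roughly all model weights from memory once, so bandwidth divided by weight size gives an upper bound on tokens per second. The sketch below applies that rule of thumb. It is an assumption-laden estimate (batch size 1, KV-cache traffic ignored), and the function name is illustrative.

```python
def max_decode_tokens_per_s(bandwidth_tb_s: float,
                            params_billion: float,
                            bytes_per_param: float) -> float:
    """Upper bound on single-stream decode speed for a memory-bound model.

    Assumes every weight is read from GPU memory once per generated token
    and ignores KV-cache reads, so real throughput will be lower.
    """
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_tb_s * 1e12 / weight_bytes


if __name__ == "__main__":
    for label, bw in [("H200 NVL PCIe", 4.8), ("H100 PCIe", 2.0)]:
        bound = max_decode_tokens_per_s(bw, params_billion=70, bytes_per_param=1)
        print(f"{label}: at most {bound:.1f} tokens/s for a 70B model in FP8")
```

For a 70B model in FP8 the bound is about 69 tokens/s at 4.8 TB/s and about 29 tokens/s at 2.0 TB/s. Batching raises aggregate throughput because the same weight reads serve several sequences.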

🧠

LLM Inference — 70B+ Models

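141GB HBM3e enables single-card inference of 70B-parameter models such as Llama 3 70B. At FP16 the weights alone occupy about 140GB, which leaves almost no room for the KV cache, so FP8 (about 70GB of weights) is the practical single-card precision. No model partitioning or tensor parallelism is needed at 70B scale.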
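
Whether a model fits on one card depends on the numeric precision of its weights. The sketch below is a weights-only estimate: KV cache, activations, and framework overhead are deliberately left out and need extra headroom. The helper names are illustrative.

```python
# Bytes stored per parameter for common inference precisions.
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "BF16": 2.0, "FP8": 1.0}

H200_VRAM_GB = 141.0  # capacity quoted on this page


def weight_footprint_gb(params_billion: float, precision: str) -> float:
    """Size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def headroom_gb(params_billion: float, precision: str,
                vram_gb: float = H200_VRAM_GB) -> float:
    """Memory left for KV cache and activations; negative means no fit."""
    return vram_gb - weight_footprint_gb(params_billion, precision)


if __name__ == "__main__":
    for precision in ("FP32", "FP16", "FP8"):
        size = weight_footprint_gb(70, precision)
        spare = headroom_gb(70, precision)
        verdict = "fits" if spare >= 0 else "does not fit"
        print(f"70B @ {precision}: {size:.0f} GB weights, "
              f"{spare:+.0f} GB headroom ({verdict})")
```

For a 70B model this gives 280 GB at FP32 (does not fit), 140 GB at FP16 (fits with about 1 GB to spare), and 70 GB at FP8 (about 71 GB of headroom).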

👁

Multi-Modal AI

Large vision-language models (LLaVA, Flamingo, CogVLM) can exceed 80GB of GPU memory once long image-token sequences and larger batches are added to the model weights. The H200 NVL PCIe handles these multi-modal transformers without multi-GPU tensor splitting at moderate batch sizes.

LLM Training

4.8 TB/s memory bandwidth reduces data-starvation bottlenecks in transformer forward and backward passes. This matters most for memory-bound steps such as attention computation and gradient accumulation.
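
A roofline estimate shows which operations are starved for data. An operation performing fewer FLOPs per byte moved than the card's ridge point (peak compute divided by bandwidth) is limited by memory, not by the Tensor Cores. The sketch assumes a dense BF16 peak equal to half of the with-sparsity figure quoted on this page; the function names are illustrative.

```python
PEAK_DENSE_BF16_TFLOPS = 1979 / 2  # assumed: half of the with-sparsity figure
BANDWIDTH_TB_S = 4.8               # memory bandwidth quoted on this page


def ridge_point(peak_tflops: float = PEAK_DENSE_BF16_TFLOPS,
                bandwidth_tb_s: float = BANDWIDTH_TB_S) -> float:
    """Arithmetic intensity (FLOP/byte) where compute and memory limits meet."""
    return peak_tflops / bandwidth_tb_s


def attainable_tflops(intensity_flop_per_byte: float,
                      peak_tflops: float = PEAK_DENSE_BF16_TFLOPS,
                      bandwidth_tb_s: float = BANDWIDTH_TB_S) -> float:
    """Roofline model: the lower of the compute roof and the memory roof."""
    return min(peak_tflops, bandwidth_tb_s * intensity_flop_per_byte)


if __name__ == "__main__":
    print(f"Ridge point: {ridge_point():.0f} FLOP/byte")
    for intensity in (1, 10, 100, 1000):
        print(f"{intensity:>5} FLOP/byte -> "
              f"{attainable_tflops(intensity):.1f} TFLOPS attainable")
```

Under these assumptions the ridge point is roughly 206 FLOP/byte. Elementwise operations such as gradient accumulation sit far below it, so their speed scales with bandwidth, not with peak TFLOPS.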

🔀

AI Inference Serving

MIG partitioning creates up to 7 isolated 14GB GPU instances for serving multiple concurrent AI models with guaranteed QoS — isolating workloads between tenants or services.
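
As a planning aid, the sketch below checks which models could each be served from their own MIG instance, using the 7 × 14GB layout quoted on this page. The model names and sizes are hypothetical, and the helper only checks memory fit with one instance per model; it does not configure MIG or model the larger combined profiles that MIG also offers.

```python
SLICE_GB = 14    # per-instance memory quoted on this page
MAX_SLICES = 7   # maximum MIG instances quoted on this page


def plan_slices(models: dict[str, float]) -> dict[str, list[str]]:
    """Assign one MIG instance per model, in the order given.

    A model is placed if it fits in a single slice and a slice is still
    free; otherwise it is rejected.
    """
    placed: list[str] = []
    rejected: list[str] = []
    for name, size_gb in models.items():
        if size_gb <= SLICE_GB and len(placed) < MAX_SLICES:
            placed.append(name)
        else:
            rejected.append(name)
    return {"placed": placed, "rejected": rejected}


if __name__ == "__main__":
    # Hypothetical memory footprints in GB, for illustration only.
    demo = {"embedder": 2, "reranker": 4, "chat-7b-fp8": 9, "chat-13b-fp16": 26}
    print(plan_slices(demo))
```

In this example the 26 GB model is rejected because it exceeds a single 14GB instance; it would need a larger MIG profile or the whole GPU.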

🔬

Scientific HPC

67 TFLOPS FP64 Tensor Core performance accelerates molecular dynamics, climate simulation, and computational chemistry workloads that require double-precision accuracy.

📊

Data Analytics

RAPIDS cuDF and cuML leverage HBM3e bandwidth for GPU-accelerated Spark, pandas-equivalent data frames, and ML training on large tabular datasets without CPU bottlenecks.


Ready to specify H200 PCIe into your AI infrastructure?

Servnet advises on H200 PCIe vs SXM, server PSU requirements (80+ Platinum for 350W card), slot compatibility, and provides UK lead time and availability quotes.
