
Server GPU UK · H200 · H100 · A100 · L40S

NVIDIA GPU PCIe add-in cards: H200 · H100 · A100 · L40S · RTX Ada

This page covers NVIDIA data-centre and professional GPU add-in cards, as distinct from full GPU server systems. PCIe cards install into a compatible PCIe 4.0/5.0 x16 server slot and provide AI training, inference, HPC acceleration, and professional visualisation without requiring a dedicated GPU server chassis.

All specifications are from official NVIDIA datasheets. Listed for technical reference — Servnet does not currently supply this product.


141 GB

HBM3e — H200 PCIe largest GPU memory capacity

4.8 TB/s

Memory bandwidth — NVIDIA H200 PCIe

FP8

Transformer Engine precision — H100 & H200 Hopper

3,341

TFLOPS FP8 Tensor Core, H200 NVL PCIe with sparsity (the 3,958 figure applies to the H200 SXM module)

MIG

Multi-Instance GPU: partition an H100/H200 into up to 7 isolated GPU instances

PCIe 5.0

Host interface — H100/H200 PCIe x16

GPU Add-In Cards

NVIDIA Data-Centre GPU Specifications

All performance figures are from official NVIDIA datasheets. Tensor Core figures quoted with sparsity assume 2:4 structured sparsity, in which two of every four weights are zero; dense throughput is half the sparse figure. All cards require adequate server PSU capacity and a PCIe 4.0/5.0 x16 slot.
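As a rough illustration of what 2:4 structured sparsity means, the sketch below zeroes the two smallest-magnitude values in each group of four weights and converts a sparse throughput figure to its dense equivalent. The function names and the sample weights are hypothetical; magnitude-based pruning is one common way to reach the 2:4 pattern, not necessarily the method NVIDIA's tooling applies.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in every group of four."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest magnitudes in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


def dense_equivalent(sparse_tflops):
    """Dense Tensor Core throughput is half the 2:4 sparse figure."""
    return sparse_tflops / 2


# Hypothetical weight row, two groups of four.
row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
pruned_row = prune_2_of_4(row)
print(pruned_row)             # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.4, 0.0]
print(dense_equivalent(624))  # 312.0, the A100 dense FP16 figure listed on this page
```

Only models pruned to this pattern can approach the sparse figures; for unpruned models the dense number is the relevant one.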

Hopper · GH100

H200 NVL PCIe

NVIDIA H200 PCIe 141GB

GPU Architecture: NVIDIA Hopper (GH100)
VRAM: 141 GB HBM3e
Memory Bandwidth: 4.8 TB/s
FP8 Tensor Core: 3,341 TFLOPS (with sparsity)
FP16 Tensor Core: 1,671 TFLOPS (with sparsity)
FP32 (CUDA): 60 TFLOPS
TDP: Up to 600W (configurable)
Form Factor: PCIe, dual-slot, full-height full-length
Interface: PCIe 5.0 x16
NVLink: 900 GB/s per GPU (H200 NVL, 2- or 4-way NVLink bridge)
Multi-Instance GPU: Up to 7 MIGs at 16.5GB each

Use case: The largest-memory PCIe card on this page. 141GB of HBM3e at 4.8 TB/s memory bandwidth lets a single card serve LLMs of up to roughly 70B parameters (for example Llama 2 70B). At FP16 the weights of a 70B model alone occupy about 140GB, so FP8 or other quantisation is needed to leave room for the KV cache. Up to 5x better LLM inference performance versus A100 per NVIDIA testing.
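The single-card claim above comes down to simple arithmetic on weight size. The following is a minimal sketch, assuming weights dominate memory use and counting 1 GB as 10^9 bytes; the helper names and the precision table are illustrative, and KV cache, activations, and framework overhead are deliberately left out.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_gb(params_billions, precision):
    """Weight footprint in GB; excludes KV cache and activations."""
    return params_billions * BYTES_PER_PARAM[precision]


def headroom_gb(params_billions, precision, card_gb):
    """Memory left on the card after loading the weights."""
    return card_gb - weight_gb(params_billions, precision)


for precision in ("fp16", "fp8", "int4"):
    print(precision, weight_gb(70, precision), headroom_gb(70, precision, 141))
# fp16 140.0 1.0
# fp8 70.0 71.0
# int4 35.0 106.0
```

On these assumptions a 70B model at FP16 leaves about 1GB free on a 141GB card, which is why lower-precision weights are the practical route to single-card serving.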


Hopper · GH100

H100 PCIe

NVIDIA H100 PCIe 80GB

GPU Architecture: NVIDIA Hopper (GH100)
VRAM: 80 GB HBM2e
Memory Bandwidth: 2 TB/s
FP8 Tensor Core: 3,026 TFLOPS (with sparsity)
FP16 Tensor Core: 1,513 TFLOPS (with sparsity)
FP32 (CUDA): 51.2 TFLOPS
TDP: 350W (configurable 300W–350W)
Form Factor: PCIe, dual-slot, full-height full-length
Interface: PCIe 5.0 x16
NVLink: 600 GB/s (NVLink bridge between 2 cards)
Multi-Instance GPU: Up to 7 MIGs at 10GB each

Use case: The enterprise standard for AI training and inference. According to NVIDIA, the Transformer Engine with FP8 precision delivers up to 30x faster large language model inference versus A100 on the largest models. Fourth-generation Tensor Cores accelerate GPT, BERT, diffusion, and scientific simulation workloads.


Ampere · GA100

A100 PCIe 80GB

NVIDIA A100 PCIe 80GB

GPU Architecture: NVIDIA Ampere (GA100)
VRAM: 80 GB HBM2e
Memory Bandwidth: 1,935 GB/s (1.9 TB/s)
FP16 Tensor Core: 312 TFLOPS (without sparsity)
BF16 Tensor Core: 312 TFLOPS
FP32 (CUDA): 19.5 TFLOPS
Form Factor: PCIe, dual-slot, full-height full-length
Interface: PCIe 4.0 x16
NVLink: 600 GB/s (with NVLink Bridge)
Multi-Instance GPU: Up to 7 MIGs at 10GB each

Use case: Previous-generation standard — A100 PCIe remains widely deployed for AI inference, HPC, and deep learning training where H100 lead times or budget constraints apply. BF16 and TF32 Tensor Cores with 80GB VRAM handle large-model inference efficiently.


Ada Lovelace · AD102

L40S

AI + Visualisation

NVIDIA L40S 48GB

GPU Architecture: NVIDIA Ada Lovelace (AD102)
VRAM: 48 GB GDDR6 ECC
Memory Bandwidth: 864 GB/s
FP8 Tensor Core: 1,466 TFLOPS (with sparsity)
FP32 (CUDA): 91.6 TFLOPS
TF32 Tensor: 366 TFLOPS (with sparsity)
Form Factor: PCIe, dual-slot, full-height full-length
Interface: PCIe 4.0 x16
Outputs: 4× DisplayPort 1.4a
RT Cores: 3rd-generation (hardware ray tracing)

Use case: Dual-purpose AI inference and professional visualisation — L40S handles generative AI workloads (image generation, video, text) alongside CAD, BIM, and VDI workloads on the same server. Ideal for creative industry studios and architects who need both compute and visual fidelity.


Ada Lovelace · AD102

RTX 6000 Ada

NVIDIA RTX 6000 Ada Generation

GPU Architecture: NVIDIA Ada Lovelace (AD102)
VRAM: 48 GB GDDR6 ECC
Memory Bandwidth: 960 GB/s
FP32 (CUDA): 91.1 TFLOPS
TF32 Tensor: 182.2 TFLOPS (without sparsity)
Form Factor: PCIe, dual-slot, full-height full-length
Interface: PCIe 4.0 x16
Outputs: 4× DisplayPort 1.4a (workstation use)
Certification: ISV-certified (Autodesk, Siemens, Dassault)
RT Cores: 3rd-generation (real-time ray tracing)

Use case: Professional workstation GPU — certified for Autodesk, SolidWorks, Siemens NX, and Dassault applications. 48GB GDDR6 ECC handles large scenes, complex simulations, and VR workflows that exceed the capacity of consumer-grade graphics cards.


Ampere · GA102

A40

NVIDIA A40 48GB

GPU Architecture: NVIDIA Ampere (GA102)
VRAM: 48 GB GDDR6 ECC
Memory Bandwidth: 696 GB/s
FP32 (CUDA): 37.4 TFLOPS
TF32 Tensor: 149.7 TFLOPS (with sparsity)
Form Factor: PCIe, dual-slot, passive (data centre)
Interface: PCIe 4.0 x16
Outputs: 3× DisplayPort 1.4 (disabled by default in the virtualisation configuration)
Multi-Instance GPU: Not supported (differs from A100)
Use cases: VDI, AI inference, rendering, simulation

Use case: Previous-generation data centre GPU — A40 remains a cost-effective option for VDI (virtual desktop infrastructure), AI inference, and cloud graphics workloads where the GDDR6 memory footprint and lower cost per card are preferable to HBM-based alternatives.
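The PSU requirement noted in the specifications introduction can be checked with a simple power budget. This is a sketch only: the 600W base system load and the 20% headroom are hypothetical values chosen for illustration, and the 350W card figure is the H100 PCIe TDP listed on this page. Real sizing should follow the server vendor's guidance.

```python
def psu_required_watts(card_tdp_w, card_count, base_system_w, headroom_pct=20):
    """Sum GPU and base load, then add percentage headroom."""
    load = card_tdp_w * card_count + base_system_w
    return load * (100 + headroom_pct) / 100


def fits_psu(psu_w, card_tdp_w, card_count, base_system_w, headroom_pct=20):
    """True if the PSU rating covers the load plus headroom."""
    needed = psu_required_watts(card_tdp_w, card_count, base_system_w, headroom_pct)
    return needed <= psu_w


# Two 350W cards in a server with an assumed 600W base load.
print(psu_required_watts(350, 2, 600))  # 1560.0
print(fits_psu(1600, 350, 2, 600))      # True
print(fits_psu(1500, 350, 2, 600))      # False
```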


NVIDIA data centre GPU comparison — H200, H100, L40S, A100, RTX 6000 Ada — memory, FP8 compute, TDP and primary use case
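The comparison can also be expressed as data. The sketch below, using only the memory capacity and bandwidth figures listed on this page, shortlists cards that meet a minimum VRAM requirement and orders them by memory bandwidth. The `Card` class and `shortlist` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    vram_gb: int
    bandwidth_gbs: int  # memory bandwidth in GB/s


# VRAM and bandwidth as listed in the spec cards on this page.
CARDS = [
    Card("H200 NVL PCIe", 141, 4800),
    Card("H100 PCIe", 80, 2000),
    Card("A100 PCIe 80GB", 80, 1935),
    Card("L40S", 48, 864),
    Card("RTX 6000 Ada", 48, 960),
    Card("A40", 48, 696),
]


def shortlist(min_vram_gb):
    """Cards with at least min_vram_gb of memory, fastest memory first."""
    fits = [c for c in CARDS if c.vram_gb >= min_vram_gb]
    return sorted(fits, key=lambda c: c.bandwidth_gbs, reverse=True)


print([c.name for c in shortlist(80)])
# ['H200 NVL PCIe', 'H100 PCIe', 'A100 PCIe 80GB']
```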

PCIe Add-In Card vs SXM System: What is the Difference?

PCIe Add-In Card (this page)

✓ Installs into any standard PCIe x16 server slot
✓ H100 PCIe: 80GB HBM2e · 2 TB/s · 350W
✓ H200 PCIe NVL: 141GB HBM3e · 4.8 TB/s · up to 600W
✓ H100 PCIe has lower memory bandwidth than its SXM variant (2 TB/s vs 3.35 TB/s); H200 matches at 4.8 TB/s
✓ Standard data-centre air-cooled servers
✓ More accessible and easier to deploy in existing infrastructure

SXM Module (dedicated GPU servers)

→ Requires dedicated HGX board (Supermicro, DGX systems)
→ H100 SXM: 80GB HBM3 · 3.35 TB/s · up to 700W
→ H200 SXM: 141GB HBM3e · 4.8 TB/s · up to 700W
→ NVLink 4.0 between all GPUs on HGX board (900 GB/s)
→ Required for 8-GPU training at full bandwidth
→ Higher performance, preferred for LLM training at scale
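To see why the memory bandwidth gap between the two form factors matters for inference, a common rule of thumb treats single-stream token generation as memory-bound: each generated token streams all model weights from GPU memory once. The sketch below applies that assumption to the bandwidth figures above. It is a rough upper bound only, ignoring KV cache traffic, compute limits, and batching, and the 70GB weight size (a 70B model at one byte per parameter) is a hypothetical example.

```python
def max_decode_tokens_per_s(bandwidth_gbs, weight_gb):
    """Upper bound on tokens/s if every token reads all weights once."""
    return bandwidth_gbs / weight_gb


WEIGHT_GB = 70  # assumed: 70B parameters at 1 byte each

# Bandwidth figures in GB/s from the comparison above.
for name, bandwidth in [("H100 PCIe", 2000), ("H100 SXM", 3350), ("H200", 4800)]:
    rate = max_decode_tokens_per_s(bandwidth, WEIGHT_GB)
    print(f"{name}: {rate:.1f} tokens/s upper bound")
```

Under these assumptions the ceiling scales linearly with bandwidth, so the H100 SXM's 3.35 TB/s gives roughly 1.7x the ceiling of the H100 PCIe's 2 TB/s. Measured throughput will be lower and depends heavily on batch size and software stack.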


Need a GPU for AI inference, training, or visualisation?

This section is provided for technical reference and comparison only. Servnet does not currently supply these GPUs.

