UK’s trusted IT infrastructure partner since 2003
sales@servnetuk.com
0800 987 4111
Servnet
NVIDIA Ada Lovelace Architecture · AD102

NVIDIA L40S 48GB
AI Compute + Professional Visualisation

The NVIDIA L40S is positioned as a universal data-centre GPU, combining up to 1,466 TFLOPS of FP8 Tensor Core throughput (with sparsity) with 212 TFLOPS of RT Core performance in a single 48GB GDDR6 PCIe card. It is the reference GPU for NVIDIA OVX servers and multi-workload AI infrastructure.

All specifications are taken from the NVIDIA L40S datasheet. No pricing is displayed; contact Servnet for UK availability.

NVIDIA L40S, Universal Data Centre GPU
Ada Lovelace AD102 · 4th-gen Tensor Cores · 3rd-gen RT Cores
  • Memory: 48 GB GDDR6 with ECC
  • Memory bandwidth: 864 GB/s
  • AI (FP8, with sparsity): 1,466 TFLOPS
  • RT Core performance: 212 TFLOPS
  • FP32 compute: 91.6 TFLOPS
  • CUDA Cores: 18,176
  • Host interface: PCIe 4.0 x16
Technical Specifications

NVIDIA L40S 48GB — Full Datasheet Specifications

All figures are from the official NVIDIA L40S datasheet. Tensor Core figures quoted with sparsity assume the 2:4 structured sparse format, in which two of every four consecutive weights are zero. The FP8 Transformer Engine requires a compatible AI framework (for example TensorRT, or PyTorch with the Transformer Engine library).
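To make the 2:4 structured sparse format concrete, here is a minimal sketch in plain Python. It assumes simple magnitude-based pruning, which is one common way to reach the pattern and not necessarily what any particular framework does; the function name `prune_2_of_4` and the sample weights are hypothetical.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in every group of four.

    Illustrative only: real toolchains prune tensors and usually fine-tune
    afterwards to recover accuracy.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest magnitudes in this group of four.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]  # hypothetical weights
sparse_row = prune_2_of_4(row)
print(sparse_row)  # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.4, 0.0]
```

Every group of four ends up with exactly two non-zero values, which is the layout the "with sparsity" figures assume.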

Architecture
GPU Architecture: NVIDIA Ada Lovelace (AD102)
CUDA Cores: 18,176
Tensor Cores: 568 (4th-generation)
RT Cores: 142 (3rd-generation)
Process Node: TSMC 4nm (NVIDIA 4N)
Memory
VRAM: 48 GB GDDR6 with ECC
Memory Bandwidth: 864 GB/s
Memory Bus Width: 384-bit
ECC: Yes (full ECC)
Compute Performance
FP32 (CUDA): 91.6 TFLOPS
RT Core Performance: 212 TFLOPS
TF32 Tensor Core: 183 TFLOPS (366 TFLOPS with sparsity)
BF16 Tensor Core: 362 TFLOPS (733 TFLOPS with sparsity)
FP16 Tensor Core: 362 TFLOPS (733 TFLOPS with sparsity)
FP8 Tensor Core: 733 TFLOPS (1,466 TFLOPS with sparsity)
INT8 Tensor Core: 733 TOPS (1,466 TOPS with sparsity)
Connectivity & Form
Form Factor: PCIe, dual-slot, FHFL (full-height, full-length)
Host Interface: PCIe Gen4 x16, 64 GB/s bidirectional
Display Outputs: 4× DisplayPort 1.4a (for workstation configurations)
Video Encode/Decode: 3× NVENC | 3× NVDEC (includes AV1 encode and decode)
Platform Features
Multi-Instance GPU (MIG): Not supported (differs from A100/H100)
NVLink: Not supported (PCIe-connected only)
Transformer Engine: Yes (automatic FP8/FP16 precision switching)
NEBS Level 3 Ready: Yes (designed for data centre deployment)
Secure Boot: Yes, with root of trust (RoT)
TDP: 350W
Power Connector: 16-pin
Cooling: Passive (requires system-level airflow)

Source: NVIDIA L40S GPU Datasheet (DS-10477-001). All trademarks are the property of NVIDIA Corporation.
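Two of the figures in the specification table can be sanity-checked with simple arithmetic. The sketch below derives the per-pin memory data rate implied by the quoted bandwidth and bus width, and the usable PCIe Gen4 x16 bandwidth. The PCIe constants (16 GT/s per lane, 128b/130b encoding) come from the PCIe specification, not from this page, and the variable names are illustrative.

```python
# Figures copied from the specification table.
MEMORY_BANDWIDTH_GBS = 864   # GB/s
BUS_WIDTH_BITS = 384         # memory bus width

# Implied per-pin data rate: total bits per second divided by the number of pins.
data_rate_gbps = MEMORY_BANDWIDTH_GBS * 8 / BUS_WIDTH_BITS
print(f"Implied GDDR6 data rate: {data_rate_gbps:.0f} Gbps per pin")

# PCIe Gen4 (assumption from the PCIe spec): 16 GT/s per lane, 128b/130b encoding.
LANES = 16
GT_PER_S = 16
per_direction_gbs = GT_PER_S * LANES * (128 / 130) / 8
print(f"PCIe Gen4 x16: ~{per_direction_gbs:.1f} GB/s per direction, "
      f"~{2 * per_direction_gbs:.0f} GB/s bidirectional after encoding overhead")
```

The headline 64 GB/s bidirectional figure is the raw rate (32 GB/s each way); the usable rate after encoding overhead is slightly lower.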

L40S vs A100: Key Architectural Differences

NVIDIA L40S (Ada Lovelace)
  • GDDR6 memory — lower bandwidth (864 GB/s) but lower cost
  • FP8 Transformer Engine for LLM inference
  • RT Cores — hardware ray tracing for rendering
  • 3× NVENC + 3× NVDEC (video acceleration)
  • NEBS Level 3 / secure boot
  • No MIG support — workload isolation via process separation
NVIDIA A100 (Ampere)
  • HBM2e memory — higher bandwidth (1,935 GB/s)
  • No FP8 — TF32 and BF16 Tensor Cores
  • No RT Cores — pure compute GPU
  • No display outputs / no NVENC hardware
  • MIG partitioning — up to 7 isolated GPU instances
  • NVLink support for multi-GPU communication
PCIe GPU Comparison

L40S vs A100 vs H100 PCIe

GPU Model | VRAM | Bandwidth | FP8 (with sparsity) | PCIe Gen | TDP
L40S 48GB (this page) | 48GB GDDR6 | 864 GB/s | 1,466 TFLOPS | 4.0 | 350W
A100 PCIe 80GB | 80GB HBM2e | 1,935 GB/s | N/A | 4.0 | 300W
H100 PCIe 80GB | 80GB HBM2e | 2,000 GB/s | 3,026 TFLOPS | 5.0 | 350W

Source: NVIDIA L40S datasheet (DS-10477-001), A100 (DS-10010-001), H100 PCIe (DS-10167-001).
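Memory bandwidth matters most for single-stream LLM decoding, where each generated token requires reading roughly all model weights once. The sketch below turns the bandwidth column into a rough throughput ceiling. It is a simplification under stated assumptions (a 13B-parameter model with 2-byte weights, no batching, no KV-cache traffic, no compute limits), not a benchmark, and the function name is hypothetical.

```python
# Memory bandwidth in GB/s, taken from the comparison table.
GPU_BANDWIDTH_GBS = {
    "L40S 48GB": 864,
    "A100 PCIe 80GB": 1935,
    "H100 PCIe 80GB": 2000,
}


def decode_ceiling_tokens_per_s(bandwidth_gbs, params_billion, bytes_per_param):
    """Upper bound on tokens/s if every weight is read once per token."""
    model_gb = params_billion * bytes_per_param
    return bandwidth_gbs / model_gb


for name, bandwidth in GPU_BANDWIDTH_GBS.items():
    ceiling = decode_ceiling_tokens_per_s(bandwidth, params_billion=13, bytes_per_param=2)
    print(f"{name}: ~{ceiling:.0f} tokens/s ceiling (13B model, FP16 weights)")
```

Batched serving reuses each weight read across many requests, so real aggregate throughput can be far higher than this single-stream ceiling.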

Applications

L40S Key Workloads

🧠

Generative AI Inference

Up to 1,466 TFLOPS of FP8 throughput (with sparsity), together with the Transformer Engine, powers LLM inference (Llama 2 7B–13B), image generation (Stable Diffusion XL), and video AI workloads at data-centre scale. Up to 5× LLM inference throughput versus the previous-generation A40.
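A quick way to see why the 7B to 13B range suits a 48GB card is to estimate the weight footprint at each precision. This is a minimal sketch that counts weights only; KV cache, activations, and framework overhead are deliberately ignored, so real headroom is smaller. The function name is illustrative.

```python
VRAM_GB = 48  # L40S memory capacity quoted on this page


def weight_footprint_gb(params_billion, bytes_per_param):
    """Weights only: billions of parameters times bytes per parameter gives GB."""
    return params_billion * bytes_per_param


for params in (7, 13):
    for label, nbytes in (("FP16", 2), ("FP8", 1)):
        footprint = weight_footprint_gb(params, nbytes)
        headroom = VRAM_GB - footprint
        print(f"{params}B @ {label}: {footprint} GB weights, {headroom} GB left for KV cache etc.")
```

A 13B model at FP16 needs about 26 GB for weights, which leaves roughly 22 GB on the card for everything else.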

🖥️

Professional Visualisation & VDI

48GB GDDR6 and 3rd-generation RT Cores (212 TFLOPS ray tracing) handle real-time photorealistic rendering for CAD, BIM, architecture, and virtual production workflows. ISV-certified for Autodesk, Siemens, and Dassault applications.

🏭

NVIDIA OVX Infrastructure

The L40S is the reference GPU for NVIDIA OVX server configurations — designed for industrial AI, digital twins, and Omniverse workloads that combine physics simulation, path tracing, and AI inference on the same hardware.

🎬

Video Transcoding & Streaming

Three hardware NVENC and three NVDEC engines (including AV1) deliver enterprise-grade video encoding throughput — suitable for cloud gaming, media processing, video surveillance analytics, and broadcast encoding at scale.
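As an illustration of how the NVDEC and NVENC engines are typically driven, the sketch below builds (but does not run) an FFmpeg command line for a GPU-side AV1 transcode. It assumes an FFmpeg build with NVIDIA hardware acceleration enabled and a version that includes the `av1_nvenc` encoder; the file names, bitrate, and helper name are placeholders.

```python
import shlex


def nvenc_av1_command(src, dst, bitrate="6M"):
    """Build an ffmpeg argument list that decodes and encodes on the GPU."""
    return [
        "ffmpeg",
        "-hwaccel", "cuda",                # decode with the hardware decoder
        "-hwaccel_output_format", "cuda",  # keep decoded frames in GPU memory
        "-i", src,
        "-c:v", "av1_nvenc",               # AV1 encode on the hardware encoder
        "-b:v", bitrate,
        "-c:a", "copy",                    # pass audio through untouched
        dst,
    ]


cmd = nvenc_av1_command("input.mp4", "output.mkv")  # hypothetical file names
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it on a suitably configured host.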

LLM Training (Mid-Scale)

Fourth-generation Tensor Cores with the FP8 Transformer Engine support fine-tuning of models up to around 13B parameters on a single L40S when memory-saving techniques are used. Larger models, or full training at that scale, require multi-card configurations connected via PCIe fabric.
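A rough memory estimate helps when sizing a training configuration. The sketch below uses the common rule of thumb of about 16 bytes per parameter for mixed-precision training with the Adam optimiser; that figure is a general assumption, not an L40S specification, and activations are ignored. Function names are illustrative.

```python
import math

VRAM_GB = 48  # per-card memory quoted on this page

# Rule-of-thumb bytes per parameter for mixed-precision Adam (assumption):
# FP16 weights (2) + FP16 gradients (2) + FP32 master weights, momentum, variance (4 each).
BYTES_PER_PARAM_FULL = 2 + 2 + 4 + 4 + 4


def training_state_gb(params_billion, bytes_per_param=BYTES_PER_PARAM_FULL):
    """Weights, gradients, and optimiser state in GB (activations excluded)."""
    return params_billion * bytes_per_param


def cards_needed(params_billion, vram_gb=VRAM_GB):
    """Minimum number of cards whose combined memory holds the training state."""
    return math.ceil(training_state_gb(params_billion) / vram_gb)


for params in (1, 7, 13):
    print(f"{params}B: ~{training_state_gb(params)} GB of state, "
          f"at least {cards_needed(params)} x 48 GB card(s)")
```

Under this rule of thumb a 13B model carries roughly 208 GB of training state, so fitting it on one 48GB card depends on techniques such as parameter-efficient fine-tuning or quantisation.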

🔒

NEBS Level 3 Ready Deployment

L40S is NEBS (Network Equipment-Building System) Level 3 Ready — the standard for telecommunications and edge data centre deployment. Secure boot with root of trust enables security-hardened AI compute at the network edge.


Need NVIDIA L40S for AI inference or visualisation?

Servnet supplies the L40S for UK enterprise deployments, including AI inference, VDI, NVIDIA OVX, and media production. We advise on server compatibility (a PCIe 4.0 x16 slot, chassis airflow for the passively cooled card, and a 350W power budget per card) and provide current UK lead times and availability.
