NVIDIA Ada Lovelace Architecture · AD102

NVIDIA L40S 48GB
AI Compute + Professional Visualisation

The NVIDIA L40S is the universal data-centre GPU, combining up to 1,466 FP8 TFLOPS of AI compute (with sparsity) with 212 TFLOPS of ray-tracing performance in a single 48GB GDDR6 PCIe card. It is the reference GPU for NVIDIA OVX servers and multi-workload AI infrastructure.

All specifications are taken from the NVIDIA L40S datasheet and are listed for technical reference. Servnet does not currently supply this product.

Universal Data Centre GPU

L40S

48 GB GDDR6 ECC

AI (FP8)	1,466 TFLOPS
RT Cores	212 TFLOPS
FP32 Compute	91.6 TFLOPS

Ada Lovelace AD102 · 4th-gen Tensor Cores · 3rd-gen RT Cores

GDDR6 ECC VRAM	48 GB
Memory bandwidth	864 GB/s
FP8 with sparsity	1,466 TFLOPS
RT Core performance	212 TFLOPS
CUDA Cores	18,176
Host interface	PCIe 4.0 x16

Technical Specifications

NVIDIA L40S 48GB — Full Datasheet Specifications

All figures are taken from the official NVIDIA L40S datasheet. Tensor Core performance figures quoted with sparsity assume the 2:4 structured sparse format. The FP8 Transformer Engine requires a compatible AI framework (for example TensorRT, or PyTorch with the Transformer Engine library).
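To make the 2:4 structured sparse format concrete, here is a minimal sketch of magnitude-based 2:4 pruning: in every group of four consecutive weights, the two smallest-magnitude values are zeroed. The function name `prune_2_of_4` and the sample weights are illustrative assumptions, not part of any NVIDIA library, and production tools apply this pattern to whole weight matrices.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in each group of four.

    Hypothetical helper for illustration only; it shows the 2:4 pattern,
    not how any vendor tool implements pruning.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
sparse_row = prune_2_of_4(row)
print(sparse_row)  # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.4, 0.0]
```

Because exactly half of each group is zero, the hardware can skip those multiplications, which is why the sparse figures on this page are twice the dense ones.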

Architecture

GPU Architecture	NVIDIA Ada Lovelace (AD102)
CUDA Cores	18,176
Tensor Cores	568 (4th-generation)
RT Cores	142 (3rd-generation)
Process Node	TSMC 4nm (NVIDIA 4N)

Memory

VRAM	48 GB GDDR6 with ECC
Memory Bandwidth	864 GB/s
Memory Bus Width	384-bit
ECC	Yes (full ECC)
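The bandwidth and bus-width figures are related by simple arithmetic: bus width in bits times the per-pin data rate, divided by 8 bits per byte. The sketch below assumes an 18 Gbps per-pin GDDR6 data rate. That value is not stated on this page; it is inferred here because it reproduces the quoted 864 GB/s.

```python
BUS_WIDTH_BITS = 384          # from the Memory table
DATA_RATE_GBPS_PER_PIN = 18   # assumption: inferred, not stated on this page


def memory_bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    """Peak memory bandwidth in GB/s: pins x Gbit/s per pin / 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8


peak = memory_bandwidth_gb_s(BUS_WIDTH_BITS, DATA_RATE_GBPS_PER_PIN)
print(f"{peak:.0f} GB/s")  # 864 GB/s
```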

Compute Performance

FP32 (CUDA)	91.6 TFLOPS
RT Core Performance	212 TFLOPS
TF32 Tensor Core	183 TFLOPS (366 TFLOPS with sparsity)
BF16 Tensor Core	362 TFLOPS (733 TFLOPS with sparsity)
FP16 Tensor Core	362 TFLOPS (733 TFLOPS with sparsity)
FP8 Tensor Core	733 TFLOPS (1,466 TFLOPS with sparsity)
INT8 Tensor Core	733 TOPS (1,466 TOPS with sparsity)

Connectivity & Form

Form Factor	PCIe, dual-slot, FHFL (full-height, full-length)
Host Interface	PCIe Gen4 x16 (64 GB/s bidirectional)
Display Outputs	4× DisplayPort 1.4a (for workstation configurations)
Video Encode/Decode	3× NVENC, 3× NVDEC (includes AV1 encode and decode)
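The 64 GB/s host-interface figure can be reproduced from the PCIe 4.0 signalling rate. The sketch below uses the standard PCIe 4.0 parameters (16 GT/s per lane, 128b/130b line encoding). It shows that 64 GB/s is the raw bidirectional rate, and that the payload rate after encoding is slightly lower, before any protocol overhead. The function name is illustrative.

```python
def pcie_bandwidth_gb_s(gt_per_s, lanes, encoding=128 / 130):
    """Per-direction PCIe bandwidth in GB/s for a given line encoding."""
    return gt_per_s * lanes * encoding / 8


# Raw signalling rate (encoding overhead ignored), as quoted in the spec table.
raw_per_direction = pcie_bandwidth_gb_s(16, 16, encoding=1.0)
raw_bidirectional = 2 * raw_per_direction

# Rate after 128b/130b encoding; packet and protocol overhead reduce it further.
usable_per_direction = pcie_bandwidth_gb_s(16, 16)

print(f"raw: {raw_per_direction:.1f} GB/s per direction, {raw_bidirectional:.1f} GB/s bidirectional")
print(f"after encoding: {usable_per_direction:.1f} GB/s per direction")
```

For multi-card servers this matters because the L40S has no NVLink, so all GPU-to-GPU traffic shares this host link.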

Platform Features

Multi-Instance GPU	Not supported (differs from A100/H100)
NVLink	Not supported (PCIe-connected only)
Transformer Engine	Yes (automatic FP8/FP16 precision switching)
NEBS Level 3 Ready	Yes (data centre deployment certified)
Secure Boot	Yes, with root of trust (RoT)
TDP	350W
Power Connector	16-pin
Cooling	Passive (requires system-level airflow)

Source: NVIDIA L40S GPU Datasheet (DS-10477-001). All trademarks are the property of NVIDIA Corporation.

L40S vs A100: Key Architectural Differences

NVIDIA L40S (Ada Lovelace)

✓ GDDR6 memory: lower bandwidth (864 GB/s) but lower cost
✓ FP8 Transformer Engine for LLM inference
✓ RT Cores: hardware ray tracing for rendering
✓ 3× NVENC + 3× NVDEC (video acceleration)
✓ NEBS Level 3 / secure boot
✓ No MIG support: workload isolation via process separation

NVIDIA A100 (Ampere)

→ HBM2e memory: higher bandwidth (1,935 GB/s)
→ No FP8: TF32 and BF16 Tensor Cores
→ No RT Cores: pure compute GPU
→ No display outputs / no NVENC hardware
→ MIG partitioning: up to 7 isolated GPU instances
→ NVLink support for multi-GPU communication

PCIe GPU Comparison

L40S vs A100 vs H100 PCIe

GPU Model	VRAM	Bandwidth	FP8 (with sparsity)	PCIe Gen	TDP
L40S 48GB (this page)	48GB GDDR6	864 GB/s	1,466 TFLOPS	4.0	350W
A100 PCIe 80GB	80GB HBM2e	1,935 GB/s	N/A	4.0	300W
H100 PCIe 80GB	80GB HBM2e	2,000 GB/s	3,026 TFLOPS	5.0	350W

Source: NVIDIA L40S datasheet (DS-10477-001), A100 (DS-10010-001), H100 PCIe (DS-10167-001).

Applications

L40S Key Workloads

🧠

Generative AI Inference

Up to 1,466 FP8 TFLOPS (with sparsity) and the Transformer Engine power LLM inference (Llama 2 7B–13B), image generation (Stable Diffusion XL), and video AI workloads at data-centre scale. NVIDIA quotes up to 5× the LLM inference throughput of the previous-generation A40.
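For single-stream LLM decoding, memory bandwidth is often the limit rather than Tensor Core throughput, because every generated token reads all model weights once. The sketch below applies that rule of thumb to the 864 GB/s figure. It is an upper bound under stated assumptions (batch size 1, nominal 7B and 13B parameter counts, KV-cache and activation traffic ignored), not a benchmark result.

```python
L40S_BANDWIDTH_GB_S = 864.0  # from the spec table


def decode_tokens_per_s_bound(params_billion, bytes_per_param,
                              bandwidth_gb_s=L40S_BANDWIDTH_GB_S):
    """Upper bound on batch-1 decode rate: bandwidth / bytes of weights per token.

    Assumes each token reads every weight exactly once and ignores
    KV-cache reads, activations, and kernel overheads.
    """
    weight_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / weight_gb


for name, params in (("7B", 7.0), ("13B", 13.0)):
    for fmt, nbytes in (("FP16", 2), ("FP8", 1)):
        bound = decode_tokens_per_s_bound(params, nbytes)
        print(f"{name} {fmt}: at most ~{bound:.0f} tokens/s per stream")
```

Under these assumptions a 7B model in FP8 tops out at roughly 123 tokens/s per stream, which is why FP8 weights and request batching both matter on a GDDR6 card.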

🖥️

Professional Visualisation & VDI

48GB GDDR6 and 3rd-generation RT Cores (212 TFLOPS ray tracing) handle real-time photorealistic rendering for CAD, BIM, architecture, and virtual production workflows. ISV-certified for Autodesk, Siemens, and Dassault applications.

🏭

NVIDIA OVX Infrastructure

The L40S is the reference GPU for NVIDIA OVX server configurations — designed for industrial AI, digital twins, and Omniverse workloads that combine physics simulation, path tracing, and AI inference on the same hardware.

🎬

Video Transcoding & Streaming

Three hardware NVENC and three NVDEC engines (including AV1) deliver enterprise-grade video encoding throughput — suitable for cloud gaming, media processing, video surveillance analytics, and broadcast encoding at scale.

⚡

LLM Training (Mid-Scale)

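Fourth-generation Tensor Cores with FP8 Transformer Engine enable training and fine-tuning of models up to 13B parameters on a single L40S where the weights, gradients, and optimizer state fit within 48GB (for example with parameter-efficient fine-tuning), or larger models in multi-card configurations connected over PCIe, since the L40S has no NVLink.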
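A rough memory budget helps when sizing training jobs against the 48GB of VRAM. The sketch below uses a common rule of thumb, assumed here rather than taken from this page: about 16 bytes per parameter for mixed-precision training with Adam (2 for FP16 weights, 2 for gradients, 12 for FP32 master weights and the two optimizer moments). Activations and framework overhead are excluded, so real requirements are higher.

```python
import math

VRAM_GB = 48                    # from the spec table
BYTES_PER_PARAM_ADAM_MIXED = 16 # assumption: rule of thumb, excludes activations


def training_state_gb(params_billion, bytes_per_param=BYTES_PER_PARAM_ADAM_MIXED):
    """Memory in GB for weights, gradients, and optimizer state."""
    return params_billion * bytes_per_param


def cards_needed(params_billion, vram_gb=VRAM_GB):
    """Minimum cards needed to hold the training state if it is sharded evenly."""
    return math.ceil(training_state_gb(params_billion) / vram_gb)


print(training_state_gb(13))                 # 208 (GB) for full training of a 13B model
print(cards_needed(13))                      # 5 cards, before activations
print(VRAM_GB / BYTES_PER_PARAM_ADAM_MIXED)  # 3.0 (billion parameters per card)
```

Under these assumptions full-parameter Adam training of a 13B model does not fit on one card, so single-card work at that size most plausibly means parameter-efficient fine-tuning or reduced-precision optimizer state.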

🔒

NEBS-Certified Deployment

L40S is NEBS (Network Equipment-Building System) Level 3 Ready — the standard for telecommunications and edge data centre deployment. Secure boot with root of trust enables security-hardened AI compute at the network edge.

Need NVIDIA L40S for AI inference or visualisation?

Servnet supplies L40S for UK enterprise deployments — AI inference, VDI, NVIDIA OVX, and media production. We advise on server compatibility (PCIe 4.0 x16, passive cooling, 350W PSU) and provide current UK lead time and availability.
