NVIDIA L40S 48GB — Full Datasheet Specifications
All figures are from the official NVIDIA L40S datasheet. Tensor Core performance figures quoted with sparsity assume the 2:4 structured sparsity format. The FP8 Transformer Engine requires a compatible AI framework (TensorRT, or PyTorch with the Transformer Engine library).
Source: NVIDIA L40S GPU Datasheet (DS-10477-001). All trademarks are the property of NVIDIA Corporation.
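To make the 2:4 structured sparsity format concrete, here is a minimal Python sketch: in every group of four consecutive weights, at most two are non-zero. The helper name `prune_2_of_4` and the keep-the-two-largest-magnitudes rule are illustrative assumptions, not something the datasheet specifies; production pruning is normally handled by framework tooling.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in each group of four.

    Illustrative only: magnitude-based selection is an assumption here,
    the 2:4 format itself only requires two zeros per group of four.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.0]
sparse_row = prune_2_of_4(row)
print(sparse_row)
```

Half of the values in each group are zero after pruning, which is the property the "with sparsity" Tensor Core figures depend on.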
L40S vs A100: Key Architectural Differences
L40S
- ✓ GDDR6 memory: lower bandwidth (864 GB/s) but lower cost
- ✓ FP8 Transformer Engine for LLM inference
- ✓ RT Cores: hardware ray tracing for rendering
- ✓ 3× NVENC + 3× NVDEC (video acceleration)
- ✓ NEBS Level 3 / secure boot
- ✓ No MIG support: workload isolation relies on process separation
A100
- → HBM2e memory: higher bandwidth (1,935 GB/s)
- → No FP8: TF32 and BF16 Tensor Cores
- → No RT Cores: pure compute GPU
- → No display outputs / no NVENC hardware
- → MIG partitioning: up to 7 isolated GPU instances
- → NVLink support for multi-GPU communication
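The bandwidth gap and the FP8 difference in the lists above pull in opposite directions for LLM decoding. A rough way to see this, sketched below, is the common rule of thumb that single-stream token generation is memory-bound, so tokens per second cannot exceed memory bandwidth divided by the bytes of weights read per token. The 13B model size and the function name are illustrative assumptions; the estimate ignores KV cache traffic, compute limits, and batching, so it is a ceiling rather than a benchmark.

```python
# Memory bandwidth figures from the comparison list (GB/s).
L40S_BANDWIDTH_GB_S = 864
A100_BANDWIDTH_GB_S = 1935


def decode_tokens_per_second_ceiling(bandwidth_gb_s, params_billion, bytes_per_param):
    """Upper bound on single-stream decode speed for a memory-bound model.

    Assumes every weight is read once per generated token and nothing else
    competes for bandwidth (a simplification).
    """
    weight_gb = params_billion * bytes_per_param  # 1e9 params * bytes / 1e9
    return bandwidth_gb_s / weight_gb


# Hypothetical 13B-parameter model: FP8 weights on L40S, FP16 on A100 (no FP8).
l40s_ceiling = decode_tokens_per_second_ceiling(L40S_BANDWIDTH_GB_S, 13, 1)
a100_ceiling = decode_tokens_per_second_ceiling(A100_BANDWIDTH_GB_S, 13, 2)
print(f"L40S FP8 ceiling:  {l40s_ceiling:.1f} tokens/s")
print(f"A100 FP16 ceiling: {a100_ceiling:.1f} tokens/s")
```

Under these assumptions, halving the bytes per weight with FP8 recovers much of the bandwidth deficit, although the A100 still has the higher ceiling.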
L40S vs A100 vs H100 PCIe
Source: NVIDIA L40S datasheet (DS-10477-001), A100 (DS-10010-001), H100 PCIe (DS-10167-001).
L40S Key Workloads
Generative AI Inference
With the Transformer Engine, up to 1,466 TFLOPS of FP8 Tensor Core performance (with sparsity) powers LLM inference (Llama 2 7B–13B), image generation (Stable Diffusion XL), and video AI workloads at data-centre scale. Up to 5× LLM inference throughput versus the previous-generation A40.
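A quick sizing check shows why 7B to 13B models suit a 48GB card. The sketch below counts weight storage only, using 1 GB = 10^9 bytes; the helper name and the choice of FP16 and FP8 formats are illustrative assumptions, and real deployments also need room for the KV cache, activations, and runtime overhead.

```python
L40S_MEMORY_GB = 48  # from the product name and datasheet heading

BYTES_PER_PARAM = {"FP16": 2, "FP8": 1}


def weight_headroom_gb(params_billion, precision, memory_gb=L40S_MEMORY_GB):
    """Memory left after loading the weights alone (weights-only estimate)."""
    weights_gb = params_billion * BYTES_PER_PARAM[precision]
    return memory_gb - weights_gb


headroom = {
    (size, precision): weight_headroom_gb(size, precision)
    for size in (7, 13)
    for precision in ("FP16", "FP8")
}
for (size, precision), free_gb in headroom.items():
    print(f"{size}B @ {precision}: {free_gb} GB left for KV cache and activations")
```

The remaining headroom is what determines how many concurrent sequences and how much context length a single card can serve.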
Professional Visualisation & VDI
48GB GDDR6 and 3rd-generation RT Cores (212 TFLOPS ray tracing) handle real-time photorealistic rendering for CAD, BIM, architecture, and virtual production workflows. ISV-certified for Autodesk, Siemens, and Dassault applications.
NVIDIA OVX Infrastructure
The L40S is the reference GPU for NVIDIA OVX server configurations — designed for industrial AI, digital twins, and Omniverse workloads that combine physics simulation, path tracing, and AI inference on the same hardware.
Video Transcoding & Streaming
Three hardware NVENC and three NVDEC engines (including AV1) deliver enterprise-grade video encoding throughput — suitable for cloud gaming, media processing, video surveillance analytics, and broadcast encoding at scale.
LLM Training (Mid-Scale)
Fourth-generation Tensor Cores with FP8 Transformer Engine enable training of models up to 13B parameters on a single L40S, or larger models in multi-card configurations connected via PCIe fabric.
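For multi-card setups, the PCIe link is the path between GPUs, so its theoretical ceiling is worth working out. The sketch below uses the PCIe 4.0 x16 interface this page lists for server compatibility, together with the PCIe 4.0 signalling rate (16 GT/s per lane) and 128b/130b encoding. The function names and the 26 GB payload (a hypothetical 13B-parameter model in FP16) are illustrative assumptions, and achieved throughput is lower than this theoretical figure.

```python
def pcie_bandwidth_gb_s(gt_per_s=16.0, lanes=16, encoding_efficiency=128 / 130):
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    # Each transfer carries one bit per lane; divide by 8 for bytes.
    return gt_per_s * lanes * encoding_efficiency / 8


def transfer_seconds(payload_gb, bandwidth_gb_s):
    """Idealised time to move a payload across the link once."""
    return payload_gb / bandwidth_gb_s


link_gb_s = pcie_bandwidth_gb_s()
# Hypothetical payload: 13B parameters at 2 bytes each = 26 GB.
seconds = transfer_seconds(26, link_gb_s)
print(f"PCIe 4.0 x16: {link_gb_s:.1f} GB/s per direction")
print(f"Moving 26 GB once: {seconds:.2f} s at best")
```

Comparing this figure with the 864 GB/s of on-card memory bandwidth gives a sense of why inter-GPU traffic should be kept small in PCIe-connected configurations.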
NEBS-Certified Deployment
L40S is NEBS (Network Equipment-Building System) Level 3 Ready — the standard for telecommunications and edge data centre deployment. Secure boot with root of trust enables security-hardened AI compute at the network edge.
Need NVIDIA L40S for AI inference or visualisation?
Servnet supplies L40S for UK enterprise deployments: AI inference, VDI, NVIDIA OVX, and media production. We advise on server compatibility (PCIe 4.0 x16 slot, passive cooling, 350W power budget per card) and provide current UK lead times and availability.
