Two clusters of jargon dominate any AI data-centre conversation: how the heat gets out, and how the network stays lossless. Air, direct liquid and immersion cooling sit on one axis; RoCE and RDMA on the other. Buyers nod along and then quietly wonder what they have agreed to. This explainer decodes the cooling options and the lossless-networking terms in plain English, so you can tell which your AI build actually needs and which is someone else's problem.
Air, direct liquid and immersion: three ways to remove heat
Air cooling moves heat with fans and chilled air, and it remains perfectly adequate for most general-purpose servers and lighter GPU configurations. It is simple, well understood and needs no special facility plumbing. Its limit is density: beyond a certain power per rack, air can no longer carry the heat away fast enough, which is exactly the wall dense GPU servers hit.
Direct liquid cooling brings coolant to the hottest components through cold plates, carrying far more heat than air and enabling the high power densities that large GPU servers demand. Immersion cooling goes further still, submerging whole servers in a dielectric (electrically non-conductive) fluid so every component is cooled at once. It reaches the highest densities but requires the most specialised facility design. As density rises, the progression runs from air to direct liquid to immersion. Our AI server cooling guide covers the technology choice in depth.
Which cooling does your build need?
The honest answer for most buyers is less exotic than the marketing suggests. A handful of inference GPUs in a mainstream server are usually fine on air. Direct liquid becomes relevant when you deploy dense GPU servers whose power per rack exceeds what air can handle, which is the typical trigger for training-class hardware. Immersion is a deliberate, facility-level decision for the very highest densities, not something most organisations reach for first.
The trap is buying cooling complexity you do not need, or specifying dense hardware without checking the room can cool it. Match the cooling to the power density you are actually deploying, and treat immersion as a considered facility strategy rather than a default. The cooling method and the hardware density decision belong together.
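As a rough illustration of matching cooling to density, the sketch below multiplies servers per rack by power per server and compares the result with two cut-offs. The thresholds `AIR_LIMIT_KW` and `DIRECT_LIQUID_LIMIT_KW`, and the names `RackPlan` and `suggest_cooling`, are hypothetical placeholders rather than figures from any vendor or standard; substitute the limits your facility and hardware supplier actually quote.

```python
from dataclasses import dataclass

# Hypothetical thresholds in kW per rack. Illustrative placeholders only:
# real limits depend on the room, the airflow design and the vendor's figures.
AIR_LIMIT_KW = 20.0
DIRECT_LIQUID_LIMIT_KW = 100.0


@dataclass
class RackPlan:
    servers_per_rack: int
    kw_per_server: float

    @property
    def rack_kw(self) -> float:
        # Power density per rack is what decides the cooling method.
        return self.servers_per_rack * self.kw_per_server


def suggest_cooling(plan: RackPlan) -> str:
    """Return a first-pass cooling suggestion for the planned rack density."""
    if plan.rack_kw <= AIR_LIMIT_KW:
        return "air"
    if plan.rack_kw <= DIRECT_LIQUID_LIMIT_KW:
        return "direct liquid"
    # Above both assumed limits, immersion becomes a facility-level discussion.
    return "immersion"


if __name__ == "__main__":
    for plan in (RackPlan(4, 2.0), RackPlan(8, 10.0), RackPlan(16, 10.0)):
        print(f"{plan.rack_kw:.0f} kW per rack: {suggest_cooling(plan)}")
```

The point of the sketch is the order of the questions, not the numbers: work out the density first, then pick the simplest cooling method that can handle it.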
RoCE and RDMA: keeping the AI network lossless
On the networking side, the goal for AI training is to move data between GPUs with very low latency and no loss, because the GPUs spend much of their time exchanging results and every stall leaves expensive hardware idle. RDMA (remote direct memory access) lets one machine read and write another's memory directly: the network adapter moves the data, bypassing the operating system and the CPU overhead of ordinary TCP/IP networking. That is what keeps inter-GPU communication fast.
RoCE (RDMA over Converged Ethernet) runs RDMA across standard Ethernet, which lets organisations get low-latency, lossless behaviour on the Ethernet fabric they already understand rather than on a separate specialised network. The catch is that Ethernet is not lossless by default. It has to be configured that way, typically with priority flow control on the RDMA traffic class and congestion control based on explicit congestion notification (ECN), or the loss it is meant to avoid creeps back in. Our network card guidance covers the adapters that support it.
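To make "configured correctly" a little more concrete, here is a minimal sketch of a sanity check over a port description. It assumes the common RoCE arrangement of priority flow control (PFC) on the RDMA traffic priority plus ECN marking for congestion control. `FabricPortConfig`, its field names and `lossless_issues` are all hypothetical and do not correspond to any vendor's CLI or API.

```python
from dataclasses import dataclass


@dataclass
class FabricPortConfig:
    # Hypothetical fields for illustration; not a real switch or NIC schema.
    name: str
    pfc_enabled_priorities: frozenset  # priorities with priority flow control on
    rdma_priority: int                 # priority the RDMA traffic is assigned to
    ecn_enabled: bool                  # whether ECN marking is switched on


def lossless_issues(port: FabricPortConfig) -> list[str]:
    """List the reasons this port may drop or stall RDMA traffic."""
    issues = []
    if port.rdma_priority not in port.pfc_enabled_priorities:
        issues.append(
            f"{port.name}: RDMA priority {port.rdma_priority} has no flow control"
        )
    if not port.ecn_enabled:
        issues.append(f"{port.name}: ECN is off, so senders get no congestion signal")
    return issues


if __name__ == "__main__":
    ports = [
        FabricPortConfig("leaf1/1", frozenset({3}), 3, True),
        FabricPortConfig("leaf1/2", frozenset(), 3, False),
    ]
    for port in ports:
        for issue in lossless_issues(port):
            print(issue)
```

A real deployment has more to check than this, but the shape is the same: every hop the RDMA traffic crosses needs the same lossless settings, and one misconfigured port is enough to reintroduce loss.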
- Air cooling: simple, adequate for general servers and light GPU loads
- Direct liquid: cold plates on hot components for dense GPU servers
- Immersion: whole servers in fluid for the highest densities, a facility-level decision
- RDMA: direct memory-to-memory transfer, bypassing CPU and OS overhead
- RoCE: RDMA over standard Ethernet, lossless if configured correctly
Putting the terms to work
Decoded, these terms let you read an AI proposal critically. If someone specifies immersion cooling for a few inference GPUs, question it; if they specify dense training hardware on plain air, question that too. If a training cluster is described without any mention of RDMA or RoCE, ask how the GPUs communicate. Match cooling to density and fabric to the communication pattern, and use our on-prem AI cluster guide to size both correctly.
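The three checks above can be written down as a short review helper. This is a sketch only: the `Proposal` fields, the `questions_to_ask` function and the `SMALL_GPU_COUNT` cut-off for "a few" GPUs are assumptions made for illustration, not part of any standard procurement checklist.

```python
from dataclasses import dataclass

SMALL_GPU_COUNT = 8  # hypothetical cut-off for "a few" GPUs


@dataclass
class Proposal:
    workload: str         # "inference" or "training"
    gpu_count: int
    cooling: str          # "air", "direct liquid" or "immersion"
    dense_hardware: bool  # True for dense, training-class GPU servers
    mentions_rdma: bool   # True if the proposal names RDMA or RoCE


def questions_to_ask(p: Proposal) -> list[str]:
    """Return the questions a buyer should raise about this proposal."""
    questions = []
    if (
        p.cooling == "immersion"
        and p.workload == "inference"
        and p.gpu_count <= SMALL_GPU_COUNT
    ):
        questions.append("Why immersion cooling for a few inference GPUs?")
    if p.dense_hardware and p.cooling == "air":
        questions.append("Can the room cool dense GPU servers on air alone?")
    if p.workload == "training" and not p.mentions_rdma:
        questions.append("How do the GPUs communicate without RDMA or RoCE?")
    return questions


if __name__ == "__main__":
    proposal = Proposal("training", 64, "air", True, False)
    for question in questions_to_ask(proposal):
        print(question)
```

An empty list does not mean the proposal is right, only that it avoids the three mismatches described here.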