AI Infrastructure · Trends

Ultra Ethernet vs UALink vs InfiniBand: the AI interconnect standards to watch

Servnet Editorial · Server Infrastructure Practice20 January 202613 min read

When you build an AI cluster, the network between the GPUs matters almost as much as the GPUs themselves - a fast accelerator starved of interconnect bandwidth sits idle waiting for its peers. For years that fabric meant InfiniBand, largely from one vendor. Now two open standards are mounting a serious challenge: Ultra Ethernet, which re-engineers Ethernet for AI, and UALink, which targets tight accelerator-to-accelerator links. For UK teams planning AI infrastructure, this is a standards-watch moment - the choices are no longer just InfiniBand or nothing, and the fabric you pick shapes scale, cost and vendor lock-in for years. Here is how the three compare.

UEC / UALink fabric vs InfiniBand

Why the interconnect decides AI cluster performance

Training and large-scale inference spread a single job across many GPUs that must constantly exchange gradients and activations. The collective operations behind that - all-reduce, all-to-all - are intensely network-bound, and any latency or congestion in the fabric directly idles expensive accelerators. As clusters grow from a handful of nodes to hundreds, the interconnect becomes the dominant factor in how much useful work the GPUs actually do, which is why fabric choice is a first-order decision, not an afterthought.

Two different jobs are involved, and the new standards split along that line. There is scale-up - linking accelerators tightly within and across a few nodes so they behave almost like one big device - and scale-out - connecting many nodes into a large cluster. Historically InfiniBand stretched across both; the new standards are more specialised, and understanding which job each targets is the key to reading the landscape. The fundamentals of building such a cluster are in building your first UK on-prem AI cluster.

Ultra Ethernet: re-engineering Ethernet for AI scale-out

Ultra Ethernet, developed by the Ultra Ethernet Consortium, takes the ubiquitous, multi-vendor Ethernet ecosystem and re-works it specifically for AI and HPC scale-out. Standard Ethernet was not designed for the bursty, loss-sensitive collective traffic of large training jobs; Ultra Ethernet adds modern congestion control, more efficient packet handling and better support for the collective patterns AI workloads generate, aiming to match the performance characteristics that made InfiniBand attractive while keeping Ethernet's open, competitive supplier base.

The strategic appeal for buyers is the ecosystem. Ethernet has many switch and NIC vendors, deep operational familiarity in most organisations, and pricing shaped by competition rather than a single supplier. If Ultra Ethernet delivers on AI-scale performance, it offers a path to large clusters without committing to one vendor's fabric - which is exactly why so much of the industry has lined up behind it. Choose AI-capable NICs with our network cards guidance.

•Ultra Ethernet: open multi-vendor scale-out fabric, AI-tuned Ethernet - broad ecosystem, competitive pricing
•UALink: open scale-up accelerator-to-accelerator interconnect - tight, low-latency GPU linking
•InfiniBand: mature, high-performance, largely single-vendor - proven but more locked-in

UALink: open scale-up for accelerator-to-accelerator

UALink addresses the other half of the problem. Backed by a broad industry consortium, it is an open standard for the tight, low-latency, high-bandwidth links between accelerators - the scale-up domain where GPUs need to share memory and results almost as if they were a single device. This is the space proprietary accelerator interconnects have owned, and UALink's pitch is to provide an open, multi-vendor alternative so customers are not locked into one accelerator vendor's fabric for the most performance-critical links.

The two standards are therefore complementary rather than competing: UALink for scale-up within the accelerator pod, Ultra Ethernet for scale-out across the cluster. A future AI build could plausibly use both - UALink binding the GPUs tightly together and Ultra Ethernet stitching the pods into a large fabric - which is a very different picture from one vendor's InfiniBand spanning everything. Plan the accelerators alongside the fabric with our GPU and accelerators guidance.

Ultra Ethernet vs UALink vs InfiniBand

View the data behind this chart

Ultra Ethernet vs UALink vs InfiniBand
	Ultra Ethernet	UALink	InfiniBand
Role	Scale-out	Scale-up	Both
Scale	Very large	Pod / node	Very large
Latency	Low (AI-tuned)	Lowest	Very low
Ecosystem	Open, broad	Open, growing	Largely single
Maturity	Emerging	Emerging	Proven

InfiniBand: the incumbent it has to beat

InfiniBand is not going away - it is mature, very high performance, and has the deepest software and operational track record for large-scale training, which is exactly why it remains the default for the biggest clusters today. Its weaknesses from a buyer's standpoint are commercial rather than technical: it is largely a single-vendor ecosystem, which means pricing power, supply dependence and a smaller pool of operational expertise outside specialist shops.

So the standards-watch question is not whether InfiniBand works - it plainly does - but whether the open alternatives mature fast enough, and perform close enough, to make the openness worth it. For most UK organisations building their first or second AI cluster, that maturity timeline is the thing to track, because committing to a fabric is a multi-year, hard-to-reverse decision. This is distinct from the separate question of switch port speeds; here the issue is the fabric architecture and its ecosystem, not whether the ports are 400G or 800G.

What a UK buyer should do now

Do not bet a near-term cluster on a standard that is not yet shipping in the form you need - but do build with the transition in mind. For an AI cluster being procured today, weigh InfiniBand's proven maturity against the open standards' trajectory, and favour designs and suppliers that keep an Ethernet-based path open, because the momentum and ecosystem behind Ultra Ethernet make it the likeliest long-term mainstream. For scale-up links, watch UALink's maturation as the open alternative to proprietary accelerator fabrics.

The honest near-term answer is workload- and scale-dependent, and the safest posture is to avoid unnecessary lock-in while the standards settle. We design AI fabrics with that openness in mind - read the build fundamentals in building your first UK on-prem AI cluster and choose the NICs with our network cards guidance.

Key takeaways

✓The GPU-to-GPU fabric is a first-order AI decision - a fast accelerator starved of interconnect sits idle.
✓Ultra Ethernet re-engineers open, multi-vendor Ethernet for AI scale-out - broad ecosystem, competitive pricing.
✓UALink is an open scale-up standard for tight accelerator-to-accelerator links, challenging proprietary fabrics.
✓Ultra Ethernet and UALink are complementary (scale-out vs scale-up), not direct rivals.
✓InfiniBand remains the proven incumbent; the watch is whether the open standards mature fast enough to be worth the reduced lock-in.

Frequently asked

FAQs — Ultra Ethernet vs UALink vs InfiniBand

Choosing a fabric

Ultra Ethernet or InfiniBand for an AI cluster?

InfiniBand is the proven incumbent for large-scale training; Ultra Ethernet offers an open, multi-vendor, AI-tuned Ethernet path with competitive pricing as it matures. For a cluster procured now, weigh InfiniBand's maturity against keeping an Ethernet path open. We design fabrics to avoid lock-in - see building an on-prem AI cluster.

Are Ultra Ethernet and UALink competitors?

No - they are complementary. UALink targets scale-up: tight, low-latency links between accelerators within a pod. Ultra Ethernet targets scale-out: connecting many nodes into a large cluster. A future build could use both. Plan accelerators alongside the fabric with our GPU and accelerators guidance.

Timing

Should I wait for these standards before building?

Do not bet a near-term cluster on a standard not yet shipping in the form you need. Build with the transition in mind: favour suppliers and designs that keep an Ethernet-based path open. The maturity timeline is the thing to track - we monitor it and choose NICs accordingly via network cards.

Server network cards →GPU & accelerators →Building your first UK on-prem AI cluster →NVLink vs. InfiniBand vs. Ethernet: GPU Fabrics Explained (2026) →

Got a question this article didn't answer?

One conversation with an engineer who's done this before. No sales script.

Talk to Servnet →

Talk to a UK specialist

Get expert advice or a no-obligation quote — servers, storage, networking, maintenance, finance and cloud. We reply the same working day.

or call 0800 987 4111