AMD Helios MI455X Faces Performance Challenges with Ethernet Connectivity
AMD unveils its next-gen AI rack-scale system, Helios MI455X, at Computex 2026. Initial models rely on UALink-over-Ethernet, raising latency concerns compared to dedicated interconnects. A true UALink version is in the works.
AMD unveiled its next-generation AI rack-scale system, “Helios,” on June 4 at Computex 2026 in Taipei. The system, which can host up to 72 Instinct MI455X AI accelerators, is designed to compete with Nvidia’s Vera Rubin-based NVL72 VR200. According to a report by Tom’s Hardware, the initial shipment of Helios will use UALink-over-Ethernet for scale-up connectivity, which could limit performance for certain workloads.
Overview of Helios
Helios marks AMD’s first foray into rack-scale AI systems. Combining sixth-generation EPYC “Venice” CPUs (up to 256 cores) with 72 Instinct MI455X accelerators, the system delivers a total of 31TB of HBM4 memory and a memory bandwidth of 1400TB/s. AMD estimates its FP4 dense matrix compute performance at approximately 2900 PFLOPS. While this falls short of Nvidia’s VR200 NVL72, Helios offers a significant advantage in HBM4 memory capacity, making it well-suited for memory-intensive workloads like running large language models (LLMs).
For interconnects, AMD employs UALink (Ultra Accelerator Link), an open standard. However, initial systems will use UALink-over-Ethernet, which provides an aggregate scale-up bandwidth of up to 260TB/s, on par with Nvidia’s NVL72 VR200. Helios also features Pensando Vulcano network interface cards (NICs), which are among the industry’s first 800GbE-capable cards. They comply with the Ultra Ethernet specification and offer up to 43TB/s of scale-out bandwidth.
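To put the rack-level figures in per-accelerator terms, the short sketch below divides each quoted total evenly across 72 accelerators. The even split, the decimal units, and the names used in the code are assumptions made for illustration. The results are derived values, not per-device specifications published by AMD.

```python
# Rack-level figures quoted in the article (decimal units assumed).
ACCELERATORS = 72
RACK_TOTALS = {
    "hbm4_capacity_tb": 31,
    "hbm4_bandwidth_tb_s": 1400,
    "scale_up_bandwidth_tb_s": 260,
    "scale_out_bandwidth_tb_s": 43,
    "fp4_dense_pflops": 2900,
}


def per_accelerator(totals, count=ACCELERATORS):
    """Divide each rack-level total evenly across `count` accelerators."""
    return {name: value / count for name, value in totals.items()}


shares = per_accelerator(RACK_TOTALS)
for name, value in shares.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions each accelerator's share works out to roughly 0.43TB of HBM4 and about 40 PFLOPS of dense FP4 compute.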
Challenges with Interconnects
The main reason Helios initially employs UALink-over-Ethernet is that UALink switches are still in the final stages of development, testing, and certification. Only after these processes are complete—along with validation by AMD’s AI customers—will the true UALink interconnect be rolled out.
The primary advantage of using Ethernet-based connectivity is the ability to leverage existing infrastructure. Ethernet switching ASICs, cables, and other components are already deployed globally by hyperscalers and cloud providers. This allows AMD to accelerate Helios’ deployment while benefiting from the reliability and ease of sourcing validated components.
However, Ethernet has inherent drawbacks. Designed as a general-purpose networking technology, Ethernet was not built with AI accelerator scaling in mind. Consequently, it incurs higher communication latency, increased protocol overhead, and less deterministic performance compared to dedicated scale-up fabrics. For large-scale AI training jobs, where all 72 MI455X accelerators must operate in unison, these non-deterministic behaviors could negatively impact training efficiency.
Differences Between True UALink and UALink-over-Ethernet
UALink is an open, high-speed interconnect standard promoted by AMD through industry consortia. It offers higher bandwidth and lower latency than PCI Express and is optimized for direct communication between GPUs and AI accelerators. True UALink uses dedicated switches and protocols, which reduces overhead and makes latency more deterministic than it is over Ethernet.
In contrast, UALink-over-Ethernet encapsulates UALink protocols within Ethernet frames. While this approach allows the reuse of existing Ethernet infrastructure, it is more prone to packet processing overhead and jitter. AMD plans to support both modes, allowing customers to gradually transition to true UALink as the required switches become available.
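As a rough illustration of where encapsulation overhead comes from, the sketch below computes the share of on-wire bytes that carry payload under standard Ethernet framing. Whether UALink-over-Ethernet uses unmodified Ethernet framing is an assumption. The report does not describe the actual encapsulation format, and the payload sizes are hypothetical values chosen only to show the trend.

```python
# Standard Ethernet per-frame overhead in bytes (untagged frame).
PREAMBLE_AND_SFD = 8
MAC_HEADER = 14        # destination MAC, source MAC, EtherType
FRAME_CHECK_SEQ = 4
INTER_PACKET_GAP = 12
OVERHEAD = PREAMBLE_AND_SFD + MAC_HEADER + FRAME_CHECK_SEQ + INTER_PACKET_GAP
MIN_PAYLOAD = 46       # shorter payloads are padded up to this size


def wire_efficiency(payload_bytes):
    """Fraction of on-wire bytes that are useful payload."""
    carried = max(payload_bytes, MIN_PAYLOAD)  # account for padding
    return payload_bytes / (carried + OVERHEAD)


# Hypothetical payload sizes, not taken from any UALink document.
for size in (64, 256, 1024, 1500):
    print(f"{size:>5} B payload -> {wire_efficiency(size):.1%} of wire bytes")
```

Small transfers pay proportionally more for framing than large ones. This is one reason fine-grained accelerator-to-accelerator traffic can be sensitive to the transport it runs over.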
The reliance on UALink-over-Ethernet in early systems reflects the current developmental stage of the UALink ecosystem. Although the interconnect standard is being established, mass production and validation of switch hardware will take more time. Hyperscalers, AMD’s primary customers, also need to conduct extensive in-house testing before deploying the system in production environments.
Market Impact and Competitive Landscape
Helios is positioned as a direct competitor to Nvidia’s Vera Rubin-based NVL72 VR200. Nvidia has long held an advantage with its proprietary NVLink and NVSwitch tightly coupled interconnects. Although AMD aims to challenge this dominance with UALink, its initial dependence on Ethernet could pose a competitive disadvantage.
In terms of compute performance, Helios’ 2900 FP4 PFLOPS lags behind Nvidia’s VR200. However, its 31TB of HBM4 memory provides a significant advantage for memory-intensive workloads. For tasks like training and inference with massive models such as GPT-4, the ability to store model parameters in memory can be a decisive factor. AMD is emphasizing this advantage to attract customers who prioritize memory capacity.
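The capacity argument can be made concrete with some simple arithmetic. The sketch below estimates whether a model's weights fit in the rack's 31TB of HBM4 at different weight precisions. The `reserve_fraction` parameter is a hypothetical placeholder for memory set aside for KV cache, activations, and runtime buffers. It is not a figure from AMD or the report.

```python
HBM4_TOTAL_BYTES = 31e12  # 31TB rack total quoted in the article (decimal)

# Storage cost per parameter for common weight formats.
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "fp4": 0.5}


def weights_bytes(num_params, fmt):
    """Bytes needed to hold `num_params` weights in format `fmt`."""
    return num_params * BYTES_PER_PARAM[fmt]


def fits_in_rack(num_params, fmt, reserve_fraction=0.3):
    """True if the weights fit after reserving part of HBM for other uses.

    reserve_fraction is a hypothetical allowance, not a measured value.
    """
    usable = HBM4_TOTAL_BYTES * (1.0 - reserve_fraction)
    return weights_bytes(num_params, fmt) <= usable


for fmt in BYTES_PER_PARAM:
    max_params = HBM4_TOTAL_BYTES / BYTES_PER_PARAM[fmt]
    print(f"{fmt}: up to {max_params / 1e12:.1f} trillion parameters (weights only)")
```

Counting weights alone, the ceiling ranges from 15.5 trillion parameters at FP16 to 62 trillion at FP4. Real deployments would fit fewer once working memory is included.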
Interconnect performance directly affects the efficiency of distributed model training. In large-scale training scenarios employing tensor and pipeline parallelism, GPU-to-GPU communication latency can bottleneck overall training speed. Using UALink-over-Ethernet could lead to higher communication overhead, potentially increasing training times for large models. However, for inference workloads, where memory capacity is more critical than communication speed, Helios may find its niche.
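A simple way to see why latency matters for some workloads more than others is the textbook alpha-beta cost model for a ring all-reduce. The sketch below applies it to 72 ranks. The two per-step latencies are hypothetical placeholders, not measurements of UALink or Ethernet. The per-accelerator bandwidth is an assumed even share of the 260TB/s aggregate, and the model ignores topology, congestion, and overlap with compute.

```python
RANKS = 72
# Assumption: each accelerator gets an even share of the 260TB/s aggregate.
LINK_BYTES_PER_S = 260e12 / RANKS

# Hypothetical per-step latencies; placeholders, not measured values.
LOW_LATENCY_S = 1e-6
HIGH_LATENCY_S = 3e-6


def ring_allreduce_seconds(message_bytes, latency_s,
                           ranks=RANKS, bandwidth=LINK_BYTES_PER_S):
    """Alpha-beta cost of a ring all-reduce.

    A reduce-scatter and an all-gather each take (ranks - 1) steps,
    and every step moves message_bytes / ranks bytes.
    """
    steps = 2 * (ranks - 1)
    chunk = message_bytes / ranks
    return steps * (latency_s + chunk / bandwidth)


def slowdown(message_bytes):
    """Ratio of high-latency to low-latency all-reduce time."""
    slow = ring_allreduce_seconds(message_bytes, HIGH_LATENCY_S)
    fast = ring_allreduce_seconds(message_bytes, LOW_LATENCY_S)
    return slow / fast


for size in (1e6, 1e8, 1e10):
    print(f"{size:.0e} bytes: {slowdown(size):.2f}x slower at the higher latency")
```

In this model, small messages are dominated by the latency term and slow down almost in proportion to it. Very large messages are dominated by bandwidth and barely change. That matches the article's point that latency-sensitive training traffic has more to lose than capacity-bound inference.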
AMD’s partner companies showcased Helios prototypes at Computex 2026, with plans to begin shipments in late 2026. Early customers will receive UALink-over-Ethernet models, with an option to upgrade to true UALink in the future.
Editorial Perspective
In the short term, the reliance on UALink-over-Ethernet presents a clear limitation for AMD’s AI data center strategy. While Nvidia’s NVL72 offers stable performance through its proprietary NVLink, Helios must contend with the uncertainties of Ethernet-based connectivity. This may incentivize hyperscalers, who are making significant AI investments, to favor Nvidia for its predictable performance. However, the large HBM4 memory capacity offers a practical solution for LLM operators facing memory bottlenecks, potentially driving initial adoption for inference-centric use cases.
In the long run, the maturity of the UALink ecosystem and the availability of true UALink switches will be critical. As an open standard, UALink has the potential to attract participation from other accelerator vendors, including Intel, creating an open alternative to Nvidia’s NVLink. If UALink becomes an industry standard, it could diversify the AI infrastructure supply chain. Over the next one to three years, factors such as Nvidia’s post-Vera Rubin platform strategy and the progress of the Ultra Ethernet Consortium will heavily influence Helios’ market reception.
The editorial team is particularly interested in whether Ethernet-based scale-up connectivity can deliver practical performance. Benchmark results will reveal the gap between Ethernet and dedicated interconnects. Additionally, the timeline for UALink switch completion and AMD’s ability to provide a smooth upgrade path for Helios will be decisive factors in customers’ purchasing decisions.
References
- Tom’s Hardware: AMD’s Helios MI455X AI platform breaks cover — Published June 4, 2026
- AMD Official Press Release (Computex 2026 Coverage, updated regularly)
Frequently Asked Questions
- What is Helios?
  - Helios is AMD’s next-generation rack-scale AI system, featuring up to 72 Instinct MI455X accelerators and EPYC Venice CPUs with a total of 31TB of HBM4 memory. It is designed to compete with Nvidia’s NVL72 VR200.
- What are the drawbacks of UALink-over-Ethernet?
  - Since Ethernet is a general-purpose networking technology, it has higher latency, greater protocol overhead, and less deterministic performance compared to dedicated scale-up fabrics. These issues could hamper training efficiency for large-scale AI workloads.
- When will the true UALink version be available?
  - The true UALink version will be released after UALink switches are finalized and validated by customers. While no specific timeline has been announced, AMD plans to first ship UALink-over-Ethernet systems and then provide an upgrade path to true UALink.