- Huawei wants UB-Mesh to unify fragmented interconnect standards across massive AI clusters
- UB-Mesh design blends a CLOS backbone with multidimensional rack-level meshes for scalability
- Traditional interconnects become too expensive in large-scale deployments
Huawei has revealed plans to open source its UB-Mesh interconnect, a system aimed at unifying how processors, memory, and networking equipment communicate across massive AI data centers.
The UB-Mesh design combines a CLOS-based backbone at the data hall level with multi-dimensional meshes inside each rack.
By combining these topologies, Huawei claims it can keep costs under control even as system sizes scale into tens of thousands of nodes. It also hopes to solve the issue of scaling AI workloads, where latency and hardware failures pose barriers.
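To make the cost argument concrete, here is a minimal Python sketch of how a multi-dimensional mesh can need far fewer links than wiring every node to every other node. It rests on assumptions that are not from Huawei: the wiring rule (each node links directly to every node that differs from it in exactly one coordinate), the 8 x 8 layout, and all function names are illustrative only. It does not model the CLOS backbone or any real UB-Mesh hardware.

```python
from itertools import product
from math import prod


def mesh_neighbors(node, dims):
    """Return the nodes that differ from `node` in exactly one coordinate.

    Assumed wiring rule for illustration; the article does not specify it.
    """
    neighbors = []
    for axis, size in enumerate(dims):
        for value in range(size):
            if value != node[axis]:
                neighbors.append(node[:axis] + (value,) + node[axis + 1:])
    return neighbors


def mesh_link_count(dims):
    """Count undirected links in the multi-dimensional mesh."""
    nodes = product(*(range(size) for size in dims))
    # Each link is seen from both of its endpoints, so halve the total.
    return sum(len(mesh_neighbors(n, dims)) for n in nodes) // 2


def flat_full_mesh_link_count(node_count):
    """Count links if every node were wired directly to every other node."""
    return node_count * (node_count - 1) // 2


if __name__ == "__main__":
    dims = (8, 8)  # hypothetical rack layout, not a figure from Huawei
    print("multi-dimensional mesh links:", mesh_link_count(dims))
    print("flat full-mesh links:", flat_full_mesh_link_count(prod(dims)))
```

Under these assumptions, the 64-node example needs 448 links in the two-dimensional mesh versus 2,016 for a flat full mesh, at the price of some traffic taking two hops instead of one.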
Replacing fragmented standards with a single framework
The move is pitched as a way to replace multiple overlapping standards with a single framework, potentially reshaping how large-scale computing infrastructure is built and operated.
In simple terms, Huawei wants to replace today’s mix of different connection rules with one universal system, so everything links together more easily and cheaply.
“Next month we have a conference, where we are going to announce that the UB-Mesh protocol will be published and disclosed to anybody like a free license,” said Heng Liao, chief scientist of HiSilicon, Huawei’s processor arm.
“This is a very new technology; we are seeing competing standardization efforts from different camps. […] Depending on how successful we are in deploying actual systems and demand from partners and customers, we can talk about turning it into some kind of standard.”
One of the central arguments behind UB-Mesh is that traditional interconnects grow too expensive at scale, eventually costing more than the accelerators they are meant to connect.
Huawei points to its own demonstrations, where an 8,192-node deployment was used as evidence that costs do not need to rise linearly.
This is framed as essential for the future of AI systems built with millions of processors, high-speed networking devices, and massive storage arrays such as the largest SSD systems used in cloud storage operations.
UB-Mesh is part of a broader idea Huawei calls the SuperNode. This refers to a data center-scale cluster where CPUs, GPUs, memory, SSD units, and switches can all operate as if they were inside a single machine.
Huawei cites bandwidth of over one terabyte per second per device and sub-microsecond latency as proof that the concept is not only possible but necessary for next-generation computing.
However, standards like PCIe, NVLink, UALink, and Ultra Ethernet already have backing from multiple companies across the semiconductor and networking industries.
The question now is whether the industry will accept a new Huawei-backed protocol or continue favoring standards already supported by a wider range of companies.
Huawei’s proposal, while ambitious, places customers in the position of adopting a protocol owned and controlled by one supplier.
Even with open-source licensing, there are concerns about long-term interoperability, governance, and geopolitical risks.
Huawei’s technical potential sounds impressive, but its move demands a degree of industry-wide trust and adoption that it has yet to secure.
Via Tom's Hardware