Fundamental switches are a critical part of the AI revolution

The rapid adoption of artificial intelligence (AI) in many different aspects of our lives is placing huge demands on semiconductor manufacturing technology. Training the large language models (LLMs) being used daily requires huge arrays of ultra-high-performance GPUs. This growth trend is also placing unprecedented demand on the energy infrastructure, as data centers are expected to consume up to 40 times more power by 2030.

With each new generation of GPU technology, the interconnections linking GPUs to other key components in an AI server must operate at ever higher speeds. Today’s high-end GPUs use PCI Express 6.0 (PCIe 6.0), which supports transfer rates of up to 64 GT/s per lane. To reach these speeds, PCIe 6.0 moves beyond binary signaling and instead uses four voltage levels (PAM4 signaling) to encode two bits per symbol.
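To make the four-level idea concrete, here is a minimal Python sketch of how pairs of bits can be mapped onto four signal levels. The Gray-coded mapping, the normalized level values 0 to 3, and the function names are illustrative assumptions, not a description of any particular transmitter.

```python
# Gray-coded four-level mapping: each pair of bits selects one level.
# ASSUMPTION: levels 0..3 are normalized placeholders, not real voltages.
GRAY_PAM4 = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}


def pam4_encode(bits):
    """Map an even-length sequence of bits to four-level symbols."""
    if len(bits) % 2:
        raise ValueError("four-level encoding needs an even number of bits")
    return [GRAY_PAM4[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


def symbol_rate_gbaud(bit_rate_gbps, bits_per_symbol=2):
    """Symbols per second needed to carry a given raw bit rate."""
    return bit_rate_gbps / bits_per_symbol


symbols = pam4_encode([0, 0, 0, 1, 1, 1, 1, 0])
print(symbols)                # [0, 1, 2, 3]
print(symbol_rate_gbaud(64))  # 32.0
```

Carrying two bits per symbol halves the symbol rate needed for a given bit rate. The cost is that four levels share the same voltage swing, so the spacing between adjacent levels is one third of the binary case, which is why small distortions in the test path matter so much.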

Although manufacturing such devices is incredibly complex, some of the biggest challenges faced by leading GPU manufacturers such as NVIDIA relate to testing the devices they make. Testing requires complex boards that use switches to configure the different validation processes. The extreme speeds and the use of multiple voltage levels mean that even minor distortions introduced by these boards and switches can slow validation and delay deployment.

Electromechanical Relays: A 200-Year-Old Solution

Two different solutions are used to switch signals when testing modern semiconductors. The first, electromechanical (EM) relays, relies on a technology invented by Joseph Henry in 1835 that is still in use today. Much like a conventional switch, two metal contacts are brought together or separated to make or break a physical connection. A spring mechanism controlled by an electromagnet allows the connection state (on or off) to be set electronically.

EM relays have been around for almost 200 years and offer certain advantages due to their metal-to-metal contact. Yet their mechanical design has inherent limitations. On high-speed interconnects, EM relays introduce distortion that degrades signal quality, and their mechanical switching is slow, creating bottlenecks. Over time, repeated switching leads to wear in the physical contact, which increases resistance and further slows their operation, making them a poor fit for high-volume, high-speed test systems.

Semiconductor Switches: Faster but Imperfect

Switches made from semiconductors offer a different approach that overcomes the limited switching speed of EM relays. However, they do not transmit high-speed signals as well as the metal-to-metal contact of an EM relay. As a result, a device that functions well within specification may still be rejected because the test system itself distorts the signal beyond acceptable limits. Silicon-based switches can also cause signal loss, making testing even more challenging. As interfaces reach and exceed the speed limits of PCIe 6.0, these drawbacks become so significant that an alternative solution becomes essential.

A technology that combines the best of the semiconductor switch and the EM relay would enable reliable testing of high-speed buses in minimal time. Ideally, you’d want to combine the low-loss metal-to-metal contact of relays with the ruggedness and manufacturing economies of semiconductor devices.

A New Technology: Tiny, Fast & Reliable Switches

Companies such as NVIDIA have turned to the Ideal Switch® to test their GPUs with modern high-performance interfaces. This technology builds tiny, fast and reliable switches using a process that is very similar to manufacturing silicon chips, with a metal-to-metal switch contact.

This technological breakthrough was enabled by the creation of a proprietary alloy used to construct a miniaturized switch that has negligible resistance and can operate reliably for billions of cycles. Operated by an electrostatic charge, the device can switch in less than 10 µs, enabling the test system to reconfigure quickly, which substantially reduces the time it takes to test each device.
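As a rough illustration of why switching time affects test time, the sketch below totals the time a test system spends waiting for switches to change state. Only the 10 µs figure comes from the text above; the EM relay switching time and the number of reconfigurations per device are hypothetical values chosen purely for illustration.

```python
def reconfig_overhead_s(switch_time_s, reconfigs_per_device):
    """Total time spent waiting on switch changes while testing one device."""
    return switch_time_s * reconfigs_per_device


FAST_SWITCH_TIME_S = 10e-6  # upper bound quoted in the article (10 microseconds)
EM_RELAY_TIME_S = 5e-3      # ASSUMPTION: illustrative millisecond-class relay
RECONFIGS_PER_DEVICE = 1_000  # ASSUMPTION: hypothetical test flow

fast = reconfig_overhead_s(FAST_SWITCH_TIME_S, RECONFIGS_PER_DEVICE)
relay = reconfig_overhead_s(EM_RELAY_TIME_S, RECONFIGS_PER_DEVICE)
print(f"fast switch: {fast:.3f} s, EM relay: {relay:.3f} s per device")
```

Under these assumed numbers the switching overhead drops from seconds to hundredths of a second per device; the real saving depends on the actual relay and test flow.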

Unlike semiconductor switches, the Ideal Switch can transmit signals from DC to more than 50 GHz, minimizing distortion of the signals being tested and providing ample headroom for testing faster future generations of PCIe. The device introduces minimal insertion loss and reduces signal distortion, mitigating (or in most cases eliminating) existing testing challenges.

By leveraging manufacturing methods very similar to those used in silicon semiconductor production, this new switch can be produced cost-effectively at scale. In a similar way to conventional semiconductors, this approach enables high levels of integration. Multiple switches can be combined into a single multichip module to perform loopback and routing functions needed to test the latest GPUs. The reduced footprint also helps preserve the integrity of very high-speed signals, which can otherwise be degraded by simply travelling a short distance across a printed circuit board.

Beyond supporting production testing of modern AI and xPU devices, the creators of the Ideal Switch have deployed the same technology to solve a similarly challenging problem in electronics: switching RF signals. In many applications, from satellite communications to quantum computers, it is necessary to switch radio-frequency signals without degrading them. The Ideal Switch is well suited to this task: unlike semiconductor switches, it has minimal resistance and preserves the signal, even when operating at extremely low temperatures.

As AI adoption accelerates and GPUs continue to increase in performance, both power consumption and power density in data centers are rising rapidly. To safeguard these expensive assets in tightly packed racks, compact protection systems are required. Conventional technologies, such as EM relays and semiconductor switches, face inherent limitations similar to those seen in GPU testing. The power infrastructure must also become more efficient, faster and smaller. Mechanical switches are too slow to respond to rapidly changing load conditions and too large to fit. Semiconductor switches offer speed but introduce resistance, causing significant power loss, inefficiency and heat generation that adds to an already demanding thermal management burden.
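The link between switch resistance and heat follows from the standard conduction-loss relation P = I²R. The short sketch below applies it; the load current and both on-resistance values are hypothetical and are not measurements of any real device.

```python
def conduction_loss_w(current_a, on_resistance_ohm):
    """Power dissipated in a closed switch: P = I^2 * R."""
    return current_a ** 2 * on_resistance_ohm


LOAD_CURRENT_A = 20.0        # ASSUMPTION: illustrative load current
HIGHER_R_OHM = 10e-3         # ASSUMPTION: hypothetical higher on-resistance
LOWER_R_OHM = 1e-3           # ASSUMPTION: hypothetical lower on-resistance

for label, r in (("higher R", HIGHER_R_OHM), ("lower R", LOWER_R_OHM)):
    print(f"{label}: {conduction_loss_w(LOAD_CURRENT_A, r):.2f} W")
```

Because the loss scales with the square of the current, any on-resistance becomes more costly as rack power density rises, and every watt dissipated in a switch must also be removed as heat.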

Switches are as old as the use of electricity, but without innovation in switching, today’s critical applications will be held back. The Ideal Switch builds on existing semiconductor manufacturing processes to create a solution that enables the deployment of the systems we rely on every day, from AI to satellite communications.

Russell Garcia is CEO of Menlo Microsystems. He has led the commercialization of MEMS switch technology across RF, digital and power systems over his 30-year leadership career.