Here’s Why AMD Can’t Catch Up to NVIDIA in AI Anytime Soon

They say it's harder to stay on top than to make it to the top of any industry. In the world of artificial intelligence (AI) powered supercomputers, the undisputed lion on top of the mountain is NVIDIA Corp. (NASDAQ: NVDA), with an estimated 94% market share of AI chips, or GPUs. While there are plenty of competitors like Marvell Technology Inc. (NASDAQ: MRVL) when it comes to application-specific integrated circuits (ASICs) for AI, there are few viable competitors for NVIDIA’s dominant AI chipsets in the computer and technology sector.

AMD Challenges the Incumbent

Advanced Micro Devices Inc. (NASDAQ: AMD) has come a long way competing with NVIDIA in graphics processing units (GPUs) used for PC gaming and bitcoin mining. AMD believes its new MI325X AI accelerator chips can give NVIDIA’s Blackwell GPUs a run for their money in data centers.

Data centers require massive processing power and have been ordering massive amounts of NVIDIA GPUs, causing NVIDIA's business to surge in the triple digits while putting new orders on wait lists at least a year out. AMD hopes to entice customers who need AI chips and want an alternative without the wait. AMD has suggested that Microsoft, Meta Platforms and OpenAI have purchased some of its AI GPUs.

NVIDIA’s Blackwell Chips are Sold Out for the Next 12 Months

Referring to its Hopper H100 AI chips, NVIDIA stated that just two NVIDIA HGX supercomputers powered by Hopper GPUs, costing $500,000 in total, could replace 1,000 nodes of CPU servers costing $10 million for AI workloads. Its next-gen AI GPUs, Blackwell, are 4X more powerful than the Hopper H100. As of October 2024, NVIDIA’s Blackwell GPUs (B100 and B200) are completely sold out for the next 12 months. NVIDIA’s DGX B200 Blackwell AI systems are priced starting at $500,000 and are powered by eight Blackwell GPUs with HBM3E memory, delivering up to 72 petaFLOPS (72 quadrillion floating-point operations per second) of training performance and 144 petaFLOPS of inference performance.
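To put those figures side by side, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted above, which are NVIDIA's own claims. The variable names and the `summarize` helper are illustrative assumptions, and the per-GPU values are simply the system-level figures divided by eight, not actual GPU list prices or measured per-chip performance.

```python
# Back-of-the-envelope ratios from the figures quoted in the article.
# All inputs are NVIDIA's stated claims; nothing here is measured data.

HGX_SYSTEMS = 2                    # Hopper-based HGX systems in NVIDIA's comparison
HGX_TOTAL_COST_USD = 500_000       # stated total cost of those two systems
CPU_NODES = 1_000                  # CPU server nodes they are said to replace
CPU_TOTAL_COST_USD = 10_000_000    # stated total cost of the CPU nodes

DGX_B200_PRICE_USD = 500_000       # stated starting price of one DGX B200 system
DGX_B200_GPUS = 8                  # Blackwell GPUs per DGX B200
DGX_B200_TRAIN_PFLOPS = 72         # stated training performance (petaFLOPS)
DGX_B200_INFER_PFLOPS = 144        # stated inference performance (petaFLOPS)


def summarize():
    """Return simple ratios derived from the quoted figures (hypothetical helper)."""
    return {
        # How many times cheaper the HGX setup is than the CPU setup
        "cost_ratio": CPU_TOTAL_COST_USD / HGX_TOTAL_COST_USD,
        # CPU nodes replaced per HGX system
        "cpu_nodes_per_hgx": CPU_NODES / HGX_SYSTEMS,
        # System starting price spread across its GPUs (not a GPU list price)
        "system_price_per_gpu_usd": DGX_B200_PRICE_USD / DGX_B200_GPUS,
        # System-level training petaFLOPS spread across its GPUs
        "train_pflops_per_gpu": DGX_B200_TRAIN_PFLOPS / DGX_B200_GPUS,
        # Stated inference throughput relative to training throughput
        "infer_to_train_ratio": DGX_B200_INFER_PFLOPS / DGX_B200_TRAIN_PFLOPS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.1f}")
```

On NVIDIA's numbers, the Hopper setup comes in at one-twentieth of the cost of the CPU servers it is said to replace, which helps explain why data centers are willing to wait in line.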

Its core hyperscaler customers like Meta Platforms Inc. (NASDAQ: META), Alphabet Inc. (NASDAQ: GOOGL), Amazon.com Inc. (NASDAQ: AMZN), Microsoft Corp. (NASDAQ: MSFT) and Oracle Corp. (NYSE: ORCL) have already purchased all available supply of Blackwell GPUs even before they hit the market. The GPUs will start shipping at the end of 2024.

Here’s Why Taiwan Semiconductor is the Leading AI Chip Manufacturer

Taiwan Semiconductor Manufacturing Co. Ltd. (NYSE: TSM) is the world's largest semiconductor manufacturer, with the most advanced fabrication (fab) plants. AI GPUs require the most advanced, smallest process nodes, which enable better performance and lower power consumption through greater transistor density. Currently, AI GPUs use the 7 nanometer (nm) and 5nm processes, with the 3nm process slated for next-gen AI GPUs. Advanced nodes are needed for the fabrication process, which produces the chips. Once fabricated, the chips need to be packaged to ensure performance and power efficiency.

Taiwan Semi’s chip-on-wafer-on-substrate (CoWoS) packaging technology enables placing multiple chips like GPUs, memory and interconnects on a silicon interposer, which is attached to a larger substrate. CoWoS capacity is another limiting factor for producing AI GPUs. NVIDIA has been taking up all the capacity, causing production bottlenecks. Taiwan Semi and NVIDIA have been working to bolster CoWoS capacity and alleviate the bottlenecks.

Finite Capacity Constraints

AMD has announced that it expects to start production of its MI325X AI GPU at the end of 2024. It hopes to release new AI chips on an annual basis to compete with NVIDIA. Its 2025 AI chip is the MI350, and its 2026 AI chip will be the MI400. However, it's important to keep in mind that capacity constraints still exist amid the growing demand for AI chips.

NVIDIA is Taiwan Semi's largest AI GPU customer and takes up most of its capacity. NVIDIA’s business is estimated to have generated 11% of Taiwan Semi’s 2023 annual revenue. It is a big driver behind Taiwan Semi’s 36% YoY revenue growth to $23.5 billion in its third quarter of 2024. While Apple Inc. (NASDAQ: AAPL) is still the largest customer, NVIDIA has grown to be Taiwan Semi's second-largest customer. Taiwan Semi's 5nm and 3nm process nodes account for over 50% of its revenues, driven by AI demand.

Here's Why AMD Can’t Catch Up Anytime Soon

Since NVIDIA is Taiwan Semi’s second-largest customer, its massive and growing AI chip orders may be prioritized due to the high demand and margins. The limited supply of advanced nodes and CoWoS capacity is likely being prioritized for NVIDIA, which may lead to production constraints for AMD, assuming it is able to sell AI GPUs in volume. If NVIDIA’s Blackwell GPUs are sold out for the next 12 months, then it may be tough for AMD to mass-produce its MI325X AI GPUs. Taiwan Semi is expanding capacity, and it expects 2025 capex to be around $30 billion.

However, another constraint preventing AMD from catching up is that NVIDIA's AI GPUs use its own programming platform and API called CUDA (Compute Unified Device Architecture). AI developers consider CUDA to be a standard for general-purpose GPU processing in heavy computational applications like computer graphics, machine learning, data mining, deep learning, high-performance computing (HPC) and computer modeling. AI developers are locked into the NVIDIA ecosystem from both hardware and software angles, making the moat even harder for AMD to cross.