A custom rack for the Maia 100 AI accelerator and its “sidekick” inside a thermal chamber at a Microsoft lab in Redmond, Washington. The companion acts like a car radiator, circulating liquid to and from the rack to cool the chips as they handle the computing demands of AI workloads.
Jean Brecher | Microsoft
Microsoft unveiled two chips on Wednesday at its Ignite conference in Seattle.
The first, its Maia 100 artificial intelligence chip, could compete with Nvidia’s highly sought-after AI graphics processing units. The second, a Cobalt 100 Arm chip, is intended for general computing tasks and could compete with Intel processors.
Cash-rich technology companies have started offering their customers more options when it comes to cloud infrastructure that they can use to run applications. Alibaba, Amazon and Google have been doing this for years. Microsoft, with about $144 billion in cash at the end of October, had 21.5% cloud market share in 2022, behind only Amazon, according to one estimate.
Virtual machine instances running on Cobalt chips will be commercially available through Microsoft’s Azure cloud in 2024, Rani Borkar, the company’s vice president, told CNBC in an interview. She did not provide a timetable for the Maia 100’s release.
Google announced its original tensor processing unit for AI in 2016. Amazon Web Services unveiled its Graviton Arm-based chip and Inferentia AI processor in 2018, and announced Trainium, for training models, in 2020.
Special AI chips from cloud providers could help meet demand in the event of a GPU shortage. But Microsoft and its cloud computing peers don’t plan to let companies buy servers containing their chips, unlike Nvidia or AMD.
The company built its chip for AI computing based on customer feedback, Borkar said.
Microsoft is testing how Maia 100 stands up to the needs of its Bing search engine’s AI chatbot (now called Copilot instead of Bing Chat), the GitHub Copilot coding assistant, and GPT-3.5-Turbo, a large language model from Microsoft-backed OpenAI, Borkar said. OpenAI has fed its language models with large quantities of information from the internet, and they can generate email messages, summarize documents, and answer questions with a few words of human instruction.
The GPT-3.5-Turbo model powers OpenAI’s ChatGPT assistant, which became popular shortly after its release last year. Companies then raced to add similar chat capabilities to their software, boosting demand for GPUs.
“We have worked at all levels and (with) all of our different suppliers to help us improve our supply position and support many of our customers and the demand they have submitted to us,” Colette Kress, Nvidia’s chief financial officer, said at an Evercore conference in New York in September.
OpenAI has already trained models on Nvidia GPUs in Azure.
In addition to designing the Maia chip, Microsoft designed custom liquid-cooled hardware called Sidekicks that fit into racks right next to racks containing Maia servers. The company can install the server racks and Sidekick racks without the need for an upgrade, a spokesperson said.
With GPUs, making the most of limited data center space can pose challenges. Rather than filling a rack from top to bottom, companies sometimes place just a few GPU-equipped servers at the bottom of the rack, like “orphans,” to prevent overheating, said Steve Tuck, co-founder and CEO of server startup Oxide Computer. Companies sometimes add cooling systems to reduce temperatures, Tuck said.
Microsoft could see faster adoption of Cobalt processors than Maia AI chips if Amazon’s experience is any guide. Microsoft is testing its Teams application and Azure SQL Database service on Cobalt. So far, they have performed 40% better than on Azure’s existing Arm chips, which come from the startup Ampere, Microsoft said.
Over the past year and a half, as prices and interest rates have risen, many companies have looked for ways to make their cloud spending more efficient, and for AWS customers, Graviton is one of them. All of AWS’s top 100 customers now use the Arm-based chips, which can yield a 40% price-performance improvement, said Dave Brown, a vice president at the company.
However, moving from GPUs to AWS Trainium AI chips can be more complicated than migrating from Intel Xeons to Gravitons. Each AI model has its own particularities. Because Arm chips are prevalent in mobile devices, many people have worked to make a wide variety of software tools run on them; that is less true of silicon for AI, Brown said. But over time, he said, he would expect organizations to see similar price-performance gains with Trainium compared with GPUs.
“We’ve shared these specifications with the ecosystem and with many of our ecosystem partners, which benefits all of our Azure customers,” Borkar said.
Borkar said she didn’t have details on how Maia performs compared to alternatives such as Nvidia’s H100. On Monday, Nvidia announced that its H200 would be released in the second quarter of 2024.