Anthropic in Talks to Run Claude on Microsoft's Maia 200 Chip

If the deal closes, Anthropic becomes Maia 200's first major external customer and adds a fourth chip vendor to its already-unusual multi-supplier strategy.

Saganote
Saganote ·
3 Min Read

Anthropic is in talks with Microsoft to run Claude on Maia 200, Microsoft's custom AI accelerator built for LLM inference workloads. CNBC first reported the Anthropic Maia 200 discussions on May 21, with Bloomberg separately confirming the talks are active. Both sides have yet to sign a deal. If one closes, Anthropic becomes the first major external customer for a chip Microsoft has spent more than two years developing - and adds a fourth hardware vendor to an already-unusual multi-supplier strategy.

Maia 200 Delivers 30% More Tokens Per Dollar Than Microsoft's Current Hardware

Maia 200 runs on TSMC's 3nm process, delivers about 10 petaflops of FP4 compute and five petaflops of FP8, and draws 750 watts. Memory sits at 216GB HBM3e running at seven terabytes per second, backed by 272MB of on-chip SRAM. Microsoft CEO Satya Nadella said in April the chip produces more than 30% better tokens per dollar compared to the company's current server fleet. That figure applies to inference workloads specifically, not training.

No external lab has publicly committed to Maia 200 yet. Amazon and Google both spent years before their custom chips attracted outside customers - AWS launched Trainium in 2020 and Inferentia before that, and broad enterprise adoption only followed once inference costs became a real pressure point for AI labs. Microsoft needs that external validation. An Anthropic Maia 200 deal would give its chip program the first credibility signal from a frontier model lab, and that matters for the commercial case Microsoft is still building.

Anthropic Already Splits Claude Across Three Chip Vendors

Anthropic already runs Claude on three separate hardware vendors: Nvidia GPUs, Google TPUs on GCP, and AWS Trainium. Four vendors is unusual. Adding Maia 200 on Azure builds a strategy around ensuring no single supplier controls Claude's inference economics - and for a lab that prices by the token, that is the closest thing to insurance against Nvidia's pricing power.

That approach contrasts directly with OpenAI's Jalapeño, a custom inference chip Broadcom built to OpenAI's own specifications and unveiled this week. OpenAI controls that hardware outright. Anthropic bets on renting from multiple vendors rather than owning silicon - a choice that aligns with its position as a lab that hit $30 billion in annualized revenue while keeping compute options open across every major cloud.

For Developers on Azure, Custom Silicon Changes the Cost Math

Claude is already on Azure. Microsoft's Foundry platform routes Claude API calls today over standard Nvidia hardware. No Maia 200 yet. Routing Claude inference there instead shifts where the margin goes - away from Nvidia GPU rental fees toward Microsoft's own silicon. For developers calling the Claude API on Azure, Nadella's 30% tokens-per-dollar figure suggests real cost headroom, though Anthropic has announced no pricing changes and none are expected before any deal closes.

Microsoft and Anthropic both declined to comment to CNBC. Neither company has set a public timeline. Bloomberg confirming the talks rules out a single-source report - but active discussions and a signed contract remain two different things.


Share this
Saganote

About Author

Saganote

Saganote is an independent technology publication covering artificial intelligence, startups, cybersecurity, consumer technology, science, and innovation. Our editorial team reports on the companies, products, and ideas shaping the future.