Microsoft’s AI chief says superintelligence is near, but won’t take your job

Published 8 June 2026 · tech

Today I’m talking with Mustafa Suleyman, the CEO of Microsoft AI. And I’m actually going to keep today’s intro short: I’m working from my wife’s family farm this week, as you’ll see in the video, but also this is a real burner of an episode. We covered everything from Mustafa’s approach to training new models to his criticisms of Anthropic talking about Claude as though it is conscious. Of course, we also talked about Microsoft’s relationship with OpenAI, how Mustafa is thinking about all the negative polling and political pushback around AI right now, and whether any of the consumer products are good enough to overcome it. Like I said, it’s a burner. Okay: Mustafa Suleyman, CEO of Microsoft AI. Here we go.
This interview has been lightly edited for length and clarity.
Mustafa Suleyman, you are the CEO of Microsoft AI. Welcome back to Decoder.
Great to be with you again.
I’m very excited to talk to you. Our previous conversation was one of my favorite conversations, about AI, how it should make us feel, and what it’s for, that I’ve had in all the conversations we’ve had. There are some big changes at Microsoft, maybe some very important recontextualization about how people feel about AI that I want to talk to you about in particular. And then there’s Microsoft Build, the big Microsoft developer conference, which featured lots of new announcements and lots of big ideas about what computers are for and maybe where they should be that I want to get into. Let’s start at the very start. This is some deep Decoder stuff that is important to understand before all the rest of it. Since you joined Microsoft, you have restructured how AI works there. Your role has changed. The last time I talked to you, you were in charge of a bunch of consumer products. That has since been set aside. You’re now training new models; you’re on the frontier. Explain how Microsoft AI is structured now and how it’s structured inside Microsoft.
I guess the last 15 to 18 months or so we’ve been on this journey to reestablish our relationship with OpenAI, and it’s taken a minute. I think it culminated in a new contract that we got done in October of last year. And there were lots and lots of different provisions in that, including cementing and extending the partnership, but crucially freeing us up to be able to pursue superintelligence independently as well as keep buying and licensing their models. So since October, I’ve been assembling the Superintelligence team, building clusters of sufficient scale to train frontier models, and hiring a team focused on superintelligence. And so that was quite a big shift for us because it sort of enabled me to focus just on the superintelligence mission, and that has then culminated in a few things that we announced this week at Build. We have seven new models across all the modalities and so on. So it’s been a pretty big shift, and I think a long time in the planning, and a great relief for us to now be in the game and pursuing the absolute frontier over the next few years.
Was this the plan when you were hired at Microsoft?
It’s certainly been the plan for the last 18 months. I mean, I think the relationship with OpenAI has gone through lots of ups and downs. And in many ways, I think it is going to go down as one of the most successful partnerships in history. It’s been great for OpenAI, and it’s been great for Microsoft, and all good relationships evolve, and I think this is just the next stage in our evolution.
Let me ask you about that evolution specifically. We all just saw the trial between Elon Musk and OpenAI and Sam Altman. Microsoft was involved in that trial in the sense that every so often a lawyer from Microsoft would stand up and say, “And we weren’t around.” And someone would say yes, and that was that.
But obviously, what came out during that trial, what has been clear during this entire time, is that the original notion was that OpenAI would be a research lab and provide models, while Microsoft would build the products. Microsoft had expertise in going to market; it had expertise in enterprise, it was trying to regain a foothold in consumer in a variety of ways. This would be a platform shift, and the research work would be over at OpenAI, and the product work would be inside of Microsoft. That’s the thing that changed: OpenAI wanted to make more and more consumer products. Obviously, given your new role and your new focus, Microsoft more and more wants to make its own models. Why the split? What didn’t work in that relationship?
I mean, I think OpenAI is led by an incredibly ambitious founding team, and Sam himself. And so naturally, as they started to get more traction and generate a ton of revenue, they saw opportunities to go full stack. So it wasn’t just that they started working on consumer products. Obviously, ChatGPT was incredibly successful. They also started working on their own data centers. They started creating their own chip. There are lots of rumors flying around about their own consumer hardware devices. They started taking models direct to market through ChatGPT Enterprise. So across the stack, they were kind of broadening way beyond research over the last two, three, four years. And naturally, the same is also true for Microsoft. I mean, I think the partnership’s now five or six years old, and still has another four, five, six years to run. Likewise, we’re one of the largest technology companies in the world. We have 493 of the 500 largest companies that store and process most of their data on our systems, use Azure, use M365 and Teams. I think people often underappreciate how enormous we are and how big our distribution is in enterprise.
And so, long term, and I do mean over five, six, seven, 10 years, we have to make sure that we’re completely sustainable, and we’re not just a recipient of somebody else’s IP that we then slightly modify and adapt and put into production for our products, but we actually can stand on our own two feet and create world-class models. I mean, superintelligence is coming. I think it’s just around the corner. And so I think it’s going to be basically the most valuable technology of all time. There’s sort of no way that, long-term, we could be structurally dependent on a third party for providing that IP for all eternity. So that’s been the transition that obviously was triggered when OpenAI and so on had their board issue. But then as I came in and my team came in, we started building that out, we’re on that transition. And I think we’re in a great spot because we can take a fairly steady, careful, long-term optimal position, both for OpenAI, which I think has done incredibly well out of this, and for us.
I want to spend some time on superintelligence. I just want to put a pin in it now because I just want to kind of understand the transition for one more turn here. There’s a moment in the trial, sort of very funny message from Microsoft CEO, Satya Nadella, he says, “I don’t want to be Intel and have OpenAI be Microsoft,” which is very funny in the context of Microsoft CEO himself saying, “I don’t want to be the provider, and have them be the platform that provides all the value and collects all the value and maybe we’ll be swapped out. I don’t want ChatGPT to run on Azure, and then OpenAI will get all the value, and then maybe they can swap us out,” just as what happened with Windows and Intel over time. Is that a realization? Did Nadella come to you? What was that meeting like where you said, “Okay, OpenAI had its board issues. We need to get back on the frontier and stand on our own two feet.” What did that conversation look like, and how was that decision made?
I mean, obviously that’s Satya’s decision as well as Amy, Brad, and many other people in the company. But I think it’s as with anything: these are slow-moving changes in the company, as it comes to realize that the direction that we’re taking needs a little bit of tweaking and adjustment. And so that was happening way before the November board incident, and I think it just builds up over time as you look at the kind of constellation of different fronts around which we’re competing directly, increasingly, and all the tension that comes from that. But also just knowing that partnerships like that don’t last forever. I mean, OpenAI wants to be a trillion-dollar public company, has incredible revenues, and is growing like crazy. They want to have the freedom to operate and be able to buy compute from all sorts of other places, build their own compute, and partner with whoever they want. So the contract was formed at a time when the companies were very different in terms of size and scale and balance of needs and stuff. I think it made sense for that moment, but then it became pretty clear that this is something that we have to be able to own and control ourselves and do right by our own customers. As I said, we have an incredible distribution on enterprise, which I think is just completely unrivaled in the world. And so we have to make sure we’re building the best things for our customers. That looks slightly different to a company that has been jointly optimizing both for the consumer, with ChatGPT, and for the enterprise, and also for the fundamental science mission of superintelligence, which includes a whole bunch of different directions which are overlapping but could arguably be said to be orthogonal to the consumer and the enterprise directions too. Naturally, I think that’s how partnerships evolve, and they get reset periodically.
Yeah, but building a frontier model is very expensive, I’m told. Reliably told, this is a very expensive project.
At some point, Amy Hood, the CFO of Microsoft, has to say, “Yep, you’ve got the budget.” When did that happen? Was that just a text message? Was there a meeting? Tell me about the specifics there.
I think, look, we sort of made the decision in the early part of last year, which obviously informed all the contract negotiations, which then all got resolved and signed in October. And it is a significant investment, but we have a long time to make it. I mean, we’ve already made significant investments in our own self-sufficiency mission. Our Maia 200 chip is actually an outstanding chip, as one example, right? We are now able to manufacture and ship a chip that is 30 percent cheaper than a GB200 inside of our own clusters. And now that we can co-design our own models with it, the MAI-Thinking-1 model that we’ve just released actually delivers 1.4x performance per watt improvement on top of the 30 percent improvement that you get from running on a Maia 200 once we co-optimize the models for our tasks. So the value of making sure that you own and control your own stack and direct the entire co-design effort end-to-end for the use cases that are most important to us (which is obviously agentic coding, our developers, our enterprises), that clearly pays the dividends that justify the investment that we have to make over the next few years.
You said self-sufficiency mission, which is a very polite way of saying you want to stand on your own two feet; you want to do your own thing. I’m told there’s some controversy inside of Microsoft about a line my colleague Hayden Field wrote in a piece describing Build. I’m just going to read this. This is from Hayden. It’s a great line. She said, “This year’s Microsoft Build had the vibe of a freshly single divorcée posting a thirst trap on Instagram.” The breakup is completed, and it’s time to flex. Here’s our new model. We’re going to stand on our two feet.
You’re out there saying you’re going to build models at the frontier and compete with the leading labs. Is that the feeling inside of Microsoft that you’re free to be on your own?
Definitely not. No, not at all. Look, I mean, obviously that’s a cool headline and a fun phrase. But the reality is that we are in partnership with OpenAI for years and years to come. I mean, we’re running way north of 2030. They still produce the best models in the world. GPT-5.5 is an outstanding model. The Codex, the cybersecurity models that are coming through, are amazing, and they’re powering the majority of what we do. So naturally, that’s going to continue. And so I think that’s just a natural course of these sorts of partnerships. I don’t think it’s anything untoward or surprising. I think OpenAI is very understanding and supportive of that. I mean, they’ve obviously been an incredibly fast-growing company, and they understand that we have to pursue our own agenda as well. So it’s very normal.
Let me ask you the other Decoder question, and then I want to get into the announcements at Build, and certainly superintelligence. The last time we spoke, you said your framework for making decisions operated on a six-week cycle, given how fast AI was moving. That made sense then. Things have settled, maybe. Maybe some things are more in focus. What is your decision-making framework now?
We still operate by the same cycle rhythm. At the end of each cycle, we have a one-week meetup in person. I’m a real believer in this, even though we’re still an in-office culture, four days a week. In fact, the week after next, my entire Superintelligence team comes together in person in Boston for four days. That is for all of our retrospectives on how Build went, what we learned, what we didn’t get right, what we need to improve, our planning for the next cycle, which is going to run for eight weeks this time with a one-week meetup afterwards, and that’s all laid out for the entire year.
So the whole organization knows that that’s the rhythm by which we operate. And I think it’s actually really important to emphasize that timeframe, because quarterly planning gets a little bit blurry and a bit abstract. I think six to eight weeks, depending on where it falls in the calendar, is actually the optimal time for making very clear, fortifiable missions. So we also, in addition to the rhythm of these six- to eight-week cycles, operate by squads. The squads are mixed interdisciplinary subgroups that are focused on a specific mission, and they don’t necessarily ladder up to the manager. They actually are run by a DRI, and the DRI is often an IC, and their job is
That’s “directly responsible individual” and “individual contributor.”
Yeah, exactly. Thank you. And I think we’ve taken the approach of separating the role of the manager from the role of the DRI that executes on a specific mission. I think that’s because being a great DRI is exhausting. You’re literally all-in 24 hours a day, and you’re pushing as hard as you possibly can. Being a manager is often about being a coach, offering support, giving guidance, feedback, unblocking all sorts of things, helping with people’s career growth. And so I think keeping those separate allows us to rotate DRIs every two or three cycles so that some people can try sort of different positions and have rotation. It’s a great, very flexible structure that allows us to be pretty nimble, I think.
Let’s talk about Build. I wanted to start with superintelligence. You’ve mentioned it several times now. I was just at Google IO. Demis Hassabis, who used to be your colleague when you were at Google, ended that keynote by saying that we were in “the foothills of the singularity, and that AGI was coming with all the power of Google.” You’re saying superintelligence is here. Are these all the same things? Are we using different language to describe AGI? Are there differences?
How would you define superintelligence in your context versus the singularity in Demis’s?
I mean, obviously I didn’t say it was here. I said it’s coming. And I think there’s a lot of fluidity around these phrases. But I think what we can clearly see happening right now is that there is log-linear hill climbing across all modalities, and that means that there is a very direct relationship between each order of magnitude of compute that we apply, each incremental increase in data, and climbing on benchmarks, whether they’re public benchmarks, internal benchmarks, or targets that we focus on with reinforcement learning environments. And that is a very important observation. I understand why some people are sort of skeptical of the predictions that I think we’re all making, or raise questions, but they’re very grounded in the sort of empirical observations of over a decade of increase in performance of these models. I mean, essentially the same general-purpose architecture has seen 12 orders of magnitude more computation applied, a trillion-fold increase in FLOPS over 15 years, and basically has worked in audio, in image, in text, in code, and in many other time series prediction tasks. And so we’re basically extrapolating out that more orders of magnitude of compute will enable us to continue to climb in this log-linear way inside of other environments. And then it raises the question of, are we going to be able to train models that can invent new knowledge, not just sort of extrapolate from existing data that we have, but actually teach us things that we don’t know, and make new discoveries? Then the second thing is, do they have the capacity to self-improve and accelerate the process of deciding which hypotheses should be set, which ones should be pursued, how to generate training data for each of those, how to factor those into new runs, or even innovate on the actual architecture itself?
So, I think both of those things need to be true to be able to see this compounding progress, but I think we’re going to continue to get massive gains just from applying the next few orders of magnitude of compute.
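Editor’s aside: for readers unfamiliar with the term, here is a minimal sketch of what a “log-linear” relationship between compute and a benchmark score could look like. It is an illustration only, not Microsoft’s method or data. The function names, the intercept, and the gain per tenfold increase are all hypothetical, chosen just to show the shape of the claim: each order of magnitude of compute adds roughly the same number of points, and a trillion-fold increase corresponds to 12 orders of magnitude.

```python
import math


def orders_of_magnitude(factor: float) -> float:
    """Return how many powers of ten a multiplicative increase represents."""
    return math.log10(factor)


def log_linear_score(compute: float, intercept: float, gain_per_decade: float) -> float:
    """Hypothetical benchmark score that rises by a fixed amount per 10x of compute.

    The intercept and gain_per_decade values are made-up illustration
    parameters, not measurements from any real model.
    """
    return intercept + gain_per_decade * math.log10(compute)


if __name__ == "__main__":
    # A trillion-fold increase in compute is 12 orders of magnitude.
    print(orders_of_magnitude(1e12))

    # Under a log-linear trend, every additional 10x of compute
    # buys the same score increment, no matter where you start.
    for exponent in (20, 21, 22):
        score = log_linear_score(10.0 ** exponent, intercept=0.0, gain_per_decade=5.0)
        print(f"compute=1e{exponent}: score={score:.1f}")
```

The point of the sketch is the constant step size: under this assumed trend, going from 1e20 to 1e21 units of compute yields the same gain as going from 1e21 to 1e22, which is why extrapolating “the next few orders of magnitude” is the basis of the prediction.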