Transcript: How OpenAI's Codex Team Builds with Codex — Alex & Romain


Peter Yang · 43:17 · Transcript added May 31, 3:51 pm GMT+8

Source video ID: 9qXc-THAvc0

Transcript

0:00 — We write very very few specs on the Codex team. We like talking like 10 bullets or something and then that’s it. The designers on the Codex team write more code now than like was written by an engineer like 6 months ago. And I made a quick prompt to create a little 2D game. Maybe add some more decorations, houses, trees, and stuff. Could you add some more decoration like trees? And there we go, we have already like new trees appearing. For a small change, it’s like often faster to send a PR than it is to like communicate to someone and get them to prioritize that task when they have like 10,000 other
0:30 — things to do. I don’t actually view PM as like a leadership position. I view it as a fill-in-the-gaps position. I think the fewer people you need in a room to do anything, just the better that thing goes, the more pure every decision is. Okay, welcome everyone. I’m really excited to host today Alex and Romain from OpenAI’s Codex team. They’re going to demo how they build new features of Codex, what Codex is capable of, and also talk about how the Codex team ships nonstop. So, welcome, guys. Thank you. Thank you for having us. Yeah, excited to chat. So, do you guys
1:01 — want to just quickly show what kind of things Codex can actually build in one shot? Yeah, for sure. I mean like let me like share my screen to to give you a sense. And so like there’s so much I could show, but maybe like a quick glimpse into like for instance, here is an iOS app I’ve been building. And if I want to actually create a new feature for this app, I can simply dictate and voice over something that says, “Hey, can you add a new screen for NASA’s Artemis mission, return to the moon?” And I can send that prompt with GPT 5.4
1:31 — and sure enough, the model will like create a new screen for this particular iPhone app. So here we have this app. It’s pretty cool and it’s currently building this new feature, so we should see that in a moment. But we also have the Codex Spark model which can really help you ideate and iterate in just a few seconds on anything. In fact, let me show you like while it’s working over here, the difference of what it does to have a spark model responding so quickly. On
2:01 — the left side you have GPT 5.4, right? I’m going to give it a head start. And on the right side you have Codex Spark. And boom, you have like 1,200 tokens a second on average. This is insane speed. And so when you want to build something, let’s say a game, right before we started this conversation, I actually went to the Codex app and I made a quick prompt to create behind the crossing, a little like 2D game where I can start building. What also I love using with the Codex app when I’m in the
2:31 — flow is taking the Codex app like this and pop the conversation out on top of the screen, right? And so this way now what I can do is like if I’m like actually working on this game, I can keep iterating and have more ideas. Um, I don’t know what we want to do. Do you have an an idea, Peter, for what you would like to change on this game? Um, maybe add some more decorations, houses, trees, stuff. Make it more lively. Could you add some more decoration like trees? And I’m going to send this task and in
3:01 — basically a handful of seconds, Codex Spark will be able to edit and we’ll see the changes live. And there we go, we have already like new trees appearing. And I can keep on playing. This is insane speed, right? And uh so that’s why I’m so excited about about Codex. You can really have frontier models like GPT 5.4 that can take on very complex tasks like, you know, millions of lines of code to analyze or migrate. But if you’re in the flow and you’re really feeling inspired, you can reach for like the fast model even Codex Spark and you
3:33 — have this like insane speed of thought where you can really build anything. So this is just a quick uh tour for how to build with Codex. This episode is brought to you by Granola. If you’re in back-to-back meetings, you know how much work it is to take notes live and clean them up afterwards. That’s why I love Granola, the best AI meeting notes app in the market. Here’s how I use it. Granola automatically takes notes during a meeting and I can add my own notes, too. After the meeting ends, I use a Granola recipe to extract clear takeaways and
4:04 — next steps in the exact format that I want. Then I can just share the notes directly in Slack with my colleagues or even get Granola to share the notes automatically. Honestly, of all the AI apps that I use, Granola is the one that saves me the most time. Try it now at granola.ai/peter and use the code Peter to sign up and get 3 months free. That’s granola.ai/peter. Now, back to our episode. I’m really curious uh how you guys actually build products with Codex on the team, right? Alex, do
4:34 — you even write specs anymore? Or like do you get GPT to write the spec? Like which model do you use when uh to make the stuff work? Yeah, I think um we write very, very few specs on the Codex team, actually. I think uh a lot of the work is like, you know, let’s have the people closest to the metal making as many decisions as possible. And so, the only time that we’ll write a spec is if it ends up being a problem that’s like kind of hard to fit in one person’s brain. Right? And by the way, like one person can fit a lot in their brain now because they can do a lot. Like they’re
5:04 — delegating most of the coding. Mhm. Right? And so, one person can do a lot. But if it ends up being a thing where we’re like coordinating across a few people or maybe it’s like a really thorny decision that we have to make, then maybe we’ll write a spec. But the docs that we do write in these cases tend to be incredibly short. We’re like talking about like 10 bullets or something. And then that’s it. » Okay. Can you guys like show me how this works? Like you give Codex like a few bullets and then maybe it writes the actual requirements first? Like an empty file? » Yeah. Yeah, we could do that. And also, like one thing I want to show you also
5:34 — very simply, imagine like going back to this iOS app I was mentioning that’s like currently like finishing a task over here. Uh imagine you have like ideas for new features for that new project and you have some ideas, but you’re not exactly sure where to go. What is very exciting now with with Codex is that if I start actually talking about let’s plan the next steps, you can see that like Codex has automatically understood that I’m trying to make a plan for what we should build next. And so if I simply do
6:04 — shift tab, it will enter plan mode and if I say like what should we build? Uh like I can use Codex as a brainstorming partner to plan. And in this plan mode, it’s going to look at the code, it’s going to look at where we are so far in the project, come up with ideas, and then I can also bring my own ideas and and start steering the the model into a good plan. Mhm. And so based on that, you can see for instance as we speak here that Codex has some ideas based on like what what it’s
6:34 — looking for and and and the files. And so here, it’s going to actually prompt me for some for some guidance. Like what should we do? Should we like start working more on the Artemis idea that we were just doing? Should we just do a pass on reliability, a dashboard? Maybe we’ll say like, yeah, maybe a reliability pass is good. Wait, who should we optimize for? And so I can use Codex this way. And of course here, I gave no input. So in the case of Alex, you know, as a product lead, you I’m sure you would give a lot more guidance up front. But here, I’m kind of letting
7:04 — Codex take some ideas » It’s funny, I do this a lot, too. And drive them. Yeah, like often I’ll Okay, so there’s like various kinds of changes, right? There’s like the super simple change, you just go straight in, you just prompt it. Yeah. » Right? Um then there’s like sort of a medium complexity change, where maybe you’ll like reason about how to do it or ask for a specific plan. But something that I actually do is kind of like similar to this, where if I have like a vague idea, I might just go into Codex and just ask it to start like thinking about how it might solve a
7:34 — problem. I don’t even have a feature in mind. And then like you know, it’ll like go explore and like ask me some questions and like in my case, I often don’t end up even using that thing cuz maybe this is quite a complex change that like so you know I so there’s a digression here but like what code do PMs write is an interesting thing to get back to but or like maybe a complex change like I don’t actually want to be on the hook for like landing and maintaining that change but I’ll still go through the motions of like a plan mode and like exploring it and then that I just have a better mental model of what we need to do and then that becomes something that like not the plan itself
8:04 — but like just the thinking becomes something that I share with an engineer. I feel like so like to take that digression, like the designers on the Codex team write more code now than like was written by an engineer like 6 months ago. They’re like absolutely goated. Um but obviously the tool is a massive part of this. Um and you know uh the team was making fun of me for not landing that many PRs in the last year. I’m not going to give you the number but I’m like yeah I should be more um especially when you consider how many
8:34 — of those were very small tweaks. Um but I feel like we’re at a point now where like it’s not about can you generate the code. Um like the agent is amazing, you can delegate tasks to it. It starts to be a point of like what are you deciding to do that’s actually super important, like are we aligned on what this thing is becoming? And then on the other side of it it’s like how are we making sure the thing is really high quality. Like you know like some folks will proudly say like the entire app is vibe coded. Like in the case of Codex like the vast majority of code was generated by an agent but we still spend a lot of care and attention like thinking about
9:04 — the system and making sure it’s really high quality. And so that’s why for instance if there’s a really complicated feature I often will make sure that there’s like a more robust stable owner to own it. And I don’t think you want a part of the value of a PM is they can be like super distracted and like go around and so you don’t necessarily want PMs owning these systems. Totally. Yeah you don’t want a PM to like maintain a feature code, that doesn’t sound like a good idea. I think we’ll screw it up. Yeah. Yeah. Okay. And like yeah some of the other products out there are like, like I love the other products too but like you have to spend so much time learning them. I almost feel like if I’m
9:34 — not on Twitter I would have no idea how to use the other products. And like one thing I really love about Codex is like how simple the app is to use, you know? It’s like very intuitive and very simple. But there are like some pretty advanced features like skills and automations, right? Like do you guys use that stuff internally? Yeah, a ton. A ton. In fact, like I think skills are like the most interesting things that like the Codex app surface enables you to use. Like for instance, like imagine you’re pairing with designers that use
10:05 — Figma. Well, it’s amazing now to turn on the Figma skill to kind of pull details directly from the Figma files, all of the React components, the variables, and then like Codex will like implement this code accordingly. But imagine like you’re building an app, like maybe you want to share it and you want to deploy that to Vercel or Cloudflare or Render. Those skills are like right there and so you can just simply like tell Codex what to do and it’ll basically connect to this ecosystem of tasks.
10:35 — Mhm. It’s funny. I was talking to a friend like a couple nights ago. He was telling me that like he had like a ton of ideas to improve his product and he told Codex, “Just write all of these tasks on Linear so I keep track of everything as we go using that skill.” And at the end of it, he’s like, “Well, now I’m going to bed. Go ahead and implement all of these tasks that we just discussed and cross them off.” And sure enough, he woke up and everything was actually complete. That’s amazing. Coming back to your point about like the simplicity of the app, I think it could be interesting to share a little bit about how we think
11:05 — about designing it. I don’t know if that’s interesting. Yes. Mhm. Kind of like there’s something that’s really interesting about building in this space, which is that developers love just like automating tools for themselves, like building tools for themselves, automating parts of the work. And so I feel like a really important part of the product is that it’s like super configurable. Right? And so like for us, Codex, like the harness is open source. Like you can go deep in. Like whenever we’re building a feature, we start getting complaints on Twitter that the feature, which by the way is like not enabled in prod, is like broken because people are
11:35 — going in and like changing the code themselves or forking it to like get these new features working. Um but for me that’s like an awesome part of the product, right? And what that means is like the cutting edge of your users are just absolutely living in the future with us and pulling us into that future. Um on the other hand though, like if you only build for that, you end up with this thing that’s like nearly impossible to understand and you need to spend all day on Twitter like you were saying. And so we kind of have this view of like we are really careful about like what the core primitives of what we’re building are. Like that’s the place where this stuff will be written down
12:05 — and it’s not just like a vibe coded thing. It’s like we’re really thoughtful like, “Okay, how do we mostly let the product be almost invisible, get out of the way of the model and just let the model like and every time the model gets better, just do more and more?” Mhm. Um then from there, how do we package this in as configurable a way as possible for power users so they can figure out what it is? For instance, there’s an implementation of like sub agents that’s out in the wild right now. Um and people are like using it and experimenting and we’re learning a ton from them even though we don’t actually trigger that proactively in the product.
12:36 — It’s just like something that users can learn about and go use. Mhm. Um we learn from how people are using it and then from there we think about, “Okay, now how do we make that super simple for everyone else?” So like the Codex app is actually an example of this where around the time of like uh I would say like GPT-5.2 Codex in December, um all of a sudden like it was like incremental steady model progress, but we just kind of cleared this point where you could start delegating like way longer tasks to the model and it would just like one-shot it anyways.
13:06 — Yeah. » Um and so we started to see like people were already tmuxing like or, you know, for anyone who doesn’t know what tmuxing is, people are already like running many in parallel in the terminal. Um but we started seeing like crazy like things on social media like this is one picture of like Peter Steinberger, like, you know, the creator of OpenClaw, with like, I don’t know, like 18 terminals across eight like three monitors. So we started seeing people like using Codex in this very advanced way, we were very excited. We kept making sure the delegation worked well in the basic product like CLI, but then we were like, “Okay, like maybe the top 1% of engineers are
13:37 — going to work that way. How do we make this feel really intuitive?” And so then we got to the Codex app, which you launch it, it just feels super simple. It’s just like a chat. It’ll do work. But then you can start discovering, “Oh, there’s a sidebar. Oh, I can run multiple tasks. Oh, it’s like really easy for me to click between them. Okay, now I’m like being really effective myself.” And then it’s like, “Oh, there’s a skills tab. Let me like go into here.” And so we try to like make it so you it’s almost like playing a game. You’re just like discovering what’s next. Totally. And I think that yeah, we’ve always had this vision right from the get-go that like coding will take place into this like agentic delegation fashion. Like even when we
14:08 — started Codex almost a year ago, we were always thinking about this like future where as an engineer you’re working on multiple things in parallel, but candidly the models were not quite there yet, right? And I think we needed to see like the inflection point with GPT-5.2 Codex and beyond that to see the model be able to be like super thorough and work reliably for hours on end if not days. And then by that time you’re like, “Well, now this is a weird interface to have multiple tabs open in a terminal and just let them run
14:38 — for hours, you know?” And so then we needed to have this new surface and I think the timing for something like the Codex app became perfect. » Yeah. There’s two vibe shifts in Codex history. The first is like August. So, we launched this Codex Cloud product. It was a great idea. People are still super excited about it. Um but it was a little early. So, around August we shipped GPT-5, great interactive coding model. We were like, “All right, like let’s go solve the problem that the models can solve right now. Ship Codex CLI IDE
15:08 — extension.” Growth started exploding then. And uh you know, I remember it was like we grew like 20 or 30 x in like a few months, which is awesome. Uh and then the second vibe shift was around December-January, where we actually could get back to this vision of like delegating to models. Let’s go a little bit deeper into the development of the Codex app. So, um did you have like some sort of a do you have some sort of like annual roadmap like a year ago like, hey, in this time we’re going to launch the Codex app or is it more like you kind of saw the market doing this stuff and then you kind of prototyped a bunch of stuff? Like how how was this thing built? Okay,
15:39 — so it’s like neither. Um Yeah. and uh I got some really good advice um from a researcher here called Andre. And uh his advice for me was that at OpenAI you either plan near-term or long-term, but you can never plan medium-term. Mhm. Um it’s just too difficult. So, near-term is like up to 8 weeks from now. 8 weeks being the absolute maximum. What is a concrete thing that you can like motivate a team to like rally together around and get done? And this is something that we’re really good at at OpenAI, is like kind of like rallying a team around like a thing that we want
16:10 — to do. Yeah. » Um the other thing you can do is you can kind of have a vibe that’s like, you know, like a year from now we’re going to have models that are way smarter. They’re going to be able to do like, you know, I’m rewinding back a little bit in time now. You know, you you think cuz now like what I’m about to say is like obvious and it’s obviously less than a year from now. But you might be like, yeah, we’re going to have models and um we’re not going to want to lend them our computer to do work because then we can only do one thing at a time. We’re going to want like infinitely many models and they’re just going to be doing work independently, like validating their own work, maybe even deploying the code themselves and
16:41 — monitoring it themselves, and we shouldn’t even have to prompt them necessarily. Mhm. » you can kind of think ahead to like this kind of vibe. Right? Um and the in-between thing is just kind of awkward. So, the in-between thing is like a product roadmap. We just we basically don’t really have those. We have the combination of like a a sort of long-term direction and like things that we think bring us in that direction. So, for instance, in the case of the Codex app, like one of the strategic goals that we had was to dissociate ourselves from a specific workspace. Okay, so that’s a bit abstract. What I
17:11 — mean is that like if you’re using an IDE like VS Code, which is my favorite IDE, um you open VS Code to a specific workspace. Right? The folder, yeah. So, that’s a specific checkout of the code. Yeah, specific folder. Right? And even if you’re using Git worktrees, you can only open it to like one Git worktree at a time. And so, you basically can only work on one thing at a time. Right? And the same is true for like a CLI as well. And like because we know we have this vision of like we want people to be working with agents that they’re delegating to in the cloud that are just working independently, we know we need
17:42 — to get to a point where it feels really natural to be talking to like multiple agents at a time or even just one agent that’s like orchestrating multiple agents for you. However, we’ve also learned that if you start in cloud, it can be quite hard for the developer to get value because, you know, your tools aren’t there, you got to do environment setup, it’s like a little bit hard if like to get partial credit for a task cuz like maybe if the model goes halfway, you need to like jump in and course correct or just poke at things. So, we’re like, “Okay, we need a local experience that is separated from a specific folder, but yet feels super intuitive to
18:14 — work with folders on your computer.” And so, when we started the app, we had a bunch of this like vibes thinking up here like esoteric vibes thinking. And then we had a bunch of like prototypes that random engineers had built that were just like, “I wish we had an app and it was like this or that or the other.” And there was actually a hack week where like multiple independent people built different versions of apps. You might have even built one, I don’t remember. » Yeah. Um And so, the project when it got started, um the only thing that really needed to be written down was why we thought it was a good idea to build an
18:44 — app. The like there was no specific spec for the app. Um And you know, eventually we generated one by like through building, but really it was like quite contentious actually like should we even be building an app? Like the IDE extension is super popular. Should we just focus on that and improve the quality there? What about CLI? Like feels like CLIs are a thing. Um And then if we are building an app, like what is the point of building an app and where should we go? So, that’s kind of how these things start. Yeah. And luckily we had the great vision with the IDE extension at the time, which we had polished quite
19:14 — heavily, so that you can use it in like VS Code, Cursor, Windsurf, and others. And so, we took a lot of the learnings and the code base from that IDE extension to have a great starting point that was already robust. Yeah. Yeah, actually like the app shares a bunch of code with Well, it just shares code with the IDE extension. Yeah. » Um and under the hood, both the app and the IDE extension talk to the same core Codex harness in Rust that is open source and that the CLI also uses. So, there’s a lot of like sharing and like very intentional like layering of these primitives. But the decision to build the app
19:44 — was I mean, now it’s kind of obvious cuz like, you know, just using the Codex app is way easier than having a bunch of terminal windows open. But like the decision to build the app was basically it’s kind of like beginner-friendly and it’s kind of like you you just play with it. It’s like the best in interface to manage multiple agents at the same time. Yeah, I I would say like the way that we think like we’re very much like, you know, we’re very AGI-pilled. So, we’re thinking about like what is the future that we’re skating towards. » I see. So, Yeah. Um I would actually flip the order. It was more like, we know that we need to build an interface where it feels really natural to
20:14 — delegate to multiple agents because we know we’re going to have models that are ready for it. Or in fact, we’re already seeing people delegating across agents. Yeah. » So, we need an interface that feels really natural, that will scale really well to cloud, and we want all of that to feel like ergonomic. It shouldn’t feel like, you know, you’re crazily figuring something out to delegate to multiple agents at a time. It should just feel like obviously that’s how you want to work. Yeah. And it appeals, by the way, not just to like, you know, junior developers at all. It’s quite the opposite. Like even the most prolific,
20:44 — most senior um engineers, even at OpenAI, from Peter from OpenClaw to like Greg Brockman, they’re now using the app as the primary way to build. Uh so, this was very much this like agentic delegation vision coming to life, and it’s not just for like, oh, like the most sophisticated engineers will stay in the terminal. They’re actually moving to the app as well. Yeah, so hopefully So, okay, we keep talking about Peter cuz he just joined OpenAI and we’re like super excited. Uh you know, again, creator of OpenClaw. But, I don’t know if I told you this, but yeah, I went for a walk with him in like October
21:14 — um at Fort Mason, which is a place in San Francisco. And uh I I I didn’t like outright tell him that we were thinking about building an app, but I was like, you know, I was starting to like poke at this idea of like, you know, some kind of new interface that made uh delegation feel natural. He basically told me he would never use such a thing. Uh and then like last weekend he was like tweeting, “Actually, the app is pretty good. Yep, hell has frozen over. I now like it.” Um Yeah, yeah, I thought the peer too. If you got him to use the app, that’s like that’s like a major accomplishment cuz he has like 20 terminal windows open, so that’s like a huge accomplishment. Exactly. I mean, I
21:44 — I need to follow up with him. He probably uses both, but I don’t know. Yeah. Yeah. So, Alex, you you were like the only PM on Codex for a while, right? And and how many people does Codex have? Like, you know, 50 to 100 people or It sounds about right, somewhere in that range. Yeah, like we were we were like eight people back in May, sorry. Yeah. Uh Yeah, or something like that. I I you know, I don’t remember exactly, but uh we’ve just like grown really quickly since then. Yeah, somewhere in the 50 to 100 range is interesting. So, how do you like spend your time, dude? Like like what what what’s like a typical day
22:14 — like? Or is there a no typical day? Okay, so so in my case um I I was thinking about this recently because I realized that I don’t know how to answer that question. Um and I think what I realized is that I have these like different modes that I operate in. Um and uh you know, this is not this is not advice. This is just me. But, uh I think I have a mode like before, for example, we were shipping the app, which is just like straight-up execution, you know, obsessing over quality, making sure we aren’t like we’re looking around all the corners and like landing every
22:44 — little bit of thing. Um and that mode is like uh spending a lot of time in Codex, actually. Like, you know, we tend to use Codex a lot to like understand what’s happening. Like I use Codex a ton to understand like what is happening in Slack? Like what is the feedback we’re getting? I’ll have Codex just go and like summarize that, follow up, post it to Linear. So there’s like a lot of the like just understanding the state of quality using Codex. Um then there’s a lot of using Codex to understand what the like just things
23:14 — about the code and then using Codex to make changes because nowadays it’s like for a small change that’s like not building a new system which again I try to avoid, but like, you know, taking care of existing systems it’s like often faster to send a PR that is good and you’ve tested than it is to like communicate to someone uh and get them to prioritize that task when they have like 10,000 other things to do because we’re aiming to launch an app in like 2 weeks. Yeah, yeah. » Um so there is that uh and then you obviously there’s a lot of human side of just like cheerleading, rallying, but also being a critic of what we’re building. So um that is one mode that
23:46 — I’ve noticed. And actually you can tell if I’m in that mode if I’m on Twitter a lot, funnily. I don’t know why, but like after we push a launch I tend to get more on Twitter. Um and then there is this other mode which is like where for example, like now it’s like quite top of mind for me that we are at a stage where we have these amazing models. Like GPT-5.4 is incredible. Um we also have this app experience that is like even more popular than we anticipated and we now have it on all platforms including Windows. Right? And so now in my mind I’m like, “Okay, it’s time to like really like get back to
24:17 — like cloud and like invest more in that, right?” And so when we enter these kinds of phases I spend much more time like thinking about what to do and like understanding like what is the state of things. Um and so that’s kind of like a coordination-y mode where actually I am spending less time in Codex. Like I tend to be using Codex more for communication and less for writing code. Um so I have at least those two modes. There are probably more. Yeah. Like how much cross-functional alignment do you have to do? The Codex team is awesome. We do very little cross-functional alignment uh within Codex team. We just kind of like view ourselves as like
24:48 — intentionally a bit of a like pirate ship like team. Um you know, we even within Codex team, you know, now there’s like uh there’s me and now two PMs as of very recently. Um there’s a few eng leads. Uh although until very recently, like everyone reported to Tibo. But we kind of like all just like fuzz around together. Um and so there’s not too much alignment going on there. But increasingly um you know, I think a large part of building Codex is building this
25:18 — coding agent. And increasingly it’s clear or it’s probably obvious to everyone now that a coding agent is a really generally useful thing for other work that’s not just coding work. Like we see people using the Codex app for more than just writing code. They’re using it for tasks across the software development life cycle. And then now like we actually like the vast majority of OpenAI uses the Codex app even outside of technical works. They I just see the app up everywhere. And so um you know, that kind of thing realization where it’s like, “Okay, how do we help Codex be like useful beyond
25:50 — just people writing code?” That requires more cross-functional alignment because, you know, as OpenAI, we have ChatGPT. Um which many, many, many people use. So we have to be thoughtful about how we do that. And on my side also, developer experience we’re kind of like an extension of the Codex team now. Like we’re spending most of our effort on Codex. But for a few different reason, one of course is like it’s an exciting product and like developers love using Codex and want to make that better. And to Alexi’s point, like we have a few modes too. Like we are in the trenches with the Codex team to prepare the launches, to prepare the assets, like how to make the most of Codex. And then
26:21 — post launch, we try to educate developers on how to use Codex for these like variety of ways. But the other lens through which it’s very interesting to us is that when you look at the broader OpenAI platform, we have like millions of developers today also building on the API, the models, using different modalities from image gen to Sora to speech-to-speech. And guess what? The best way to build has become Codex as the entry point, right? Like if you rewind just a year ago
26:51 — or even last summer when we introduced GPT-5, we had to write a lot of the guides around like how you prompt GPT-5. Yes, it’s a reasoning model. It’s quite different from a GPT-4 model. Well, now what we try to do is like even for those use cases, we try to teach developers on using Codex and skills. Like for instance, if you had an integration you want to update, you should probably use Codex and a skill. And guess what? Codex will definitely take care of that for you. So, we are also very very cross-functional and we are seeing Codex as the cornerstone of everything for the developer platform. Got it.
27:23 — One one interesting thing about like how we work together is like I mean, effectively I think the best part of working on Codex is the community, you know, online on the internet. And sometimes in real life and at events, right? And we kind of just like anchor everything about that, you know, so it’s like, okay, launches. We’re very launch oriented. Like when are we shipping something? We’re very feedback oriented. Like when is there feedback from the community and like let’s fix that and like communicate that. And so, we’re all like quite online. And I think like um yeah, like even for example, thinking
27:54 — towards the Codex app launch, like we were working super closely with Dom, like Dom and Romain’s team on DevEx. He was basically like helping us coordinate like actually quite a wide alpha with like a bunch of users and like building with those users to get feedback, to build skills, like um skills for the app to use at the same time and, you know, documentation, everything. And so, you know, I think this is kind of this unique strength we have as a Codex team. Again, because we are open source. And so, kind of like because we’re open source, we kind of
28:24 — just found ourselves being incredibly open about everything we do. And I think the community really rewards it. Yeah, dude. Like building with the users in the community is like the best part about being a PM. Yeah, for sure. It’s the best part. Just like talking to them every day. Yeah, now we have like Codex ambassadors even like in many many like cities and countries who are kind of like spinning up their own events to kind of teach their own communities locally cuz I would love to be in every city, but we cannot be. Uh but it’s amazing to see like the energy and the enthusiasm from the community to like set up these events,
28:55 — these hackathons, and like build together. This is awesome. Yeah, make me an ambassador. I’ll throw some events. Yeah, that sounds good. All right, yeah, you’re signed up. So, let’s talk about Peter a little bit. So, I’m like an early adopter of Open Claw, and um it’s a little bit janky, but like it’s done so many things for me. Like the other day, because it has memory of our conversations, it gave me like a very vulgar pep talk for like 3 minutes, and I was like, it was like the most insightful thing that I’ve ever heard from an AI. So,
29:25 — like how are you guys integrating Peter into the team, and like, you know, this like personal agent vision? Is that part of like what Open Claw is doing, or Yeah, how do you guys think about that? Two things there. I mean, I can only share so much here, but um the first thing is that he is an ultra, ultra power user of Codex. And like, you know, Open Claw was very much built with Codex. And so, he um is just like energizing the team with like feedback and like basically work to like help improve Codex. That’s his side job, but
29:55 — he is doing it, and we’re really excited about it. And the other stuff that I can’t share super much about yet, but he’s like really just like helping us build the like the next generation of personal agents, but like into ChatGPT. Yeah, that makes sense, dude. Yeah. What I find fascinating about what Peter has done is like, obviously, like I’ve known Peter for a while, and uh everybody has like seen this glimpse of the future when they start to play with Open Claw. But what I find amazing is like Peter had seen this vision for quite some time, and if you rewind all of 2025, he
30:25 — has built more than 40 open source projects last year, but all of them kind of align with one vision, which was I need a command line interface to access my calendar. I need a command line interface to access my tweets and my Gmail. And through all of these projects, he effectively made like, you know, this vision manifest around this idea of skills and command line tools that we use coding agents for today. And it’s not going to be just coding agents, it’s going to be like any kind of personal agent. And so,
30:56 — uh Peter is going to be fantastic in giving us feedback along the way for having built all of these tools that are now part of the Open Claw ecosystem. Yeah, I’m like, really, he’s just one person and he built this awesome open source community and yeah, it’s like, I maybe don’t open any other app anymore. I just talk to my little bot. So, it’s a huge difference. Wait, what are you having it connected to? Do you have it connected to everything? Uh I have it connected to many things, man. Like it has my banking information, it has like uh my YouTube information, it has like uh voices that I’ve activated, my
31:27 — calendar, my Google stuff. And yeah, like um you know, when I’m in bed and my wife is like, “Who are you talking to?” I’m like, “I’m talking to my Open Claw bot, you know?” Yeah, it’s giving me ideas. Yeah, but like it is true that there’s a lot of like uh grifters out there charging $5,000 to set up Open Claw. So, like yeah, if you guys can make it like mass market friendly, that’s going to be huge, you know? So, We’re on it. We’ll report back.
31:57 — Yeah. Yeah, yeah. Okay, well, let’s wrap up and talk about like some of your hot takes, Alex. Like um and maybe I’m making this up, but like I believe you said something about like how most teams don’t need any PMs anymore or something like that. Well, let’s make it spicier. What do you think, Romain? Do we need PMs? I think what I find amazing with these tools is that like it’s not even just PM or no PM. In my view, it’s like almost like every career ladder is starting to blur, you know? It was like you have a designer here, you have an
32:27 — engineer here, you have like a PM here, and maybe you have a ratio of sorts of like the golden ratio of all. But like now, if you’re an engineer, sure you’re like more productive, but if you’re a designer, you have some superpowers to like become more technical. If you’re a PM and all you did before was like writing strategy docs, well, now you can just prototype. It doesn’t mean that you have to be responsible for that feature for billions of users, but sure enough, you can show your team a glimpse of that vision by like
32:57 — actually like building it. And so, I think like that’s that’s what I find fascinating to me. It’s like all of the lines between career ladders are blurring and we’re all builders all together. Yeah, this resonates. Um Yeah. Okay, so I I think I don’t think I’ve I’m trying to remember like what have I said? I feel like that’s somewhere on the internet I said that I think it’s a red flag if a startup has a PM when it’s like less than like 20 engineers or something. Maybe maybe I said that. I think
33:27 — like kind of like what you said, like all these roles are blurring together. Right? Like a designer can do more engineering and an engineer can do more design. Um a PM can do more building. Um but you know, also engineers often they need to be focused, right? So, a lot of why they aren’t like, I don’t know, triaging tasks or doing some other kind of like the project management side of PMing Mhm. Um might be because they just need to spend time coding, but now that that’s really easy, you can just ask an agent like Codex to like go like analyze the feedback and prioritize, you
33:57 — have more time. And so, I think everyone’s able to do everyone else’s jobs and like Scott Belsky has this idea of like collapsing the talent stack. Yeah. I like that idea. I think it is happening. I think the fewer people you need in a room to do anything, just the better that thing goes, the more pure every decision is. Yeah. Um so, then the question is like well, what is left for PMs? And I think that there are many PMs who should actually convert roles. Right? Like if you’re a PM who kind of just like always wanted to be an engineer, but maybe you just like you were very
34:27 — good at managing people, but you were like not that good at engineering. Like maybe now you should become an engineering manager. You know, and with a coding agent, like that’s fine. And maybe that’s just a cleaner role for you. I think there’s an analogous version where like a different PM might just want to be a designer now. Yeah. You know, just be closer to building. But I think ultimately what it comes down to is interest. I think interest and agency are like the most fundamental qualities that remain important for humans in a world with AGI. And so I for me that’s kind of what I
34:57 — end up thinking about. Like, if you fundamentally are more interested in writing code and like you just did PM work because it was like someone needed to do it, now you should delete yourself and become an engineer and just do the same thing from an engineering standpoint. Same for design. But if you are like fundamentally like most interested in like spending a lot of time with users, even if it takes you away from building, right? Or like trying to look around corners and understand where the market is going, etc. And if you are in a large enough team where there’s already enough
35:27 — engineers, then I think maybe there’s still room for a PM there. Um but yeah, I think it really comes down to like what are you most interested in? Okay, and maybe I’ll add one thing, which is like I still think every problem needs a human that’s accountable for the problem area. But I just don’t think that that human has to be a PM. Yeah. And I think it depends a ton like what you said on the nature of the product, right? Because we were lucky enough here to work on Codex, which is very much like a builder developer product. And we are the best users ourselves and we pair with the community thanks to this open
35:57 — source thing. But even if you rewind like a decade to when I was at Stripe, like Stripe reached 250 employees with zero PMs. Even without any AI tool. Why? Well, Stripe was just an API and like we were all engineers and we knew what a great like API would look like. So we were building the API we always dreamed about, right? Like we wanted Stripe to be so elegant, just a few lines of code. If you’re working in a different field and you want to carry that customer obsession, you know, you may need more PM time
36:28 — uh to spend time with customers when the vertical or the industry, the space, the problem space is different. Uh we’re lucky here to work on Codex and kind of like building the tool we’ve always wanted to have. Yeah. But like let’s say, you know, in this example where it’s like maybe, you know, it’s a field or the product is serving users that engineers have less intuition for. Like PM is just a label for like someone who can design and code but is
36:58 — very interested in users. You know what I mean? You could equally well have an engineer who’s very interested in users. So I think these labels are kind of losing their meaning. Yep. A little bit. But that’s okay for now. It’s like it’s just a bit messy. Yep. That’s what I found on my team too. I feel like the best engineers don’t ask me like, “Hey Peter, what should we build next?” Like they go off and talk to users and figure out what to build and then we have a conversation about it. It’s like it’s kind of like everyone’s on the same page around this stuff, you know? That’s how I would say the Codex team works a lot, right? Like so many features that you’re using today with the Codex app came from great ideas from
37:28 — engineers, completely bottoms up cuz they wanted the feature for themselves. Yeah, but I mean I would say I don’t know. Like I think okay, there’s a very strong profile of engineer that I love working with who’s just like loves like hanging out with users and thinking about what to build. There are equally well incredibly strong profiles of engineers who just like are insanely fast, insanely good at building systems and thinking them through and like have zero interest in hanging out with users. And I think there’s plenty of room for those people too, right? Like again, that’s like kind of my view of like this world with AI is we can all just become more opinionately ourselves. You know what I mean? Like
37:59 — like be yourself. Like AI and the team around you maybe will like just like fill in your you know what you don’t want to do. That’s a great way to put it. Dude, but I do think like the builder label is very important. Like I feel like every PM wants to be a leader and like dude, the traditional career ladder is just like you become like this VP or something and then you don’t have time to build anymore, man. You’re just like in product reviews all day. Like just like giving some feedback here and there. And I feel like a lot of PMs don’t want that, dude. Like I I I don’t know if you you want it, but like I I want to stay close to users as you actually ship.
38:30 — Yeah. Totally. Yeah, I mean I don’t actually view PM as like a leadership position. I view it as a fill-in-the-gaps position. Occasionally that might require leadership. Although even then the leadership is probably just like helping people get aligned and less like being some genius who came up with the right strategy. Yeah. Um I will say one thing for sure, like the best PMs at OpenAI are incredibly in the weeds. Um and I think joining OpenAI in a senior leadership position is like very challenging because it’s actually still important to
39:01 — be in the weeds. So you somehow need to find time for like uh you know, senior leadership, but also to get in the weeds at the same time. So always better I think here to join directly in the weeds. Yeah, yeah. Okay, cool. Last question, man. So you finally hired another PM. I think his name is Rohan or something, right? Um and like what kind of qualities do you guys look for when recruiting people to the Codex team? You know, other than needing to be Codex power users, like what kind of qualities? Yeah, let’s both take this. Uh yeah.
39:31 — I think I mean look, I said it before. I’m going to go back to agency. Mhm. Like people who do things is like literally the most important thing. Like at OpenAI, and especially on the Codex team, like we’re intentionally not a team where like you’re going to join and it’s going to be like, “Hey, here’s like 12 tasks to do in increasing order of difficulty and like go for it.” It’s going to be more like, “Welcome.” Okay. That’s just it. It’s like, “Welcome.” You know? So I think people who uh are self-starters and do things and have like energy and ideas for what to
40:01 — be done um and don’t mind disagreeing with the existing ideas cuz they’re probably wrong cuz we probably made those decisions by accident. Yeah. Um and who will like Now I’m just describing like perfect teammate, right? Who will like absorb like any incremental like scope or accountability for things that are unknown um I think is ideal. So like that is that is sort of the broad meta things and then I think if you’re just trying to think about what role makes the most sense like fit anything technical engineering. Yeah. Yeah. Agreed. On my side like on developer experience what’s uh
40:31 — what I’m looking for is usually like people who have high agency obviously, very technical obviously, mastering the tool like Codex, but who also have this like passion for like spending time with developers and builders and like sharing their knowledge, you know? We just announced this week for instance that like Thomas uh who built the open source Codex monitor uh is going to join my team this month and it’s great because he’s someone who’s like very creative, very prolific with Codex, but also loves sharing how he’s
41:01 — building with it, you know, and it’s kind of like we need to bring uh millions of developers to this bright future of Codex. I think like agentic coding is changing everything in terms of like how we’ve been always reflecting on how we build software and build apps and products. And there is so much potential to show the world like that anyone can build anything and like teach them along the way. So that’s kind of like what I’m looking for. Let me know if this is wrong. In my head the role description for DevEx is like
41:32 — very good engineer who is also very good at Twitter. Yeah, you could say that. Okay, I got half of that. I got half of that, I don’t have the other half. The little asterisk I will add is that like Twitter is like outstanding for our communities here, and when you travel in some parts of the world some developers are not as much on there um like in Europe and some other places they use LinkedIn or they use some other places. So we just have to have this little asterisk of like thinking about the worldwide developers. Good on socials, okay. Uh but good on
42:02 — socials for sure. Yeah, and love spending time teaching and educating. Yeah. And I feel like agency you can kind of tell even before they go through the interview process, right? Like are they shipping stuff online? Do they have like side projects? Exactly. So like, you know, interest in like working together, for me it’s like, is there a link? If there’s a link I always click it. Like you know I guess I maybe I look to see if it’s like a bad link, but no, I pretty much always click it and I’m always curious. And then if there’s like some spiel with
42:32 — ideas I usually always read that and then I don’t know how bad this or toxic this is going to sound but if it’s like some explanation of like why they’re interested in the role and like you know their CV and stuff like I’m much less likely to read that than like their ideas and and what they built you know. And then I I never I don’t someone asked me this the other day and I I realized like I had no idea where like people went to college. You know. Who cares man? Yeah who cares? Like I’m so glad we live in a world where all these stupid credentials don’t matter anymore. Who cares? Fan college? Just
43:02 — like show me what you built. Yeah. Cool guys. Well thanks so much, man. I love Codex and yeah, I’m going to build products and stuff this weekend. It should be good. Great. Have fun. Can’t wait to see your feedback. Thank you so much Peter for having us. All right guys. Thanks.