Paul, Weiss Waking Up With AI
Good Vibes Only: Coding With AI
In this episode, Katherine Forrest and Scott Caravello cover new AI model releases and take a deep dive into vibe coding, explaining how natural-language-driven code generation is reshaping software development and why security oversight remains essential.
Episode Transcript
Katherine Forrest: Hello everyone and welcome to Paul, Weiss Waking Up With AI. I'm Katherine Forrest.
Scott Caravello: And I'm Scott Caravello.
Katherine Forrest: And, so, Scott, you know, we're hitting your six month anniversary for the podcast.
Scott Caravello: That's right, that's right.
Katherine Forrest: And, it appears that you're a fairly popular person on the podcast. I think one of the problems, for me, is that you've overtaken me.
Scott Caravello: Well, you know, I appreciate you saying that, but we all know you're the real talent here, right?
Katherine Forrest: No, no, no, no, no, no, you've got that like laughing-little-voice there. But OK, so as a result, you have to go through a little bit of pain now that the podcast is so successful. So, I just want to ask you about the success of the Mets.
Scott Caravello: Oh… how much time do you have? But no, right, that is how I started off the first episode I was on, I think. It's pretty rough out there, Katherine, you know, tied for the worst record in baseball.
Katherine Forrest: Yeah? Yeah, tell me about that. Tell me about how bad the Mets are. I'm a Yankees fan. I'd like you to spend a few minutes about how bad the Mets are. What's their pitching like?
Scott Caravello: Pretty bad, but it's really been the offense that’s been brutal. I think they've scored less than two runs in 10 of the like 28 games and they—
Katherine Forrest: How's their closer?
Scott Caravello: I actually… well, “are you just saying because it's Luke Weaver?” No, no, I'm not.
Katherine Forrest: Have they needed a closer?
Scott Caravello: No. Nine and 19. They had a doubleheader against the Rockies yesterday, lost both and scored one run across the two games. So, you know, 2027 is just around the corner.
Katherine Forrest: All right. So that's what you get, Scott, you know, success on the one hand and… no success on the other. Hey, so we've got a couple of things going on today and we're gonna briefly talk about two models that dropped last week, as they say. I don't know if they talk about like “dropping” a model. Maybe they don't drop a model. Maybe you like drop a video or something, but you don't drop a model, right? It sounds like it might be breakable or something.
Scott Caravello: I think that's right. I think we can just go with release. That works. Yeah.
Katherine Forrest: Okay. So, we've got the double model release last week, and then we also have our topic, which is one of the ones that I had wanted to do for some time, which is to explain to folks who are hearing the phrase “vibe coding” what that is, because it's not necessarily intuitive, but it does sound like it's supposed to be so easy to understand that people are embarrassed to ask: what is it?
Scott Caravello: I agree. And, it's also worth getting into just because of how much vibe coding has actually changed since that term really, like, came onto the scene a year or so ago.
Katherine Forrest: Yeah, it is that reason actually. It's only about a year or so. Let's just, sort of, do a brief flyby with these two models and then we'll come back to them. We're recording this episode on April 27th and last week OpenAI released its GPT 5.5, nicknamed SPUD, and two weeks ago, from today, Anthropic released its Claude Opus 4.7 and, as you can expect, you can get the system cards for both of these online. Oh, I want to do one more! Hold on. I want to do one more. DeepSeek released its V4. So, we actually have three. So, we've got the GPT 5.5, the Claude Opus 4.7, and the DeepSeek V4. And, so, there are system cards for all of these. And, you know, we're seeing improvements across the board with these highly capable models. And, you know, for instance, on one benchmark for agentic coding, which is coding for agentic AI, with complex software engineering tasks, the Claude Opus 4.7 actually increased almost 9% over Claude Opus 4.6. So, that's a big jump.
Scott Caravello: Yeah, then GPT 5.5 actually, I don't think it scored as high on that specific benchmark, but it did hit a leading score on another popular coding benchmark that requires planning and iteration and tool coordination for coding, and that came in at about 82.7% accuracy. And if you really want to drive home just how impressive that number is, that's actually a higher accuracy score than what Anthropic reported for its Mythos preview model, which has been all over the headlines and which we covered in a full episode just a bit ago, and which hit 82.0% on the dot.
Katherine Forrest: Right. You know, we've got a lot more to come on these new models. And we'll also talk about DeepSeek’s V4. We'll come back to these, you know, in the next week or so. But let's move on to vibe coding, because actually, interestingly, not some of these models necessarily, but some of the tools that go with some of these models are vibe coded. So, let's get into it.
Scott Caravello: Sounds great. So, vibe coding, as I think you previewed, Katherine, a lot of people might've heard of the term, but really what does it mean? What's the point of it? And why has it, sort of, taken over software development over the past year or so? And vibe coding was actually named Collins Dictionary's Word of the Year for 2025. Although, you know, I always find it kind of strange when they say something's the “word” of the year when it's two words, but that's… neither here nor there.
Katherine Forrest: Okay, hold on, hold on. I didn't even know there was a word of the year.
Scott Caravello: Really?
Katherine Forrest: Like, who knew that there’s a word of the year?
Scott Caravello: Well, so I think it's a sort of thing like, you know, when you're in the elevator in our building and it has the screen with the little facts or whatever.
Katherine Forrest: Captivate.
Scott Caravello: Yeah, exactly. Almost always you'll find the dictionary word of the year up there at some point in the fall or winter. So keep your eyes peeled.
Katherine Forrest: Wait, are you one of the people who actually reads Captivate?
Scott Caravello: Oh, absolutely. I mean, there's no shortage of interesting and bizarre stuff that gets put on there. I am the audience. I'm the target audience.
Katherine Forrest: Including Word of the Year. Okay, all right, okay.
Scott Caravello: Exactly, exactly.
Katherine Forrest: So, vibe coding was named the Collins Dictionary Word of the Year for 2025, and apparently it may or may not have been on Captivate.
Scott Caravello: Correct. I wonder if there's someone I can email to find out. But, anyway, it's something that Merriam-Webster has also picked up as a slang entry. But vibe coding also raises some significant security issues that are worth talking about as we get into what it's really about. And, Katherine, I know you had done an episode on AI in coding last summer. But as I sort of previewed, the landscape has really changed a lot since then. So, it feels like it's time to come back around to the topic.
Katherine Forrest: Absolutely. And, in that episode, we talked briefly about vibe coding. And I think we actually referred to it as the TikTokification of coding, where suddenly anybody could do things in terms of coding, they could actually code things that previously had required real expertise. And that's in fact what happened. So, let's start with some of the basics for those of you who didn't listen to that episode, that extraordinarily exciting episode, from last year. So, the term “vibe coding” was coined by this person named Andrej Karpathy, who was an OpenAI co-founder, and also he was a founding engineer behind Tesla's autopilot program. And I love my Tesla self-driving functionality, as you know, use it all the time. And in a post on X back in February 2025, he essentially invented the phrase vibe coding. And he described it as a kind of coding where you “fully give in to the vibes, embrace exponentials, and forget that the code even exists.” Now, by the way, I just want to tell you I don't think I've ever given in to vibes. Right, like that's not me, Katherine Forrest. I don't know that I've ever given in to vibes. All right, and I don't even know what embracing exponentials means and I have often forgotten that code even exists. Okay, so I just want to put that out there for the audience, but that's the way this was described. And he was saying essentially that instead of writing code line by line, like a software engineer, you would describe what you wanted in plain language through an AI query bar, essentially. And the AI model that you were entering your query into would essentially generate the code for you. So, you know, you could essentially accept the code in detail as you received it, or you could, you know, add on to it, revise it, reframe it, et cetera, et cetera, et cetera. And then what a lot of people do is once they've got the code all written out, they would just sort of paste it, you know, sort of cut and paste it into another application.
Scott Caravello: Right. But, so, that's still a fundamentally different approach to traditionally building software. In traditional programming, after designing the code, a developer writes specific commands, understands every piece of logic, and manually works through and debugs issues. And, so, with vibe coding, though, the developer becomes more of a manager or director. And I think there's actually another quote from Karpathy that drives that point home. And he said, “it's not really coding. I just see stuff, say stuff, run stuff and copy paste stuff. And it mostly works.”
Katherine Forrest: All right, okay, well those words I actually understand, although, it's really pretty impressive that that… how did you pronounce his last name? Kar-path-y?
Scott Caravello: Karpa-thy.
Katherine Forrest: Karpa-thy, Karpa-thy, okay, I was saying Kar-path-y. Our audience will forgive us. But, you know, to say, I just see stuff, that actually is interesting because it's suggesting that he actually is visualizing what it is that he wants to create. And then he says stuff, which is he's actually putting into the query bar of the AI tool what he wants. Then it's creating the code, he's running it, and then he's copy and pasting that code, and then he says it mostly works. So, it's really worth noting that Karpathy—how did you pronounce his name again?
Scott Caravello: I said Karpa-thy, but I think, you know, now that you're really drawing out the difference, like, it's got to be Kar-path-y, right?
Katherine Forrest: Well, I don't know, maybe one of our audience members will tell us, but it's worth noting that he originally described vibe coding as something that was best suited for throwaway weekend projects, but it has turned into so much more than that. There are businesses that make significant portions of the code that they use to run various functionality with vibe coding. And look at Moltbook, which we spent two entire episodes on. That's a great example of vibe coding. And you know, the phenomenon now is way past weekend work and deeply embedded in coding generally.
Scott Caravello: Yeah, and I think maybe it would be helpful to share a couple of statistics that make all of that clear. So for example, take Y Combinator's most recent cohort of startups.
Katherine Forrest: So, talk about Y Combinator. Tell us what that is.
Scott Caravello: Y Combinator is an accelerator and venture capital fund. So, basically they have these cohorts each year, which are a group of startups that they provide funding and support for. And, so, the code bases for a reported quarter of the startups in their most recent cohort were almost entirely AI generated. So, that's really significant. And then Google's CEO said in a blog post, not long ago, that today 75% of all its new code is AI generated, which is up from 50% just last fall. So, not that long ago, that's a pretty staggering increase. And, then, finally, Anthropic, who we talk about all the time, even at the start of this episode, has essentially said that the majority, about 70 to 90% of its own code, is now written by Claude Code, its agentic coding tool. And the head of that product, Claude Code, Boris Cherny, confirmed that Claude Code wrote all of the software behind Anthropic's Cowork project, which is an agentic tool used for a lot of knowledge work. And that was done entirely with vibe coding and in under two weeks. So, that's Anthropic using its own AI to build more of its own AI.
Katherine Forrest: You know, it's actually, I would love to really understand how that vibe coding was done because it was so sophisticated. And I think of vibe coding as really what Kar-path-y, or Karp-athy, was saying, which is, you know, you imagine stuff, you come up with it, you conceptualize it, you either type it into your AI prompt area or you say it orally and use the microphone functionality and then it gets created. But, it was really so much, it's got to have been so much more than that when you're using Claude Code to write Mythos, right? I mean, it's just, it's a lot. I mean, I don't know how iterative it was. That would be really interesting to know how frequently Claude Mythos came back to the developers and either presented itself and the developers then iterated it or just whether agentically it did a lot more just from the get-go.
Scott Caravello: Yeah, but just to that point, right? I mean, it shows how much the use of AI coding features is accelerating because there's just no way to build and develop something that advanced, that cutting edge without, you know, both that incredible increase in capability and also just much more of an investment of time and resources than that casual project that the name originally implied. And so I think, Katherine, we could talk about how Karpathy had gone on to describe the phenomenon going from vibe coding, which he described as already passé, to agentic engineering, which is where we are now.
Katherine Forrest: Agentic engineering with a process that has some similarities to what we've been calling vibe coding, but actually has more sophistication built in. So he addressed that evolution actually in February 2026, almost exactly a year after his initial post. And he said that vibe coding with large language models had gotten so much smarter that it was increasingly a default workflow for professionals. And, so, that's where his agentic engineering phrase came in. So, that was just in February 2026.
Scott Caravello: Which really, really shows how much better the coding capabilities have gotten to go from that vibe coding in 2025 to that agentic engineering in 2026.
Katherine Forrest: Yeah, you know, and I want to pause on that term, agentic engineering, because I assume that, you know, when we used the phrase vibe coding with Mythos just a few minutes ago, it really was agentic engineering because the words in that phrase tell you something useful about how the vibe coding has evolved. It's agentic, number one. The engineers are no longer directly writing much code. Instead, they're overseeing a bunch of autonomous AI agents who are doing it instead. And the engineers are people, humans, who might be directing the code, or what I'm going to call steering the code, in certain ways and overseeing the agents, but they are really only barely providing any oversight, and there's like little agentic engineer bots that are running around on the inside creating this stuff.
Scott Caravello: Yeah, absolutely. And then I think also just to go back to one or two more stats to sort of illustrate it, we can look at Anthropic's internal research again. In a December 2025 study, it surveyed 132 of its own engineers and found that employees reported using Claude in about 60% of their daily work with a self-reported productivity boost of around 50%, which is about double what it had reported just a year earlier.
Katherine Forrest: Well, it's a huge productivity boost. And the Claude Code usage data show that the AI was autonomously handling more and more complex tasks. Just six months earlier, Claude Code could typically complete about 10 actions before it needed human input. But by the time of that study, the one that you'd mentioned, the December 2025 study, it was completing around 20. And so it's doing more and more on its own. And some people measure it in terms of sort of minutes or days or now even weeks. You know, between December 2025 and today, we know that that's got to have improved even more.
Scott Caravello: Oh, totally. But, you know, it's also not like it's only the tools from the major AI labs that are driving the shift.
Katherine Forrest: Right. There's a bunch of popular startups contributing to the market for AI coding. You know, Cursor is one of the leading AI coding startups. It raised over $2 billion in November 2025 and tripled its valuation to almost $29 billion. There's another company called Lovable, which is a vibe coding platform out of Stockholm. And it actually hit $6.6 billion of valuation after raising an additional $330 million in its last raise in December 2025. So, these are big numbers.
Scott Caravello: But, and you know, because there's always a but, with all of this momentum there comes risk. So, Katherine, do you want to say a little bit about what those risks are?
Katherine Forrest: We've talked about sort of issues, I think, with coding before and using AI to code before. And security is the number one risk. And it's a concern that's pretty easy to grasp… when you've got non-developers or developers who are increasingly generating entire chunks or even full applications using a natural language prompt and actually using a model that may be trained on snippets of code that it's getting from other places, which is where it's learned to code, it may have embedded errors. It could actually have an embedded error from the code that it was trained on or it could have an embedded error just because it happens to put the code together wrong itself. So vulnerabilities can slip through.
Scott Caravello: Yeah. And, you know, I know we talked about how it's gone from that casual vibe coding to agentic engineering with sophisticated teams running, you know, multi-agent systems, but that isn't meant to distract from the fact that you do still have plenty of people who are building apps and products in that casual way, who might not have the typical software engineering expertise. And there that term vibe coding is very much the appropriate one to use.
Katherine Forrest: Right, and the data on this, which comes from a couple of different places, paints a pretty interesting picture. So there are various studies, but somewhere between 40 and 62% of AI-generated code has been found to have some kind of security vulnerability. That does not mean that it's a vulnerability that would really expose the business to a great deal of risk. It could be something that's very minor. But it does mean that you've actually got to really test, try to find the bugs and then patch. You can't just, you know, vibe code and then let it go. So there are some researchers at Georgia Tech who launched what they call the “Vibe Security Radar” to track vulnerabilities that are related to certain vibe coding tools. And in March 2026 alone, they assert that they've found at least 35 new Common Vulnerabilities and Exposures entries, those are called CVE entries, that were the direct result of AI generated code, which was up from six in January. So again, I don't want to overstate that because vibe coding is extraordinary and it can provide businesses with such valuable software. But it does suggest that you've really got to look for these security vulnerabilities.
Scott Caravello: So when we tie all this together, I think that puts together a pretty clear picture for us that, on the one hand, vibe coding is absolutely democratizing software development and bringing costs down, and it's a really, really fantastic technological advancement. People who never would have been able to build applications can do it in minutes and days. And that is exciting. But on the other hand, you know, when done casually without expertise and the right review procedures, it can create some issues.
Katherine Forrest: And so for organizations, the message is: AI coding tools are powerful, they're capable, they can do a lot, create great efficiencies, but pair it with a security review, pair it with a governance framework. All of that critical human oversight is really necessary. So, I think that's about all we've got time for today.
Scott Caravello: Well, I'm Scott Caravello. Katherine, until next time, don't forget to fully give in to the vibes.
Katherine Forrest: All right, well, I just want to tell you that I hope that the Mets have a need to really use their closer soon.
Scott Caravello: Well, you know, on the flip side, maybe it'll be easier to get the firm tickets this year. So, we gotta find a silver lining.
Katherine Forrest: All right, signing off. We'll see you folks next week.