Transcript: OpenAI Image 2 is Nuts. Here are 10 Ways to Use it.

0:00
So, OpenAI just dropped a new image model called ChatGPT Images 2.0 and it is really good. Everything you're seeing right here in this video, all of these images were generated with this new model and it is insanely impressive. It's really good with text. It's really good with realism. It's so good, in fact, that it has been ranked the number one image generation model, and it beats Nano Banana 2 by 24 points. There's never been a gap this large, at least according to arena.ai. Now, this is crazy if you just look at it. So, Nano Banana 2 is the one that I use for everything. So, what I wanted to do is actually get my hands dirty and compare
0:31
these two models. I've got tons of different matchups here. So, I'm going to show you guys which one I think is better for which use cases. And then we're going to dive into some specific GPT Image 2 use cases that you can go do right now. So, let's not waste any time and just get straight into it. We have 30 different matchups. We fed them both the same prompt and we're going to put them side by side and you guys can help me decide which ones you think are better. But, funny enough, you know, GPT is OpenAI, Nano Banana is Google. So, I'm going to have Claude Opus 4.7 acting as the judge and telling us which one it thinks did best at each of these categories. So, let's just get straight into this deck here. All right. So, for
1:02
all of these, GPT Image 2 will be on the left and Nano Banana 2 will be on the right. So, this right here was a vintage 1960s movie poster for a film called Neon Shadows. If I was to choose a winner here, I would be taking GPT Image 2. All right, this next one is a clean, modern infographic titled AI image model comparison. Once again, I would be going with GPT Image 2. This one just looks a little bit cheap. Even though the text is all fine over here, this one looks like kind of a template. And this one feels like it was actually built for us. Okay, here we have a premium product label for a coffee bag that reads Ethiopian. I'm not even going to try to
1:33
pronounce that. And honestly, here like this is one where I might lean towards this one. I just don't think that there's enough of a difference yet to pick, at least in this specific scenario. So, they're both good. I would probably pick Nano Banana 2 here. Okay. A candid photo of a 35-year-old woman with light freckles sitting in a cafe. I'm going to go with the left side. I mean, this looks so much more real. This looks like something you might shoot on an iPhone, whereas this one sometimes just looks too good. Like the lighting and the blurriness, it just looks a bit too good. So, I would be choosing this one over here. Okay. So, I'm going to speed this up a little
2:03
bit. You guys don't need a full breakdown. Let's just fly through these a little bit. I'd be taking this one. Like I said, it just looks more real. This looks too perfect. Professional headshot. Honestly, I think these are both pretty solid. I think it depends. Like I said, these are both pretty good. Now, this one's tough because they both sort of look, I don't know, like fantasy-ish. And if I'm going to go with which one I think looks more real, I would probably choose this one over here. The lighting over here, it just looks too edited. I guess this looks like something that was, you know, color corrected. And this one looks also color corrected, but a
2:34
little bit more natural. Okay, this one in my mind is just a no-brainer. This left one looks so much more realistic. Even though this one is solid, this looks way more realistic. So, GPT Image 2, as you guys can see, is just crushing it. These are very, very similar. I honestly think I would take this one. I don't know why. My gut's just telling me I'd lean this way. Okay, same sort of thing here as the last product one. I think these are both really, really solid. So, hard to choose a winner there. I think once again, these would be very similar to me. Although, the physics over here don't really make
3:04
sense. It's floating and the insole is also floating out. So, I'd pick GPT Image 2. These are both really solid. I think the lighting on both of these looks really good. Um, as far as the actual watch, which one would I rather wear? I'd probably rather wear that one, so I'm going to choose that one. A realistic mobile screenshot of a clean banking app dashboard. I mean, these are basically identical. I think the only thing where I would give Nano Banana 2 the edge here is that it was able to use web search and it pulled the real logos over here, which I do think is a plus. A clean labeled physics diagram of a simple pendulum. Okay, that
3:36
one's hard to tell. I don't know. I'm not picking one there. Um, a realistic SaaS landing page. This one's also kind of tough. I think I might have to go with the OpenAI version. This one just looks a little bit more vibe coded. These are basically... Okay, I'm not even going to choose there. Um, oil painting. This one I think I would choose. I'm going to start to do these a little bit more lightning round. Oh, I'd pick this one. GPT Image 2. I can't choose here. I don't know what I'm looking at really. Um, these look... This one looks better if
4:06
we're talking like a kind of a watercolor type of thing. I don't know. I would choose that one. Yep. I think I would go with GPT Image 2 on this one. Again, a single image showing the exact same person in three different outfits. I mean, this one's closer up, so the details are better. So, I'm not going to really choose a winner there. This one, I think, is interesting. Honestly, for this one, I would choose Nano Banana 2 because on this side, the lighting and all the features just look too perfect. Whereas, the lighting over here seems like it's dynamic for the actual scene. Okay, 30 was a lot. I'm just going to click through these. I don't think you
4:37
guys need to hear me talk anymore. These are all, I mean, they're both very good, but you can kind of see where we are with these two models. So, it's good to see them side by side. And let's see what Claude Opus 4.7 said. And by the way, if you guys are curious how I built this whole deck right here and the one that you're going to see later in this video with all these different use cases, then you can download the repo for this Claude Code project, which actually looks like this. I pretty much automated that entire thing. I set it off on the task. It generated all the images, all the tests, created that entire presentation in two different
5:07
separate localhosts. I'm going to drop this whole project as a GitHub repo in my free Skool community. The link for that is down in the description if you just want to check it out, look at the images yourself, and also figure out sort of like just the way I built it. But anyways, the free Skool community is linked in the description. You'll come here, you'll go to the classroom, go to all YouTube resources, and you'll find everything that I've ever dropped from YouTube in here. So, thanks guys. I will let you get back to the video. Now, it said that GPT Image 2 won. We have different categories. We have artistic styles, character consistency, complex scenes, diagrams, and UI. If you guys really care, you can pause this real
5:37
quick and take a look. You can see that they were just basically judged as a tie, Nano Banana 2, or GPT Image 2. And GPT Image 2 ended up winning more. So that is what Anthropic's model thinks of these two models. And also pricing. So I use Nano Banana and GPT Image through key.ai. It's basically like an OpenRouter for models. Here you can see you can get image models, video models. It's just like one key and you get all of them. So Nano Banana 2 has different kind of pricing for the quality if it's 1K, 2K or 4K. And it
6:08
goes up from 4 cents to 6 cents to 9 cents. So 4, 6, or 9. And then right now GPT Image 2 is just 6 cents flat per image. So do with that information what you want, but they're roughly the same price. And you can also get them for cheaper through key.ai. So that's what I use. All right. So now I've got some use cases for you guys. And I know I said 10, but I ended up just kind of throwing in a few more because there's so many things you can do. And there's so many more. There's probably 50 others I could come up with, but I just wanted to show
6:38
you guys a few things that you might be inspired to try. So, the first one is pitch-ready product packaging. So, I've got three versions basically for all of these. This one is a cereal box. Pay attention to the calories over here, the nutrition facts. Pay attention to the barcode, all of the text. It's all really, really good. I don't see a single mistake. And that was one of the biggest problems with AI image generation. You can see here, same exact thing. All of this, the shadows, the labeling, the text, you know, the depth, and then same thing right here with this pill bottle. So, that was use case number one. Number two is pretty cool.
7:09
So, I was inspired by this user on X. His name was Nick, and he posted a picture. Basically said, hey, I took this image and I gave it to GPT, and I said, "Hey, you know, basically make this into a scan and get rid of the creases." And if you look at it, you can see that it actually matches his handwriting like exactly. and the creases are gone and it even cleaned up some things. So, I think that this is super impressive, especially when you look down here and I think this is physics and you start to see like different formulas and like just messy characters. It was able to match it. It
7:39
even got rid of this little red stroke. You don't see that over here. So, then I tried this myself, of course. I uploaded this piece of paper that I crumpled up a little bit and it matches my handwriting basically perfect. The only mistake I saw here was right here. This was supposed to be an arrow and then a three, not 33. But that's like the only mistake that I was able to find. So very impressive. And then we have this handwritten whiteboard brainstorm. So obviously this looks a little too perfect, but you can see the shadow on the whiteboard is really good. You can see here this one looks a little bit more realistic, I'd say. But all of
8:10
these are very solid. We also have website design. So this one is kind of like a SaaS. It's a full website hero section. And for some reason right now I'm not able to change the aspect ratio. So it's basically always coming back square and I can't really control it. So this would probably look a little bit better if it was landscape, but this one looks really nice. I mean, if you were able to use this as inspiration and feed that into something like Claude Design, it would build it out really nice. And then similarly here, I think that this one is also pretty beautiful. So it should be good at helping you design website concepts. All right, so ad creative split tests. I think this is super cool.
8:41
I mean, it understands the spacing. It is engaging. Maybe you want to get a little bit less wordy, but still. This one looks really good. This one also looks really solid. And this one, I mean, this one has the red, which I think would kind of stop the scroll a little bit, but all of these different variations are pretty cool. Now, we also have UGC selfie ads. So, this one is obviously very solid. It is the um little serum. This one is Cedar and Sage. We've seen this product a few other times. This one, I think the man doesn't look as realistic. His skin
9:11
looks a little bit maybe too smooth. This one looked a lot better, I think. And here, I think this one looks really solid as well. The lighting and everything, all the wording. Now, we also have the localized creative. So, we saw the Cedar and Sage earlier, but now it can be translated. And I think that this ad itself actually looks really solid, too. So, I'd probably say, "Hey, make this in English as well." Um, this one looks really, really good. All of these are very consistent with the brand, as you can see, the colors and stuff. And this one also looks nice. We've got book covers in three different styles. So, this book is called The Founders Silence, and we have
9:42
this one as well as this one. Very different vibes, but this kind of stuff could just help you ideate, right? Like you could just drop in a product and say, "Hey, create me 15 different styles of book covers or of ads or websites," and it's just going to give you a ton of outputs. That's really going to help you with like figuring out which direction you want to go. You can also use this for LinkedIn carousels. So here we have, you know, seven pricing mistakes founders make on SaaS. And then you'd have all your different slides generated with this image model because the text is good and because it could generate diagrams and charts and things
10:13
like that as well. This one, I don't really know how this got in here. Restaurant menu plus food photography. But the menu looks fine. It looks a little bit AI generated, the handwriting, but this food, I mean, if you put this on your website or, you know, in your reviews or even just on your menu, this food looks super, super real. We've also got brand mascot. So, if you've got a mascot and you want to put it across different elements of your website or wherever, um, it should be able to keep that character pretty consistent throughout. Not only logo design, but also the ability to like change the styling of
10:44
your logo. So, we've got the AIS right here, but look, you can make it like a 3D element. You could make it like a plush. You could change it to be sort of like glass. And I did this with other logos, too. So, the Up AI logo, you can see same exact thing. And then for just my logo, Nate, we can see that we have these different styles as well. So, very cool. Also, real estate. So, figuring out what to do with your space. Like this is my space right back here. It's a pretty empty room, right? But here you can see that it keeps my elements that already exist. and it put a new plant. It put a couch.
11:16
It put a different rug. And then it did that with a few other styles, too. So, it's keeping the spatial awareness and it's keeping the elements, but it's just kind of changing up what's in here. We've got some enterprise diagrams. So, it understands logic. It understands flow. It understands how to obviously make text. So, here's just a few other examples of this. Now, I did notice right here we have a little bit of an error. And I'm not sure exactly, but if I click P, we should be able to see the exact prompt and what it might have messed up. Yeah. So, I'm not exactly sure what happened here, but it definitely did mess up some of the text.
11:47
Now, I was also throttling this API and um just generating hundreds of images, so if you push it to the limit, it might mess up some stuff. Speaking of messing up some stuff, these thumbnails are horrific. Now, keep in mind, this was literally me giving it a reference thumbnail and saying, "Make a bunch." And like I said, I was throttling the API and stuff. And it's interesting because earlier I did some examples where I said, "Hey, here's a picture of me. Make it look like I'm, you know, shaking hands with X person." And it was making me look like me. But sometimes when you do a source image
12:17
over and over, it just degrades and it degrades. But the reason I wanted to include it here is because if I actually built a workflow around this that had like good elements and it had good prompting for my style, I 100% think that I could fully automate the thumbnail creation. So, I thought that it was at least worth including in this sort of use case type of video. But these are obviously trash. Like even all three of these, I look different in all of them. And then the last one I put because you know it could definitely be useful um is just realistic app mockups, which we saw earlier in the comparison between GPT Image 2 and Nano Banana 2. So
12:48
all of these look pretty solid. But yeah, I mean if you guys are curious how I built those two sort of tests and those dashboards, those were just spun up on localhost. Um and I pretty much just had Claude Code in here build everything. So, I dropped in what it needed and I had it brainstorm over how we could do this comparison and research what the best use cases would be as far as like comparing them. And then obviously Claude Code itself analyzed and judged all those images. So, it's really cool that I was able to just give it this task and give it this idea I had and then it was able to automate the
13:18
pipeline of creating all those images and inserting them into the dashboards and, you know, creating all that styling and the presentation for me. So very, very cool to just be able to push Claude Code to its limits and just connect all these different tools. But that is going to do it for today. And I know this one was quick, but just wanted to show you guys the new model, compare it to Nano Banana 2. But anyways, I'm going to be using GPT Image 2 a lot more cuz right now pretty much all my thumbnails are created or enhanced with Nano Banana 2. So this thumbnail that you're looking at for this video, I had built with GPT Image 2. So hopefully you guys think it
13:48
looked good. Anyways, that's going to do it for this one. Hope you guys enjoyed the video. If you did, please give it a like. It helps me out a ton. And as always, I appreciate you making it to the end of the video and I'll see you in the next one.
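
Editor's sketch: the side-by-side test described in the video (same prompt to both models, a judge picking a winner or a tie, then a tally) plus the quoted per-image pricing can be outlined in a few lines of Python. This is only a rough illustration, not the project from the video. The function names, the verdict labels, and the stand-in generators and judge are all hypothetical; real API calls would replace them. The prices are the ones quoted in the video and may have changed.

```python
from collections import Counter
from typing import Callable, Iterable

# Per-image prices in US cents, as quoted in the video (a snapshot, not a
# guarantee of current pricing).
NANO_BANANA_2_CENTS = {"1K": 4, "2K": 6, "4K": 9}
GPT_IMAGE_2_CENTS = 6  # flat rate per image

# Hypothetical verdict labels a judge is expected to return.
VALID_VERDICTS = {"gpt_image_2", "nano_banana_2", "tie"}


def run_matchups(
    prompts: Iterable[str],
    generate_gpt: Callable[[str], str],
    generate_nano: Callable[[str], str],
    judge: Callable[[str, str, str], str],
) -> Counter:
    """Send each prompt to both generators and tally the judge's verdicts."""
    tally: Counter = Counter()
    for prompt in prompts:
        left = generate_gpt(prompt)    # image shown on the left
        right = generate_nano(prompt)  # image shown on the right
        verdict = judge(prompt, left, right)
        if verdict not in VALID_VERDICTS:
            raise ValueError(f"unexpected verdict: {verdict!r}")
        tally[verdict] += 1
    return tally


def batch_cost_cents(n_images: int, resolution: str = "1K") -> dict[str, int]:
    """Cost in cents of generating n_images with each model at the quoted prices."""
    return {
        "nano_banana_2": n_images * NANO_BANANA_2_CENTS[resolution],
        "gpt_image_2": n_images * GPT_IMAGE_2_CENTS,
    }


if __name__ == "__main__":
    # Stand-ins only: these return fake file paths and a fixed verdict so the
    # sketch runs without any API key.
    demo_prompts = ["vintage 1960s movie poster", "simple pendulum diagram"]
    fake_gpt = lambda p: f"gpt_image_2/{p}.png"
    fake_nano = lambda p: f"nano_banana_2/{p}.png"
    fake_judge = lambda p, left, right: "tie"

    print(run_matchups(demo_prompts, fake_gpt, fake_nano, fake_judge))
    print(batch_cost_cents(30, "2K"))
```

Keeping the generators and the judge as plain callables means the same loop works whichever provider or judging model is plugged in, and the cost helper makes the "roughly the same price" remark concrete for a given batch size and resolution.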