OpenAI's new GPT-4o image generation is an absolute game changer

GPT-4o’s new image generation is destroying industries in real-time.

It hasn’t even been a week and it’s already been absolutely insane. Even Sam Altman can’t understand what’s going on right now.

Things are definitely not looking too good for apps like Photoshop.

Look how amazing the layering is. Notice the before & after — it didn’t just copy and paste the girl image onto the room image, like Photoshop would do.

It’s no longer sending prompts to DALL-E behind the scenes. It understands the images at a deep level.

Notice how the 3D angle and lighting in the after image are slightly different, yet it knows it’s the same room. The same goes for the girl image.

These are not just a bunch of pixels or a simple internal text representation to GPT-4o. It “understands” what it’s seeing.

So of course refining images is going to be so much more accurate and precise now.

The prompt adherence and creativity are insane.

What are the odds that something even remotely close to this was in the training data?

It’s not just spitting out something it’s seen before (not that it ever really was, as some claimed). Its understanding of your prompt has improved drastically.

And yes, it can now draw a full glass of wine.

Another huge huge upgrade is how insanely good it is at understanding & generating text now.

This edit right here is incredible on so many levels…

1. Understanding the images well enough to recreate them so accurately in a completely different image style with facial expressions.

2. It understands the entire context of the comic conversation well enough to create matching body language.
Notice how the 4th girl now has her left hand pointing — which matches the fact that she’s ordering something from the bar boy.
A gesture that arguably matches the situation even better than in the previous image.
And I bet it would be able to replicate her original hand placement if the prompt explicitly asked it to.

3. And then the text generation — this is something AI image generators have been struggling with since forever — and now see how easily GPT-4o recreated the text in the bubbles.

And not only that: notice how the last girl’s speech bubble now has an exclamation point, to perfectly match her facial expression and this particular situation.

And yes it can integrate text directly into images too — perfect for posters & social media graphics.

If this isn’t a total disruptor in the realm of graphic design and photoshopping and everything to do with image creation, then you better tell me what is.

It’s really exciting, and that’s why we’ve been seeing so many images of this type flooding social media: images in the style of Studio Ghibli.

It’s also part of why they’ve had to limit it to paid ChatGPT users for now, with support for the free tier coming soon.

They’ve got to scale the technology and make sure everyone has a smooth experience.

All in all, GPT-4o image gen is a major step forward that looks set to deal a major blow to traditional image editing & graphic design tools like Photoshop & Illustrator.

Greater things are coming.
