OpenAI President Unveils First Image Created by GPT-4o

OpenAI’s president, Greg Brockman, recently shared on his X account what appears to be the first public image generated by the company’s new GPT-4o model.

The image features a person in a black T-shirt emblazoned with the OpenAI logo, writing on a blackboard. The text reads, “Transfer between Modalities. Suppose we directly model P(text, pixels, sound) with one big autoregressive transformer. What are the pros and cons?”

The GPT-4o model, launched on Monday, builds on the previous GPT-4 family (GPT-4, GPT-4 Vision, and GPT-4 Turbo), offering faster processing, lower costs, and improved retention of information from diverse inputs such as audio and images.

OpenAI trained GPT-4o on multimedia tokens, so audio and visual data no longer need to be converted into text first. The model can analyze and interpret these formats directly, which makes it more seamless and efficient than the earlier GPT-4 setup, which relied on multiple interconnected models.

Compared with images generated by OpenAI's DALL-E 3, released in September 2023, the new image shows significant improvements in quality, photorealism, and text accuracy.

Currently, the native image generation capabilities of GPT-4o are not publicly accessible. As Brockman noted in his post, “The team is working hard to bring those to the world.”
