The much-anticipated day has finally arrived — ChatGPT is evolving into a more personable AI experience, capable of sharing a laugh when you say something funny or responding with an “aww” when you show kindness. These features are just the beginning of today’s announcements from OpenAI. At its Spring Update event, the company introduced its latest large language model (LLM), GPT-4o. Alongside the new model, OpenAI unveiled a desktop app for ChatGPT; GPT-4o itself is faster and fully multimodal, handling text, vision, and audio natively.
The event commenced with an introduction from Mira Murati, OpenAI’s CTO, who emphasized that today’s enhancements would benefit all users. "What makes GPT-4o special is that it provides GPT-4 level intelligence to every user, including those on the free tier," Murati explained.
GPT-4o promises increased speed and significant advancements across text, visual, and audio capabilities. Developers can also access the model through the API, where it is reported to be up to twice as fast as GPT-4 Turbo, 50% cheaper, and subject to rate limits five times higher.
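For developers curious what switching to the new model looks like, here is a minimal sketch. It assumes the official `openai` Python SDK and an `OPENAI_API_KEY` environment variable; to keep the example self-contained, it only builds the request payload and shows the SDK call in comments rather than hitting the network.

```python
# Sketch of a Chat Completions request targeting the new model.
# The only change from an existing GPT-4 Turbo integration is the
# "model" field — everything else in the request stays the same.
payload = {
    "model": "gpt-4o",  # the model name announced at the Spring Update
    "messages": [
        {"role": "user", "content": "Say hi in a friendly tone."},
    ],
}

# With the SDK installed and OPENAI_API_KEY set, the call would be roughly:
#
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.chat.completions.create(**payload)
#   print(response.choices[0].message.content)

print(payload["model"])
```

Because GPT-4o is a drop-in model name, upgrading an existing integration is typically a one-line change.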
In addition to the new model, OpenAI is rolling out the ChatGPT desktop app and refreshing the website’s user interface. The aim is to simplify interactions with the chatbot. "We envision a future where our communication with machines becomes more intuitive, and GPT-4o is a pivotal step toward enhancing that collaboration," Murati stated.
During the event, Murati, along with OpenAI’s Mark Chen and Barret Zoph, showcased how the new features make interactions smoother. GPT-4o can analyze video, images, and audio in real time while accurately interpreting emotions — a capability most striking in ChatGPT Voice, which has become lifelike enough to nearly climb out of the uncanny valley.
A simple “hi” to ChatGPT elicits a lively, friendly response with only a hint of robotic tone. When Mark Chen mentioned he was running a live demo and needed to calm down, the AI not only acknowledged his request but also advised him to take deep breaths. It accurately detected when he was breathing too quickly, playfully quipping, “You’re not a vacuum cleaner.”
Conversations with ChatGPT feel more natural; users can now interrupt without waiting for the AI to finish its response, which comes quickly without awkward delays. When asked for a bedtime story, it adeptly switched tones from enthusiastic to dramatic to robotic as requested. The latter part of the demonstration highlighted ChatGPT’s capabilities in reading code, solving math problems via video, and describing on-screen content.
Although the demo wasn’t flawless — with the bot occasionally cutting off, leaving uncertainty about whether it was due to external chatter or latency — it achieved a level of realism previously unattainable in chatbot interactions. Its ability to read human emotion and react accordingly is both exhilarating and a little unsettling. Hearing ChatGPT laugh was certainly an unexpected moment!
The rollout of GPT-4o, featuring its multimodal functions, along with the new desktop application, will commence over the next few weeks. Not long ago, Bing Chat expressed a desire to be more human-like, but now, we are on the verge of experiencing a version of ChatGPT that might be the closest we’ve seen to human interaction since the rise of AI.