OpenAI Launches Hyperrealistic Voice Feature for Select Paying ChatGPT Users

OpenAI has begun rolling out ChatGPT's Advanced Voice Mode, giving users their first glimpse of GPT-4o's hyperrealistic audio capabilities. The alpha version is currently available to a small group of ChatGPT Plus users, and OpenAI expects to make it gradually accessible to all Plus subscribers by fall 2024.

When OpenAI first unveiled GPT-4o's voice in May, audiences were captivated by its rapid responses and striking resemblance to a human voice, in particular that of actress Scarlett Johansson, to whom the demo voice named Sky was widely compared. Johansson said she had turned down multiple requests from CEO Sam Altman to lend her voice, and after the demonstration she engaged legal counsel to protect her likeness. Although OpenAI denied using her voice, it subsequently removed Sky from the demo. In June, the company announced it would delay the release of Advanced Voice Mode to strengthen its safety measures.

Fast forward a month, and the wait for some of those features is over, at least partially. OpenAI clarified that the video and screen-sharing functions showcased in its Spring Update are not included in this alpha and will launch at a later time. So while the full GPT-4o demo remains just that, a demo, certain premium users can now experience ChatGPT's voice feature firsthand.

ChatGPT Can Now Talk and Listen

If you have previously tried Voice Mode in ChatGPT, note that Advanced Voice Mode operates differently. The earlier implementation chained three distinct models: one to convert the user's speech to text, GPT-4 to process the prompt, and a third to turn ChatGPT's text response back into audio. GPT-4o, in contrast, is a single multimodal model that handles all of these tasks itself, which yields notably lower latency in conversation. OpenAI also claims that GPT-4o can detect emotional intonation in a user's voice, such as sadness, excitement, or singing.
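To make the architectural difference concrete, here is a minimal sketch of the kind of three-stage pipeline the earlier Voice Mode relies on, built from OpenAI's public transcription, chat, and text-to-speech endpoints. ChatGPT's internal plumbing is not public, so treat this as an illustration under assumptions, not the actual implementation: the model choices (`whisper-1`, `gpt-4`, `tts-1`), voice name, and file names are stand-ins.

```python
# Sketch of a three-model voice pipeline, analogous to the earlier Voice Mode.
# Assumes OpenAI's public Python SDK (openai>=1.0) and an OPENAI_API_KEY in
# the environment; these public endpoints stand in for ChatGPT's internal,
# non-public components.
from openai import OpenAI

client = OpenAI()

def voice_round_trip(audio_path: str) -> bytes:
    # Stage 1: speech-to-text (one network round trip).
    with open(audio_path, "rb") as f:
        transcript = client.audio.transcriptions.create(
            model="whisper-1", file=f
        )

    # Stage 2: a text-only model processes the prompt (a second round trip).
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": transcript.text}],
    )
    reply_text = reply.choices[0].message.content

    # Stage 3: text-to-speech on the response (a third round trip).
    speech = client.audio.speech.create(
        model="tts-1", voice="alloy", input=reply_text
    )
    return speech.content  # MP3 bytes of the spoken reply

if __name__ == "__main__":
    audio = voice_round_trip("question.wav")  # hypothetical input file
    with open("reply.mp3", "wb") as out:
        out.write(audio)
```

Each stage is a separate model call, and everything between the user's voice and the reply passes through plain text, which is where both the added latency and the loss of vocal nuance come from. A single multimodal model like GPT-4o removes the intermediate text hops, which is why it can respond faster and pick up on tone rather than just words.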

OpenAI is releasing ChatGPT's new voice gradually so it can monitor usage closely. Members of the alpha group will receive an alert in the ChatGPT app, followed by an email explaining how to use the feature.

Since the May demo, OpenAI has tested GPT-4o's voice capabilities with more than 100 external testers who collectively speak 45 different languages. A report summarizing these safety efforts is expected in early August.

Advanced Voice Mode will be limited to four preset voices (Juniper, Breeze, Cove, and Ember) developed in collaboration with professional voice actors. The Sky voice from the May demo is no longer available. OpenAI spokesperson Lindsay McCallum confirmed: "ChatGPT cannot impersonate the voices of individuals, including public figures, and will block any output that does not conform to these preset voices."

OpenAI is staying vigilant in an effort to avoid the deepfake controversies that have dogged the industry. Earlier this year, AI startup ElevenLabs faced backlash when its voice-cloning technology was used to impersonate President Biden, misleading primary voters in New Hampshire.

OpenAI has also implemented new filters to block requests to generate music or other copyrighted audio. Over the past year, AI companies have faced a wave of copyright-infringement claims, and audio models like GPT-4o open a new front for complaints, particularly from record labels, which have a long history of litigation and have already sued the AI song generators Suno and Udio.
