ChatGPT’s Advanced Voice Mode launched on Tuesday for select OpenAI subscribers as part of its highly anticipated alpha release. Originally announced in May, the feature moves beyond traditional text-based dialogue to engage users through natural, lifelike spoken language, with support for a range of regional accents and languages. According to OpenAI, Advanced Voice aims to provide “more natural, real-time conversations, allows interruptions at any time, and senses and responds to your emotions.”
However, there are some limitations when using Voice Mode. The system operates with four preset voices and cannot mimic the voices of individual users or public figures. Outputs that stray from these presets are automatically blocked. Additionally, Advanced Voice is not designed to create copyrighted audio or produce music. Interestingly, users have already experimented with it by asking the AI to beatbox.
Alpha tester Ethan Sutin shared a thread on X (formerly Twitter) showcasing various responses from Advanced Voice, including a short “birthday rap” and a beatboxing demonstration in which the AI’s digital breathing can be heard between beats. While it cannot create full songs, the AI impressively adds sound effects to bedtime stories: when prompted to create an immersive atmosphere, it generated fitting crashes and slams during a tale about a rogue cyborg.
Advanced Voice can also spontaneously generate realistic characters, adding to its lifelike quality. Users can ask the AI to speak in various tones and languages, lending depth to their interactions.
The AI’s vocal capabilities extend beyond just human languages. For instance, when instructed, Advanced Voice can accurately mimic cat sounds. Users can engage the AI with questions about their furry companions, receiving tailored tips and advice in real time.
Moreover, Advanced Voice can utilize your device’s camera to support translation efforts. In one instance, a user pointed their phone at a Japanese-language Game Boy Advance Pokémon game, allowing the AI to read onscreen dialogue as they played. Although video and screen-sharing features are not part of the alpha release, OpenAI plans to introduce them soon. The company intends to expand the alpha release to more Plus subscribers in the coming weeks, with a full rollout scheduled for this fall.