OpenAI's DevDay Unveils Realtime API and Exciting Innovations for AI App Developers

It’s been a hectic week for OpenAI, marked by executive departures and significant fundraising updates. However, the company is moving forward, seeking to engage developers in creating applications using its AI models at the 2024 DevDay. On Tuesday, OpenAI unveiled several new tools, including a public beta of its “Realtime API,” designed for building applications that deliver low-latency, AI-generated voice responses. While it's not exactly ChatGPT’s Advanced Voice Mode, it’s certainly a step in that direction.

In a press briefing before the event, OpenAI's Chief Product Officer, Kevin Weil, reassured attendees that the recent exits of Chief Technology Officer Mira Murati and Chief Research Officer Bob McGrew would not impede the company’s momentum.

“I want to emphasize that Bob and Mira have been incredible leaders. I’ve learned a lot from them, and their contributions have played a crucial role in our success,” said Weil. “We are committed to progressing without pause.”

As OpenAI experiences yet another executive shakeup—echoing the turmoil following last year’s DevDay—the company aims to demonstrate to developers that it remains the premier platform for AI application development. According to company leaders, over 3 million developers are currently leveraging OpenAI’s AI models, although the startup faces increasing competition in the space.

OpenAI revealed it has slashed costs for developers accessing its API by 99% over the past two years, a move likely driven by rivals like Meta and Google consistently lowering their prices.

One of the notable new features, the Realtime API, enables developers to build nearly real-time, speech-to-speech experiences in their applications using any of six voices provided by OpenAI. These voices are distinct from those offered for ChatGPT, and developers cannot use third-party voices, a restriction intended to head off copyright issues. Notably, a voice resembling Scarlett Johansson's is not among them.

During the briefing, Romain Huet, OpenAI's Head of Developer Experience, demonstrated a trip-planning application built with the Realtime API. Users could communicate verbally with an AI assistant regarding an upcoming trip to London, receiving low-latency responses. The Realtime API also integrates various tools, allowing the app to mark restaurant locations on a map while responding to queries.
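Under the hood, the Realtime API is driven by JSON events exchanged over a WebSocket connection. Below is a minimal sketch of the kind of session-configuration event that attaches a tool to a voice session, based on the beta documentation available at announcement time; the API is in beta, so exact field names may change, and `pin_restaurant` is a hypothetical tool invented here to mirror the trip-planning demo:

```python
import json

def build_session_update(voice: str, tools: list) -> str:
    """Serialize a Realtime API `session.update` event as a JSON string.

    The event shape follows OpenAI's beta docs; field names may change
    while the Realtime API remains in beta.
    """
    event = {
        "type": "session.update",
        "session": {
            "voice": voice,                   # one of the six built-in voices
            "modalities": ["audio", "text"],  # speech in, speech and text out
            "tools": tools,                   # function tools the model may call
        },
    }
    return json.dumps(event)

# Hypothetical tool for the trip-planning demo: pin a restaurant on a map.
pin_tool = {
    "type": "function",
    "name": "pin_restaurant",
    "description": "Mark a restaurant's location on the trip map.",
    "parameters": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "lat": {"type": "number"},
            "lng": {"type": "number"},
        },
        "required": ["name", "lat", "lng"],
    },
}

event_json = build_session_update("alloy", [pin_tool])
```

When the model decides to call `pin_restaurant` mid-conversation, the app receives a function-call event, executes it (placing the map marker), and streams the result back, which is how the demo app could annotate a map while still talking.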

At one point, Huet showcased how the Realtime API could handle phone conversations to inquire about food orders for an event. Unlike Google's Duplex, OpenAI's API can't directly call restaurants or shops; however, it can connect with calling APIs like Twilio to do so. Notably, OpenAI has decided not to build in automatic disclosures so that its AI models identify themselves on calls, placing the onus on developers to add such disclosures, which may soon be required by California law.

In addition to its DevDay announcements, OpenAI introduced vision fine-tuning in its API, allowing developers to use both images and text to fine-tune GPT-4o. This enhancement aims to boost GPT-4o's performance on tasks requiring visual comprehension. Olivier Godement, OpenAI's Head of Product for the API, said developers will be prohibited from uploading copyrighted images, violent imagery, or any content violating OpenAI's safety guidelines.
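Vision fine-tuning uses the same JSONL chat-format training files as text fine-tuning, with images embedded in the user message content. A minimal sketch of a single training example under that documented schema follows; the image URL and prompt text are placeholders, not real training data:

```python
import json

def vision_example(system: str, image_url: str, question: str, answer: str) -> str:
    """Serialize one vision fine-tuning example as a JSONL line.

    Follows the chat-format fine-tuning schema, with an `image_url`
    content part alongside text in the user message.
    """
    record = {
        "messages": [
            {"role": "system", "content": system},
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
            {"role": "assistant", "content": answer},
        ]
    }
    return json.dumps(record)

# Placeholder example; a real dataset would have many such lines.
line = vision_example(
    "You identify street signs.",
    "https://example.com/sign.jpg",
    "What does this sign say?",
    "It is a stop sign.",
)
```

A training file is just one such JSON object per line, uploaded through the regular fine-tuning flow.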

OpenAI is striving to keep pace with its competitors in AI model licensing. Its new prompt caching feature parallels a capability Anthropic launched recently, enabling developers to reuse frequently repeated context between API calls, thus reducing costs and latency. OpenAI claims developers can save 50% on cached input tokens with this feature, while Anthropic offers a 90% discount on cache reads.
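The savings are easy to estimate: only the repeated prefix of a prompt hits the cache, and those tokens are billed at the discounted rate. A back-of-the-envelope sketch, with an illustrative per-token rate rather than OpenAI's actual pricing:

```python
def effective_input_cost(total_tokens: int, cached_tokens: int,
                         rate_per_token: float,
                         cached_discount: float = 0.5) -> float:
    """Input cost when `cached_tokens` of the prompt are served from cache.

    Cached tokens are billed at (1 - cached_discount) * rate. OpenAI's
    announced discount is 50%; Anthropic discounts cache reads by ~90%.
    """
    uncached = total_tokens - cached_tokens
    return (uncached * rate_per_token
            + cached_tokens * rate_per_token * (1 - cached_discount))

# A 10,000-token prompt whose first 8,000 tokens repeat across calls:
rate = 5e-6  # illustrative $/token, not OpenAI's actual price
full_price = 10_000 * rate                              # $0.05 without caching
with_cache = effective_input_cost(10_000, 8_000, rate)  # $0.03 with caching
```

With 80% of the prompt cached, the input bill drops by 40% here; the larger the shared prefix (long system prompts, tool definitions, reference documents), the closer savings get to the headline 50%.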

Lastly, OpenAI is rolling out a model distillation feature, allowing developers to use larger AI models, such as o1-preview and GPT-4o, to improve smaller models like GPT-4o mini. Distillation typically cuts costs while boosting the smaller model's performance. Alongside it, OpenAI is launching a beta evaluation tool so developers can assess their fine-tuned models within the OpenAI API.
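In practice, distillation means capturing the larger model's outputs and fine-tuning the smaller model on them. A minimal sketch of the data-preparation step, turning (prompt, teacher answer) pairs into standard chat fine-tuning JSONL lines; the pairs here are stand-ins for outputs you would collect from, say, o1-preview:

```python
import json

def distillation_dataset(pairs, system="You are a helpful assistant."):
    """Convert (prompt, teacher_answer) pairs into fine-tuning JSONL lines.

    The student model (e.g. GPT-4o mini) is then fine-tuned on the
    teacher's answers through the regular fine-tuning API.
    """
    lines = []
    for prompt, answer in pairs:
        record = {
            "messages": [
                {"role": "system", "content": system},
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": answer},  # teacher's output
            ]
        }
        lines.append(json.dumps(record))
    return lines

# Stand-in teacher outputs; real pairs would be captured from the larger model.
teacher_pairs = [
    ("Summarize: caching cuts repeated-prompt costs.",
     "Prompt caching bills repeated context at a discount, lowering cost."),
]
jsonl_lines = distillation_dataset(teacher_pairs)
```

OpenAI's platform handles the capture and fine-tuning steps; the new evaluation tool then lets you check whether the distilled student actually matches the teacher on your task.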

DevDay may generate more buzz for what it didn’t reveal. There was no update regarding the GPT Store, initially mentioned during last year’s DevDay. Reports indicate that OpenAI has been testing a revenue-sharing program with some popular GPT creators, but further details remain scarce.

Moreover, OpenAI clarified that no new AI models would be launched during this year’s DevDay. Developers eagerly anticipating the release of OpenAI o1 (beyond the preview or mini versions) or the startup’s video generation model, Sora, will need to exercise patience.
