OpenAI's DevDay Unveils Realtime API and Exciting Innovations for AI App Developers

It’s been a hectic week for OpenAI, marked by executive departures and significant fundraising updates. However, the company is moving forward, seeking to engage developers in creating applications using its AI models at the 2024 DevDay. On Tuesday, OpenAI unveiled several new tools, including a public beta of its “Realtime API,” designed for building applications that deliver low-latency, AI-generated voice responses. While it's not exactly ChatGPT’s Advanced Voice Mode, it’s certainly a step in that direction.

In a press briefing before the event, OpenAI's Chief Product Officer, Kevin Weil, reassured attendees that the recent exits of Chief Technology Officer Mira Murati and Chief Research Officer Bob McGrew would not impede the company’s momentum.

“I want to emphasize that Bob and Mira have been incredible leaders. I’ve learned a lot from them, and their contributions have played a crucial role in our success,” said Weil. “We are committed to progressing without pause.”

As OpenAI experiences yet another executive shakeup—echoing the turmoil following last year’s DevDay—the company aims to demonstrate to developers that it remains the premier platform for AI application development. According to company leaders, over 3 million developers are currently leveraging OpenAI’s AI models, although the startup faces increasing competition in the space.

OpenAI revealed it has slashed costs for developers accessing its API by 99% over the past two years, a move likely driven by rivals like Meta and Google consistently lowering their prices.

One of the notable new features, the Realtime API, enables developers to build nearly real-time, speech-to-speech experiences in their applications using any of six voices provided by OpenAI. These voices are distinct from those offered for ChatGPT, and developers cannot use third-party voices, a restriction intended to head off copyright issues. Notably, a voice resembling Scarlett Johansson's is not among them.

During the briefing, Romain Huet, OpenAI's Head of Developer Experience, demonstrated a trip-planning application built with the Realtime API. Users could communicate verbally with an AI assistant regarding an upcoming trip to London, receiving low-latency responses. The Realtime API also integrates various tools, allowing the app to mark restaurant locations on a map while responding to queries.
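Under the hood, the Realtime API is driven by JSON events exchanged over a WebSocket connection. Below is a minimal sketch of the kind of session-configuration event that attaches a tool to a voice session, based on the beta documentation available at announcement time; the API is in beta, so exact field names may change, and `pin_restaurant` is a hypothetical tool invented here to mirror the trip-planning demo:

```python
import json

def build_session_update(voice: str, tools: list) -> str:
    """Serialize a Realtime API `session.update` event as a JSON string.

    The event shape follows OpenAI's beta docs; field names may change
    while the Realtime API remains in beta.
    """
    event = {
        "type": "session.update",
        "session": {
            "voice": voice,                   # one of the six built-in voices
            "modalities": ["audio", "text"],  # speech in, speech and text out
            "tools": tools,                   # function tools the model may call
        },
    }
    return json.dumps(event)

# Hypothetical tool for the trip-planning demo: pin a restaurant on a map.
pin_tool = {
    "type": "function",
    "name": "pin_restaurant",
    "description": "Mark a restaurant's location on the trip map.",
    "parameters": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "lat": {"type": "number"},
            "lng": {"type": "number"},
        },
        "required": ["name", "lat", "lng"],
    },
}

event_json = build_session_update("alloy", [pin_tool])
```

When the model decides to call `pin_restaurant` mid-conversation, the app receives a function-call event, executes it (placing the map marker), and streams the result back, which is how the demo app could annotate a map while still talking.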

At one point, Huet showcased how the Realtime API could handle phone conversations to inquire about food orders for an event. Unlike Google's Duplex, OpenAI's API can't directly call restaurants or shops; however, it can connect with calling APIs like Twilio to do so. Notably, OpenAI has decided not to build in automatic disclosures so that its AI models identify themselves on calls, placing the onus on developers to add such disclosures, which may soon be required by California law.

In addition to its DevDay announcements, OpenAI introduced vision fine-tuning in its API, allowing developers to use both images and text to fine-tune GPT-4o. This enhancement aims to boost GPT-4o's performance on tasks requiring visual comprehension. Olivier Godement, OpenAI's Head of Product for the API, said developers will be prohibited from uploading copyrighted images, violent imagery, or any content violating OpenAI's safety guidelines.
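Vision fine-tuning uses the same JSONL chat-format training files as text fine-tuning, with images embedded in the user message content. A minimal sketch of a single training example under that documented schema follows; the image URL and prompt text are placeholders, not real training data:

```python
import json

def vision_example(system: str, image_url: str, question: str, answer: str) -> str:
    """Serialize one vision fine-tuning example as a JSONL line.

    Follows the chat-format fine-tuning schema, with an `image_url`
    content part alongside text in the user message.
    """
    record = {
        "messages": [
            {"role": "system", "content": system},
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            },
            {"role": "assistant", "content": answer},
        ]
    }
    return json.dumps(record)

# Placeholder example; a real dataset would have many such lines.
line = vision_example(
    "You identify street signs.",
    "https://example.com/sign.jpg",
    "What does this sign say?",
    "It is a stop sign.",
)
```

A training file is just one such JSON object per line, uploaded through the regular fine-tuning flow.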

OpenAI is striving to keep pace with its competitors in AI model licensing. Its new prompt caching feature parallels a capability Anthropic launched recently, enabling developers to reuse frequently repeated context between API calls, thus reducing costs and latency. OpenAI claims developers can save 50% on cached input tokens with this feature, while Anthropic offers a 90% discount on cache reads.
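The savings are easy to estimate: only the repeated prefix of a prompt hits the cache, and those tokens are billed at the discounted rate. A back-of-the-envelope sketch, with an illustrative per-token rate rather than OpenAI's actual pricing:

```python
def effective_input_cost(total_tokens: int, cached_tokens: int,
                         rate_per_token: float,
                         cached_discount: float = 0.5) -> float:
    """Input cost when `cached_tokens` of the prompt are served from cache.

    Cached tokens are billed at (1 - cached_discount) * rate. OpenAI's
    announced discount is 50%; Anthropic discounts cache reads by ~90%.
    """
    uncached = total_tokens - cached_tokens
    return (uncached * rate_per_token
            + cached_tokens * rate_per_token * (1 - cached_discount))

# A 10,000-token prompt whose first 8,000 tokens repeat across calls:
rate = 5e-6  # illustrative $/token, not OpenAI's actual price
full_price = 10_000 * rate                              # $0.05 without caching
with_cache = effective_input_cost(10_000, 8_000, rate)  # $0.03 with caching
```

With 80% of the prompt cached, the input bill drops by 40% here; the larger the shared prefix (long system prompts, tool definitions, reference documents), the closer savings get to the headline 50%.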

Lastly, OpenAI is rolling out a model distillation feature, allowing developers to use larger AI models, such as o1-preview and GPT-4o, to improve smaller models like GPT-4o mini. Distillation typically cuts costs while boosting the smaller model's performance. Alongside it, OpenAI is launching a beta evaluation tool so developers can assess their fine-tuned models within the OpenAI API.
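In practice, distillation means capturing the larger model's outputs and fine-tuning the smaller model on them. A minimal sketch of the data-preparation step, turning (prompt, teacher answer) pairs into standard chat fine-tuning JSONL lines; the pairs here are stand-ins for outputs you would collect from, say, o1-preview:

```python
import json

def distillation_dataset(pairs, system="You are a helpful assistant."):
    """Convert (prompt, teacher_answer) pairs into fine-tuning JSONL lines.

    The student model (e.g. GPT-4o mini) is then fine-tuned on the
    teacher's answers through the regular fine-tuning API.
    """
    lines = []
    for prompt, answer in pairs:
        record = {
            "messages": [
                {"role": "system", "content": system},
                {"role": "user", "content": prompt},
                {"role": "assistant", "content": answer},  # teacher's output
            ]
        }
        lines.append(json.dumps(record))
    return lines

# Stand-in teacher outputs; real pairs would be captured from the larger model.
teacher_pairs = [
    ("Summarize: caching cuts repeated-prompt costs.",
     "Prompt caching bills repeated context at a discount, lowering cost."),
]
jsonl_lines = distillation_dataset(teacher_pairs)
```

OpenAI's platform handles the capture and fine-tuning steps; the new evaluation tool then lets you check whether the distilled student actually matches the teacher on your task.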

DevDay may generate more buzz for what it didn’t reveal. There was no update regarding the GPT Store, initially mentioned during last year’s DevDay. Reports indicate that OpenAI has been testing a revenue-sharing program with some popular GPT creators, but further details remain scarce.

Moreover, OpenAI clarified that no new AI models would be launched during this year’s DevDay. Developers eagerly anticipating the release of OpenAI o1 (beyond the preview or mini versions) or the startup’s video generation model, Sora, will need to exercise patience.
