OpenAI Unveils o1 Reasoning Model: Surpassing GPT-4o with Human PhD-Level Capabilities in Physics and Biology


Updated on September 13 2024

OpenAI has introduced o1, the model previously code-named "Strawberry." It is the first in a new series of artificial intelligence models designed to spend more time thinking before they answer.

Compared with previous models, o1 is better at complex reasoning and at solving harder problems in science, coding, and mathematics. Through training, the o1 series learns to refine its thinking process, try different strategies, and recognize and correct its own mistakes.

OpenAI's evaluations indicate that o1 performs at a PhD level on benchmark tests in physics, chemistry, and biology. It also stands out in mathematics and coding: on a qualifying exam for the International Mathematical Olympiad (IMO), GPT-4o answered only 13% of the questions correctly, whereas o1 answered 83% correctly. In Codeforces competitions, o1 ranked in the 89th percentile.

o1 does not yet offer some of ChatGPT's practical features, such as web browsing and file uploads, but OpenAI highlights that it is particularly adept at solving intricate scientific and mathematical problems. Medical researchers can use o1 to annotate cell sequencing data, physicists can use it to generate the complex mathematical formulas needed for quantum optics, and developers can use it to build and run multi-step workflows.

OpenAI also launched o1-mini, a faster and cheaper reasoning model that is particularly effective at coding. o1-mini is priced 80% lower than o1-preview, making it a budget-friendly option for applications that require reasoning but not extensive world knowledge.

On safety, OpenAI has introduced a new training approach that uses the model's reasoning capabilities to make it adhere more closely to safety and alignment guidelines. On one of OpenAI's hardest jailbreak tests, o1-preview scored 84 on a scale of 0 to 100, compared with 22 for GPT-4o.

o1-preview and o1-mini are currently available in ChatGPT (Plus and Team) and through the API, and OpenAI plans to make o1-mini available to all free ChatGPT users in the future.
