OpenAI recently introduced o1, a model developed under the code name "Strawberry." This new series of artificial intelligence models is designed to spend more time thinking before answering questions.
Compared with previous models, o1 excels at complex reasoning and can tackle more challenging problems in science, coding, and mathematics. Through enhanced training, the o1 series learns to refine its thought process, try different strategies, and recognize and correct its own mistakes.
OpenAI's evaluations indicate that o1 performs comparably to PhD students on challenging benchmark tasks in physics, chemistry, and biology. It also stands out in mathematics and coding: on a qualifying exam for the International Mathematical Olympiad (IMO), the earlier GPT-4o model solved only 13% of the problems correctly, whereas o1 solved 83%. In addition, o1 reached the 89th percentile in Codeforces competitions, a strong result for competitive programming.
Although o1 does not yet have some of ChatGPT's practical features, such as web browsing and file uploads, OpenAI highlights that it is particularly good at solving intricate scientific and mathematical problems. Medical researchers can use o1 to analyze cell sequencing data, physicists can use it to generate the complex mathematical formulas needed for quantum optics, and developers can use it to build multi-step workflows.
OpenAI also launched o1-mini, a faster and more cost-effective reasoning model that is well suited to coding applications. o1-mini is priced 80% lower than o1-preview, making it a budget-friendly option for scenarios that require reasoning but not extensive world knowledge.
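To make the 80% figure concrete, the short Python sketch below works through the arithmetic. The base price is a made-up number chosen only for illustration and is not an actual OpenAI price; the function and constant names are likewise hypothetical.

```python
def discounted_price(base_price: float, discount_fraction: float) -> float:
    """Return the price after a fractional discount (0.80 means 80% lower)."""
    if not 0.0 <= discount_fraction <= 1.0:
        raise ValueError("discount_fraction must be between 0 and 1")
    return base_price * (1.0 - discount_fraction)


# Hypothetical price of the larger model, in arbitrary currency units
# per million tokens. This is NOT a real OpenAI price.
HYPOTHETICAL_BASE_PRICE = 10.00

# The article states that o1-mini is priced 80% lower.
mini_price = discounted_price(HYPOTHETICAL_BASE_PRICE, 0.80)

# How many times more work the same budget covers at the lower price.
budget_multiplier = HYPOTHETICAL_BASE_PRICE / mini_price

print(f"o1-mini price under this assumption: {mini_price:.2f}")
print(f"Same budget covers about {budget_multiplier:.0f}x the usage")
```

Whatever the real base price, an 80% reduction means the same budget covers roughly five times as much usage.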
On safety, OpenAI has introduced a new training approach that uses the reasoning capabilities of the o1 models to make them adhere more closely to safety and alignment guidelines. On one of OpenAI's hardest jailbreak tests, scored on a scale of 0 to 100, o1-preview scored 84 compared with GPT-4o's 22, a significant improvement in resisting attempts to bypass safety rules.
Currently, o1-preview and o1-mini are available in ChatGPT (Plus and Team) and through the API, and OpenAI plans to make o1-mini accessible to all free ChatGPT users in the future.