Anthropic has announced the release of its latest models, Claude 3.5 Sonnet and Claude 3.5 Haiku. The upgraded Claude 3.5 Sonnet features significant improvements across various functionalities, especially in programming capabilities. Meanwhile, the Claude 3.5 Haiku model is designed as Anthropic's response to OpenAI's GPT-4o Mini and Google's Gemini 1.5 Flash. It maintains the same pricing as its predecessor while offering enhanced performance.
Improvements in Claude 3.5 Sonnet:
- SWE-bench Verification Score: Increased from 33.4% to 49.0%, setting a new industry benchmark.
- Retail Sector Performance: TAU-bench score improved from 62.6% to 69.2%.
- Aerospace Sector Performance: Increased from 36.0% to 46.0%.
- General Knowledge and Mastery: GPQA and MMLU Pro scores rose to 65% and 78%, surpassing those of Gemini 1.5 Pro.
The new Claude 3.5 Haiku model excels in various AI benchmark tests, outperforming its predecessor, Claude 3 Opus. On the SWE-bench Verified test, it achieved a score of 40.6%, exceeding both the original Claude 3.5 Sonnet and OpenAI’s GPT-4 Turbo. Initially, Claude 3.5 Haiku will be available in pure text format, with plans to support image formats in the future.
Anthropic highlights that the US Artificial Intelligence Safety Institute (US AISI) and the UK Artificial Intelligence Safety Institute (UK AISI) have conducted pre-deployment testing of the new Claude 3.5 Sonnet model, a collaboration stemming from an agreement signed earlier this year. In line with its responsible scaling policy, the updated Claude 3.5 Sonnet model adheres to ASL-2 standards.
The enhanced Claude 3.5 Sonnet is now available through the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI at the same pricing as before. The new Claude 3.5 Haiku model is expected to launch later this month.
These latest Claude 3.5 models deliver superior performance at competitive prices, making them an appealing choice for developers and businesses seeking advanced language models for their AI applications.