Anthropic Claims Its New AI Chatbot Models Outperform OpenAI's GPT-4

AI startup Anthropic, backed by Google and significant venture capital funding, today unveiled its latest iteration of GenAI technology, Claude 3. The company asserts that this advanced AI chatbot outperforms OpenAI's GPT-4 in various performance metrics.

Claude 3 encompasses a suite of models—Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus, with Opus being the most powerful among them. According to Anthropic, these models display “enhanced capabilities” in both analysis and forecasting, achieving superior results on specific benchmarks compared to other models, including ChatGPT, GPT-4, and Google’s Gemini 1.0 Ultra (though not Gemini 1.5 Pro).

A standout feature of Claude 3 is that it is Anthropic’s first multimodal GenAI, capable of analyzing both text and images—similar to certain variants of GPT-4 and Gemini. It can interpret photos, charts, graphs, and technical diagrams, utilizing information from PDFs, slideshows, and other document formats.

According to Anthropic, Claude 3 goes a step beyond some competitors: it can analyze up to 20 images in a single request, allowing users to compare and contrast them. However, there are restrictions on Claude 3’s image processing capabilities. The models are designed not to identify individuals, reflecting the company’s caution regarding ethical and legal issues. Additionally, Claude 3 struggles with “low-quality” images (below 200 pixels) and has difficulty with spatial reasoning tasks, such as reading analog clocks or accurately counting objects in images.

While Claude 3 excels at image analysis, it does not generate artwork, focusing solely on processing existing images—at least for now.

Whether handling text or images, Anthropic emphasizes that users can generally expect Claude 3 to better follow multi-step instructions, produce structured outputs in formats like JSON, and converse in various languages more effectively than its predecessors. The model is also designed to refuse to answer questions less frequently, thanks to its “more nuanced understanding of requests,” according to Anthropic. Future updates will allow the models to cite sources for their answers, so that users can verify them.

In a support article, Anthropic states that “Claude 3 generates more expressive and engaging responses,” and notes that it is easier to steer than previous models. Users can expect to achieve their desired outcomes with shorter, more concise prompts.

One of the key improvements lies in Claude 3’s expanded context capabilities.

Context refers to the input data the model processes before generating its output. Models with limited context can forget recent conversation details, often leading to irrelevant responses. In contrast, larger context windows enhance a model’s understanding of narrative flow, allowing for more contextually rich replies.

Anthropic reports that Claude 3 initially supports a 200,000-token context window—equivalent to about 150,000 words—with select clients able to use a staggering 1 million-token context window (approximately 700,000 words). This is on par with Google’s latest GenAI model, Gemini 1.5 Pro, which also offers up to a million-token context window.
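To make those figures concrete, here is a minimal sketch of how one might estimate whether a document fits in the standard window. It assumes roughly 0.75 words per token, the ratio implied by the 200,000-token figure above (the 1 million-token figure implies closer to 0.7, so treat this as a rough guide). The helper names are hypothetical, and a whitespace word count is not a real tokenizer.

```python
import math

# Assumption: about 0.75 words per token, derived from the article's
# "200,000 tokens is about 150,000 words" figure. Real tokenizers vary.
WORDS_PER_TOKEN = 0.75
STANDARD_WINDOW_TOKENS = 200_000


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count (hypothetical helper)."""
    word_count = len(text.split())
    return math.ceil(word_count / WORDS_PER_TOKEN)


def fits_in_window(text: str, window_tokens: int = STANDARD_WINDOW_TOKENS) -> bool:
    """Return True if the estimated token count is within the window."""
    return estimate_tokens(text) <= window_tokens


if __name__ == "__main__":
    sample = "word " * 150_000  # about 150,000 words
    print(estimate_tokens(sample), fits_in_window(sample))
```

An estimate like this is only useful for a quick sanity check; an exact count would require the model's own tokenizer.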

However, Claude 3’s enhancements do not eliminate its limitations. The company acknowledges in a technical whitepaper that Claude 3 is not free from common GenAI challenges, such as bias and hallucinations (i.e., fabricating information). Notably, unlike some GenAI models, Claude 3 cannot perform web searches and can only respond based on data available prior to August 2023. While it supports multiple languages, its fluency in certain “low-resource” languages is not as strong as in English.

Nonetheless, Anthropic promises regular updates for Claude 3 in the coming months. “We believe that model intelligence is far from reaching its limits, and we are committed to releasing enhancements for the Claude 3 model family soon,” the company states in a blog post.

Opus and Sonnet are available now through the web, Anthropic’s dev console and API, Amazon’s Bedrock platform, and Google’s Vertex AI. Haiku is scheduled for release later this year.

Here’s the pricing breakdown:

- Opus: $15 per million input tokens, $75 per million output tokens

- Sonnet: $3 per million input tokens, $15 per million output tokens

- Haiku: $0.25 per million input tokens, $1.25 per million output tokens
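As a rough illustration of what these rates mean per request, the sketch below computes an estimated cost from the prices listed above. The function name and the example token counts are hypothetical, and the calculation assumes simple per-token billing with no other fees or discounts.

```python
# Prices in USD per million tokens (input, output), as listed in the article.
PRICES_PER_MILLION = {
    "opus": (15.00, 75.00),
    "sonnet": (3.00, 15.00),
    "haiku": (0.25, 1.25),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request (hypothetical helper)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a full 200,000-token prompt with a 1,000-token reply.
    for name in PRICES_PER_MILLION:
        print(f"{name}: ${estimate_cost(name, 200_000, 1_000):.4f}")
```

Under these assumptions, filling the entire 200,000-token window costs about $3 in input tokens on Opus, compared with about $0.05 on Haiku.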

So, what’s the bigger picture with Claude 3? As previously reported, Anthropic aims to develop a next-generation algorithm for “AI self-teaching.” This could facilitate the creation of virtual assistants capable of managing emails, conducting research, generating art, and producing books—some functionalities already hinted at by models like GPT-4 and others.

Furthermore, Anthropic plans to integrate features into Claude 3 that will enhance its core capabilities, including interactions with other systems, interactive coding, and advanced agent-centric functionalities.

This ambition parallels OpenAI’s efforts to develop software that automates complex tasks—like transferring data from documents to spreadsheets or processing expense reports. OpenAI has already rolled out an API that allows developers to build similar “agent-like experiences” into their applications, and it appears Anthropic is set on delivering comparable features.

Could an image generator from Anthropic be on the horizon? While it’s uncertain, such a move would raise questions, given the ongoing controversies surrounding image generation, particularly concerning copyright and bias. Google recently halted its image generator due to its insensitivity to historical context, and many image generator companies face legal challenges from artists alleging unauthorized use of their work for training purposes.

I am especially interested in the evolution of Anthropic's “constitutional AI” approach to training GenAI, which the company suggests simplifies understanding and managing its AI's behavior. This method aligns AI with human values, guiding models to respond to queries and complete tasks based on a straightforward set of principles. For instance, Claude 3 has incorporated a principle that encourages inclusivity and understanding for users with disabilities.

Regardless of its ultimate goals, Anthropic is clearly committed to long-term development. A pitch deck leaked last May indicated that the company is looking to raise up to $5 billion over the next year, which may well position it competitively against OpenAI. Having already secured $2 billion from Google and $4 billion from Amazon, along with significant backing from various investors, Anthropic is making strides towards achieving its ambitious fundraising goals.
